Perplexity
#22
by
gsaivinay
- opened
Hello Mr. Bloke,
I assume you must be busy, but I'd like to ask whether you have made any perplexity comparisons for the Llama 2 70B chat models, fp16 vs. GPTQ quants, like you did here. I'm just curious how the bigger quantized models perform relative to fp16.
No worries if this is not on your radar.
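For context, by "perplexity" I mean the usual exp of the average per-token negative log-likelihood, computed from each model's token log-probabilities over the same evaluation text. A minimal sketch of just the metric (the log-probabilities themselves would come from whatever eval harness you use; the function name here is only illustrative):

```python
import math

def perplexity(token_logprobs):
    """Perplexity over a sequence, given natural-log probabilities
    the model assigned to each observed token.

    perplexity = exp( -(1/N) * sum(log p_i) )
    Lower is better; an fp16 vs. GPTQ comparison would run this
    over the same corpus with each model's logprobs.
    """
    nll = -sum(token_logprobs) / len(token_logprobs)  # mean negative log-likelihood
    return math.exp(nll)
```

For example, a model that assigns probability 0.5 to every token gets a perplexity of exactly 2, and one that is uniform over 100 choices gets 100, which is what makes the number easy to compare across the fp16 and quantized variants.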