Perplexity
#22
by
gsaivinay
- opened
Hello Mr. Bloke,
I assume you must be busy, but I'd like to ask whether you have made any perplexity comparisons for the Llama 2 70B chat models, fp16 vs. GPTQ quants, like you did here. I'm just curious how the bigger quantized models perform relative to fp16.
No worries if this is not on your radar.
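For context, by "perplexity" I mean the usual exp of the average per-token negative log-likelihood, computed from each model's token log-probabilities over the same evaluation text. A minimal sketch of just the metric (the log-probabilities themselves would come from whatever eval harness you use; the function name here is only illustrative):

```python
import math

def perplexity(token_logprobs):
    """Perplexity over a sequence, given natural-log probabilities
    the model assigned to each observed token.

    perplexity = exp( -(1/N) * sum(log p_i) )
    Lower is better; an fp16 vs. GPTQ comparison would run this
    over the same corpus with each model's logprobs.
    """
    nll = -sum(token_logprobs) / len(token_logprobs)  # mean negative log-likelihood
    return math.exp(nll)
```

For example, a model that assigns probability 0.5 to every token gets a perplexity of exactly 2, and one that is uniform over 100 choices gets 100, which is what makes the number easy to compare across the fp16 and quantized variants.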