Benchmark of different GGML versions

#2
by aiapprentice101

Hi,

Awesome work! Do you happen to have any benchmarks for the different GGML versions of this model? It would be great to see how much performance deterioration we get from these quantization techniques.

@aiapprentice101 Did you see https://oobabooga.github.io/blog/posts/perplexities/ ? It's not about this model specifically, but it covers the relative performance of different GGML and GPTQ quants.
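
If you'd rather put a number on a specific checkpoint yourself instead of relying on the blog's figures, a rough perplexity measurement for any transformers-loadable version (the fp16 or GPTQ repos; GGML files would need llama.cpp's own perplexity tool instead) could look like the sketch below. The model id, test file, and window size are placeholders, not something from this thread:

```python
# Minimal perplexity sketch for a transformers-loadable checkpoint.
# Assumptions: model_id and the wikitext test file path are placeholders;
# a GPTQ repo additionally needs auto-gptq/optimum installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/or/repo-of-the-checkpoint"   # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
model.eval()

# Any held-out text works; wikitext-2's test split is the usual choice.
text = open("wiki.test.raw").read()
input_ids = tokenizer(text, return_tensors="pt").input_ids

stride = 2048            # evaluate in non-overlapping 2048-token windows
nll_sum, n_tokens = 0.0, 0

for start in range(0, input_ids.size(1) - 1, stride):
    ids = input_ids[:, start:start + stride].to(model.device)
    with torch.no_grad():
        # Passing labels=ids returns the mean next-token cross-entropy
        # over this window.
        loss = model(ids, labels=ids).loss
    window_targets = ids.size(1) - 1   # first token has no prediction target
    nll_sum += loss.item() * window_targets
    n_tokens += window_targets

print("perplexity:", torch.exp(torch.tensor(nll_sum / n_tokens)).item())
```

Running the same script against two quantizations of the same model gives a quick, if coarse, sense of how much quality each one gives up.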

Thank you @mike-ravkine. The post seems to claim GPTQ is the best in terms of quality. However, when I test the GPTQ version of Llama-2 (also from TheBloke), I get very poor performance: the model doesn't follow the one-shot instructions and generates random output.
