IQ4_NL

by Mushoz - opened Oct 25, 2024

Oct 25, 2024

Any chance of getting an IQ4_NL quant as well?

I am benchmarking different quants of this particular model, and would love to test that one as well. My findings so far can be found here: https://www.reddit.com/r/LocalLLaMA/comments/1gajy1j/aider_optimizing_performance_at_24gb_vram_with/

Thanks for your great work!

bartowski

Owner Oct 25, 2024

I haven't tended to make them because IQ4_NL performs basically identically to IQ4_XS at a larger file size, but maybe I can start releasing some.

Mushoz

Oct 25, 2024

Hmmm, I wasn't aware they perform similarly to the IQ4_XS quant. I thought they were a bit better. If you do decide to release some and could include this specific model, I would be more than happy to run Aider's benchmark to see how it stacks up against the other quants. Thanks for all your work on these quants!

bartowski

Owner Oct 25, 2024

yeah it's based on this chart: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

but i'll throw one up here so you can test it and provide more data :)

Mushoz

Oct 25, 2024

Awesome, much appreciated!

bartowski

Owner Oct 25, 2024

Added it @Mushoz

Mushoz

Oct 25, 2024

Awesome! I will benchmark it as soon as I am able to run it. I'm currently running into an issue with Ollama not recognizing the quant, so I'm unable to pull it, unfortunately :/
I have left a comment on Ollama's issue tracker here: https://github.com/ollama/ollama/issues/7268#issuecomment-2438490194
