IQ4_NL

#3
by Mushoz - opened

Any chance of getting an IQ4_NL quant as well?

I am benchmarking different quants of this particular model, and would love to test that one as well. My findings so far can be found here: https://www.reddit.com/r/LocalLLaMA/comments/1gajy1j/aider_optimizing_performance_at_24gb_vram_with/

Thanks for your great work!

I haven't tended to make them because they're basically identical to IQ4_XS, just with a larger size, but maybe I can start releasing some.

Hmmm, I wasn't aware they perform similarly to the IQ4_XS quant. I thought they were a bit better. If you do decide to release some and could include this specific model, I would be more than happy to run Aider's benchmark to see how it stacks up against the other quants. Thanks for all your work on these quants!

yeah it's based on this chart: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

but i'll throw one up here so you can test it and provide more data :)

Awesome, much appreciated!

Added it @Mushoz

Awesome! I will benchmark it as soon as I am able to run it. I'm currently running into an issue with Ollama not recognizing the quant, so unfortunately I'm unable to pull it :/
I have left a comment on Ollama's issue tracker here: https://github.com/ollama/ollama/issues/7268#issuecomment-2438490194
