Q3_K_M quantization

#1
by Nexesenex

Hello, Rachid.
Thanks for this; it looks promising. Could you please, if you have the time, quantize and publish the Q3_K_M? It seems like the best compromise between perplexity close to the 30B models we are used to now and the size/speed benefits granted by q3_0.
Cheers!

You're welcome. I recently completed the quantization, and the Q3_K_M is now available at the following link: https://huggingface.co/RachidAR/WizardLM-Uncensored-SCOT-ST-30B-Q3_K_M-GGML
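
If you prefer to script the download instead of using the web UI, here is a minimal sketch using huggingface_hub. The GGML filename is an assumption on my part, so check the repo's "Files and versions" tab for the actual name.

```python
# Minimal sketch: download the quantized GGML file from the Hub.
# The filename below is hypothetical -- replace it with the real one
# listed in the repository.
from huggingface_hub import hf_hub_download

local_path = hf_hub_download(
    repo_id="RachidAR/WizardLM-Uncensored-SCOT-ST-30B-Q3_K_M-GGML",
    filename="WizardLM-Uncensored-SCOT-ST-30B.q3_K_M.bin",  # hypothetical
)
print(f"Downloaded to {local_path}")
```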

Awesome job.
I'm grabbing it right now. Thank you very much, Rachid!
