Can you quantize the model?

#1 by xldistance - opened

Quants are uploading here: https://huggingface.co/models?search=LoneStriker%20Smaugv0.1

Thank you very much.

And what about AWQ or GPTQ?

I mostly do exl2 and sometimes GGUF quants (I started out doing quants I was using myself, and it's grown a bit since then). You'll have to wait for TheBloke for the GPTQ and AWQ quants, as I haven't set up those pipelines myself. exl2 quants tend to be the fastest for inference if you have a GPU the quantized model will fit on; GGUF quants are the most compatible across the widest range of devices.
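If you want to try the GGUF route, here's a minimal sketch using `huggingface_hub` and `llama-cpp-python`. The repo ID and filename below are placeholders, not the actual upload names; check the search link above for the real quant repos:

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download a single GGUF file from a quant repo (repo_id and
# filename are hypothetical -- substitute the real ones from the
# search link above).
model_path = hf_hub_download(
    repo_id="LoneStriker/Smaug-34B-v0.1-GGUF",
    filename="Smaug-34B-v0.1-Q4_K_M.gguf",
)

# n_gpu_layers=-1 offloads all layers to the GPU when llama.cpp is
# built with GPU support; otherwise it falls back to CPU inference,
# which is what makes GGUF so portable.
llm = Llama(model_path=model_path, n_gpu_layers=-1, n_ctx=4096)

out = llm("What is a quantized model?", max_tokens=128)
print(out["choices"][0]["text"])
```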
