Request for Quants

#1
by ShatteredManoj - opened

Could you please quantize the "noctrex/Qwopus3.5-9B-Coder-MTP" model even further, to something like IQ4_NL or even 3-bit quants? It would help me a lot to run it on a laptop with 6 GB of VRAM.

Of course. Give me a minute to cook them up.

Produced some smaller quants, but keep in mind that for such a small 9B model, the quality will suffer.
Maybe you could try out the larger one, 35B-A3B, with offloading.

Thanks a lot

I think 35B-A3B with offloading will put a lot of stress on the system, both CPU and GPU, with a noticeable loss of speed.
Slow prompt processing and low tokens/sec are not good, right? And even if I wanted to, I would have to use 2-bit quants, which means lower quality.
I would love to hear your thoughts, and if I am wrong, please suggest a model and quant I could use.

Well, it depends on how much system RAM you have. If you can spare 32 GB, then you can load a very good quant of it.
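
For a rough sense of what fits, here is a back-of-the-envelope sketch in Python. It only estimates the size of the weights as parameters × bits per weight / 8, and it ignores the KV cache, context buffers, and runtime overhead, so real usage will be higher. The bits-per-weight figures are approximate assumptions for illustration (about 4.5 for a 4-bit quant such as IQ4_NL, about 3.5 for a 3-bit quant, about 2.7 for a 2-bit quant), and `quant_size_gib` is a hypothetical helper, not part of any library.

```python
def quant_size_gib(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB.

    Ignores KV cache, context buffers, and runtime overhead.
    """
    total_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Assumed, approximate bits-per-weight values (illustrative only).
candidates = [
    ("9B at ~4.5 bpw", 9, 4.5),
    ("9B at ~3.5 bpw", 9, 3.5),
    ("35B at ~4.5 bpw", 35, 4.5),
    ("35B at ~2.7 bpw", 35, 2.7),
]

for label, params_b, bpw in candidates:
    print(f"{label}: about {quant_size_gib(params_b, bpw):.1f} GiB of weights")
```

Compare the 9B numbers against 6 GB of VRAM, and the 35B numbers against VRAM plus whatever system RAM you can spare for offloading.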
