GPTQ/AWQ quant that is runnable in vLLM?

#4 opened by Light4Bear

@LoneStriker can you please make a GPTQ or AWQ 4bit 128g quant of this?
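In case it's useful, this is roughly what that job would look like with AutoGPTQ. The source repo id, calibration text, and output directory below are just placeholders I'm assuming, and a real run needs a proper calibration set plus a lot of VRAM/RAM for a 72B model:

```python
# Rough AutoGPTQ sketch for a 4-bit, 128-group quant.
# Paths and calibration text are placeholders, not a tested recipe for this model.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_id = "abacusai/Liberated-Qwen1.5-72B"      # assumed source repo
out_dir = "Liberated-Qwen1.5-72B-GPTQ-4bit-128g"  # placeholder output dir

quantize_config = BaseQuantizeConfig(
    bits=4,          # 4-bit weights
    group_size=128,  # the requested 128g grouping
    desc_act=False,  # act-order off, generally friendlier for vLLM
)

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
# A real run would use a few hundred calibration samples; one is shown here.
examples = [
    tokenizer("The quick brown fox jumps over the lazy dog.")
]

model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)
model.quantize(examples)

model.save_quantized(out_dir, use_safetensors=True)
tokenizer.save_pretrained(out_dir)
```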

I do not believe my machines have enough resources to generate GPTQ or AWQ versions of this model unfortunately. If I get access to a bigger box, I'll add these to the list.

@LoneStriker I tried to do a GPTQ quant myself, but my 22 GB VRAM card came a few GB short and OOMed at layer 52 of 80, so I thought a 24 GB card might be just enough. Anyway, @titan087 has made one: https://huggingface.co/titan087/Liberated-Qwen1.5-72B-4bit. Thanks!
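For anyone landing here from the title: loading that 4-bit repo in vLLM should look something like the sketch below. I'm assuming it's a GPTQ quant, and the `tensor_parallel_size` and `max_model_len` values are just guesses; the 4-bit weights alone are around 40 GB, so you need enough GPUs to cover that plus KV cache:

```python
# Minimal vLLM sketch for serving the linked 4-bit quant (untested).
from vllm import LLM, SamplingParams

llm = LLM(
    model="titan087/Liberated-Qwen1.5-72B-4bit",
    quantization="gptq",     # switch to "awq" if the repo is actually an AWQ quant
    dtype="float16",
    tensor_parallel_size=2,  # adjust so total VRAM covers ~40 GB of weights + KV cache
    max_model_len=4096,      # shorter context keeps the KV cache manageable
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Write a haiku about quantization."], params)
print(outputs[0].outputs[0].text)
```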
