GPTQ plz

#3
by Parkerlambert123 - opened

It is a big model; I can see why that'd be a good idea.

Hi all,
If @puffy310 hasn't started, I can give it a shot. (assuming DeepseekV2ForCausalLM is supported by now in AutoGPTQ)

Try the vLLM version first, as the model devs have said the Hugging Face implementation isn't up to their standards anyway. "Everyone wants a quantized model but nobody wants to quantize a model". - Julian Herrera
I'll see if I can give it a try, but I doubt I have the know-how. DeepseekV2 was just released, and I don't know whether AutoGPTQ works well with MoE architectures. If I have some time today I might as well try, but your implementation will most likely be better. I always love to learn, though. I'll post progress in this discussion.

I tried building the model with AWQ. It takes a long time to rebuild the model.

Just for a reference: https://github.com/AutoGPTQ/AutoGPTQ/issues/664

Seems it's not feasible in AutoAWQ either: https://github.com/casper-hansen/AutoAWQ/issues/473

@MaziyarPanahi
AutoAWQ and GPTQModel support this model.

Parkerlambert123 changed discussion status to closed
Parkerlambert123 changed discussion status to open
