GPTQ plz

by xuchen123 - opened

GPTQ plz.

It is a big model, I can see why that'd be a good idea.

Hi all,
If @puffy310 hasn't started, I can give it a shot (assuming DeepseekV2ForCausalLM is supported in AutoGPTQ by now).

Try the vLLM version first, as the model devs have said the Hugging Face implementation isn't up to their standards anyway. "Everyone wants a quantized model but nobody wants to quantize a model." - Julian Herrera
I'll see if I can give it a try, but I doubt I have the know-how. DeepseekV2 was just released and I don't know if AutoGPTQ works well with MoE architectures. If I have some time today I might as well try, but your implementation will most likely be better. I always love to learn, though. I'll post progress in this discussion.

I tried building the model with AWQ. It takes a long time to rebuild the model.

Just for reference:

It doesn't seem feasible in AutoAWQ either:

AutoAWQ and GPTQModel support this model.

xuchen123 changed discussion status to closed
xuchen123 changed discussion status to open
