Seems like the GPTQ versions are broken

#2 by NePe - opened

For the bigger models I get:
RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 1, 0] because the unspecified dimension size -1 can be any value and is ambiguous in self.gate...

For this test one I get:

...
 File "/home/nepe/.local/lib/python3.10/site-packages/transformers/models/mixtral/modeling_mixtral.py", line 708, in forward
    router_logits = self.gate(hidden_states)
  File "/home/nepe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/nepe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/nepe/.local/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/nepe/.local/lib/python3.10/site-packages/auto_gptq/nn_modules/qlinear/qlinear_cuda.py", line 227, in forward
    zeros = zeros.reshape(self.scales.shape)
RuntimeError: shape '[8, 8]' is invalid for input of size 0

The non-GPTQ version of the test model works perfectly.

TheBlokeAI org

Yeah, see the READMEs of the proper GPTQs for how to load them - you still need an AutoGPTQ PR at the moment

I tried both the old and the fix branches: same error. I even tried to quantize the model myself, with the same error.

As far as I understand, there are still a few more things to do.
Based on this:
https://github.com/PanQiWei/AutoGPTQ/pull/480
You have to apply this:
https://github.com/huggingface/transformers/pull/27956
And maybe this one too:
https://github.com/huggingface/optimum/pull/1585

My mistake: I tried it with AutoModelForCausalLM.from_pretrained instead of AutoGPTQForCausalLM.from_quantized.
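For anyone landing here with the same traceback: the fix is to load through AutoGPTQ's loader rather than plain transformers. A minimal sketch, assuming AutoGPTQ is installed with the Mixtral-capable PR branch; the repo ID and generation settings below are placeholders, not the actual model from this thread:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# Placeholder repo ID - substitute the GPTQ checkpoint you are actually loading.
model_name_or_path = "TheBloke/SOME-mixtral-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)

# Use AutoGPTQ's from_quantized, not AutoModelForCausalLM.from_pretrained:
# from_quantized rebuilds the model with AutoGPTQ's QuantLinear modules before
# loading the packed weights, which is what the quantized checkpoint expects.
model = AutoGPTQForCausalLM.from_quantized(
    model_name_or_path,
    device="cuda:0",
    use_safetensors=True,
)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to("cuda:0")
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

This cannot run without a GPU and the downloaded checkpoint, so treat it as a shape of the call rather than a verified script.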

NePe changed discussion status to closed
