Error(s) in loading state_dict for LlamaForCausalLM

#1 opened by noprompt

After running

 python -m fastchat.serve.cli --model-path LLaMa/airoboros-33B-gpt4-1.2-GPTQ --gptq-ckpt LLaMa/airoboros-33B-gpt4-1.2-GPTQ/airoboros-33b-gpt4-1.2-GPTQ-4bit--1g.act.order.safetensors --gptq-wbits 4 --gptq-groupsize 128

I got this nasty error

RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM:
        size mismatch for model.layers.0.self_attn.k_proj.qzeros: copying a param with shape torch.Size([1, 832]) from checkpoint, the shape in current model is torch.Size([52, 832]).
        size mismatch for model.layers.0.self_attn.k_proj.scales: copying a param with shape torch.Size([1, 6656]) from checkpoint, the shape in current model is torch.Size([52, 6656]).
        size mismatch for model.layers.0.self_attn.o_proj.qzeros: copying a param with shape torch.Size([1, 832]) from checkpoint, the shape in current model is torch.Size([52, 832]).
        size mismatch for model.layers.0.self_attn.o_proj.scales: copying a param with shape torch.Size([1, 6656]) from checkpoint, the shape in current model is 
...
        size mismatch for model.layers.59.mlp.gate_proj.scales: copying a param with shape torch.Size([1, 17920]) from checkpoint, the shape in current model is torch.Size([52, 17920]).
        size mismatch for model.layers.59.mlp.up_proj.qzeros: copying a param with shape torch.Size([1, 2240]) from checkpoint, the shape in current model is torch.Size([52, 2240]).
        size mismatch for model.layers.59.mlp.up_proj.scales: copying a param with shape torch.Size([1, 17920]) from checkpoint, the shape in current model is torch.Size([52, 17920]).

I believe this is because FastChat is assuming the wrong group_size for the model: with --gptq-groupsize 128 it expects 6656 / 128 = 52 quantization groups per column (hence the torch.Size([52, ...]) shapes it reports), while the checkpoint only contains one.

This model has group_size = -1, meaning no group_size.

I haven't tested with FastChat or its GPTQ-for-LLaMa implementation, so I can't provide support for it yet. But see if there's a way to specify the group_size, and if so, set it to -1.

Hey, thanks for your help! I really appreciate it. It looks like dropping the gptq-groupsize flag works too.
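
For reference, a sketch of the command as adjusted by that fix: it is the invocation from the top of the thread with --gptq-groupsize 128 simply removed (paths are the ones used above, so swap in your own local checkout as needed; this exact form is untested beyond what is reported here):

 python -m fastchat.serve.cli --model-path LLaMa/airoboros-33B-gpt4-1.2-GPTQ --gptq-ckpt LLaMa/airoboros-33B-gpt4-1.2-GPTQ/airoboros-33b-gpt4-1.2-GPTQ-4bit--1g.act.order.safetensors --gptq-wbits 4

Alternatively, if FastChat does accept an explicit group size, passing -1 (as suggested above) should match this checkpoint, since it was quantised without grouping.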
