Error: size mismatch for model.layers.0.self_attn.q_proj.weight:

#6
by tanliboy - opened

I am running into the following error while fine-tuning this model.

[rank0]: RuntimeError: Error(s) in loading state_dict for MistralForCausalLM:
[rank0]:        size mismatch for model.layers.0.self_attn.q_proj.weight: copying a param with shape torch.Size([4096, 5120]) from checkpoint, the shape in current model is torch.Size([5120, 5120]).

Did I miss something?

The cause is an outdated transformers version. Installing from source fixes it:
pip install git+https://github.com/huggingface/transformers.git

Thank you, @popo20231015 !
It seems the fix is on the main branch but not yet in the latest release.
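Since the fix above depends on running a newer transformers build than the latest release, it can help to compare the installed version against the one that contains the fix before retrying the fine-tune. The sketch below is a minimal, stdlib-only version comparison; the version numbers used in the example are illustrative assumptions, not the actual release that ships the fix.

```python
def version_tuple(v: str) -> tuple:
    # Keep only the leading numeric components, so a source install
    # like "4.35.0.dev0" compares as (4, 35, 0).
    parts = []
    for p in v.split("."):
        if p.isdigit():
            parts.append(int(p))
        else:
            break
    return tuple(parts)

def needs_upgrade(installed: str, required: str) -> bool:
    # True when the installed transformers build predates the required one.
    return version_tuple(installed) < version_tuple(required)

# Illustrative version numbers (assumptions, not the real fix version):
print(needs_upgrade("4.33.0", "4.34.0"))      # older release -> upgrade
print(needs_upgrade("4.35.0.dev0", "4.34.0")) # dev build from source -> ok
```

In practice, `python -c "import transformers; print(transformers.__version__)"` shows the installed version; a `.dev0` suffix indicates a source install from the git command above.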
