Size mismatch
The same error as in https://github.com/huggingface/autotrain-advanced/issues/487 occurs when I try to merge my adapter (a fine-tuned LoRA model, https://huggingface.co/neighborwang/ModeliCo-7B) into the base model (Qwen2.5-Coder-7B-Instruct):
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.model.embed_tokens.weight:
copying a param with shape torch.Size([151665, 3584]) from checkpoint,
the shape in current model is torch.Size([152064, 3584]).
I faced the same issue with Llama 3.1 and solved it by using a specific transformers version, so for my adapter and Qwen2.5-Coder-7B-Instruct I tried the following transformers versions:
v4.45.1
v4.45.0
v4.44.0
v4.43.0
v4.37.0
But none of them worked... I need some help. Also, other people in the GitHub issue mentioned above are facing this same problem.
Thanks a lot in advance!
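For reference, a minimal sketch of the kind of merge attempt that produces this error (an assumption, since the thread does not show the exact script; the repo IDs are the ones mentioned above):

```python
# Sketch of the failing merge: the adapter was trained with a smaller
# vocabulary (151665) than the stock Qwen2.5-Coder-7B-Instruct checkpoint
# (152064), so loading it onto the unmodified base model fails.
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct", torch_dtype=torch.bfloat16
)

# Raises: RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
#   size mismatch for base_model.model.model.embed_tokens.weight ...
model = PeftModel.from_pretrained(base, "neighborwang/ModeliCo-7B")
```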
Hi, it appears that the embed_tokens and the lm_head have different shapes from the base model. Please try manually padding the tensors from the adapter model or truncating the tensors from the base model.
FYI: the vocabulary size (151665) is different from the size of the embed_tokens and the lm_head (which depends on the model size; for 7B it is 152064, the vocab_size in config.json). Normally, the size from config.json is used.
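A minimal sketch of the "truncate the base model" option: resize the base model's embed_tokens / lm_head to the adapter's vocabulary size before loading the LoRA weights, then merge. This assumes the tokenizer saved with the adapter has 151665 tokens (the size from the error above); otherwise pass 151665 explicitly. The output path is a placeholder.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen2.5-Coder-7B-Instruct"
adapter_id = "neighborwang/ModeliCo-7B"

tokenizer = AutoTokenizer.from_pretrained(adapter_id)  # tokenizer saved with the adapter
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# Match the adapter's vocab size; this truncates (or pads) embed_tokens and lm_head.
base.resize_token_embeddings(len(tokenizer))

model = PeftModel.from_pretrained(base, adapter_id)
merged = model.merge_and_unload()

merged.save_pretrained("ModeliCo-7B-merged")
tokenizer.save_pretrained("ModeliCo-7B-merged")
```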
Hi Xuancheng, thanks a lot for your answer!
I used AutoTrain, and I thought the parameters were supposed to be matched automatically. Is this a problem with AutoTrain, or what was the reason for this mismatch?
Thx!
When setting the merge_adapter param to true in AutoTrain, the model seems to merge fine.
Ok, I will try, thank you very much Abhishek!