Issue with RuntimeError when Loading Fine-Tuned Qwen2-1.5B-Instruct Model

#4
by Ganz00 - opened

Hi everyone,

I recently fine-tuned the Qwen2-1.5B-Instruct model with AutoTrain for my project. However, I'm encountering an issue when trying to load the fine-tuned model. The error message is as follows:

RuntimeError: Error(s) in loading state_dict for Qwen2ForCausalLM:
size mismatch for model.embed_tokens.weight: copying a param with shape torch.Size([151646, 1536]) from checkpoint, the shape in current model is torch.Size([151936, 1536]).
size mismatch for lm_head.weight: copying a param with shape torch.Size([151646, 1536]) from checkpoint, the shape in current model is torch.Size([151936, 1536]).

Here's a simplified version of my code:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "Ganz00/autotrain-Qwen2-1-5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype="auto",
    ignore_mismatched_sizes=True,
).eval()

Any help or insights would be greatly appreciated!

Thank you!

Qwen org

Try changing the value of `vocab_size` in `config.json` so it matches the checkpoint. The error shows the saved weights have shape `[151646, 1536]` while the model built from the config expects `[151936, 1536]`, so set `vocab_size` to 151646.
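A minimal sketch of that edit, assuming you have a local copy of the checkpoint directory (the temp file below is a stand-in for the real `config.json` in the downloaded `Ganz00/autotrain-Qwen2-1-5B-Instruct` folder):

```python
import json
import os
import tempfile

# Stand-in for the checkpoint's config.json; in practice, open the
# real file inside your local checkpoint directory instead.
config_dir = tempfile.mkdtemp()
config_path = os.path.join(config_dir, "config.json")
with open(config_path, "w") as f:
    json.dump({"model_type": "qwen2", "vocab_size": 151936}, f)

# Set vocab_size to the checkpoint's embedding row count (151646,
# taken from the size-mismatch error), so the model skeleton built
# from the config matches the saved embed_tokens / lm_head weights.
with open(config_path) as f:
    cfg = json.load(f)
cfg["vocab_size"] = 151646
with open(config_path, "w") as f:
    json.dump(cfg, f, indent=2)

with open(config_path) as f:
    print(json.load(f)["vocab_size"])  # 151646
```

After the fix, `ignore_mismatched_sizes=True` should no longer be needed; with that flag set, mismatched weights are silently re-initialized rather than loaded, which is usually not what you want for a fine-tuned model.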
