Different max_position_embeddings and rope_theta in OpenR1-Qwen-7B-SFT and its base Qwen2.5-Math-7B-Instruct?
#3 opened by zhuzhuyue
How do I use the open-r1 project to reproduce this model?
I see that `max_position_embeddings` of Qwen2.5-Math-7B-Instruct is 4096 and its `rope_theta` is 10000.0, but in OpenR1-Qwen-7B `max_position_embeddings` is 32768 and `rope_theta` is 300000.0. Why are these values different, and how do I reproduce this result?
If I use the demo config at recipes/openr1-qwen-7b/sft/config.yaml, the values stay the same as the base model's.
Hello, you need to:
- download the model locally, for instance with `huggingface-cli download Qwen/Qwen2.5-Math-7B-Instruct --local-dir <your-local-directory>`
- change `rope_theta` and `max_position_embeddings` in your local folder (in the `config.json` file)
- replace `model_name_or_path` in the `config.yaml` with the path to the local model
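As a minimal sketch of the second step (the function name and default values are illustrative; the defaults match the OpenR1-Qwen-7B values quoted above), the `config.json` edit can be scripted rather than done by hand:

```python
import json


def patch_rope_config(config_path: str,
                      max_position_embeddings: int = 32768,
                      rope_theta: float = 300000.0) -> dict:
    """Overwrite the RoPE-related fields in a local config.json.

    config_path should point at the config.json inside the directory
    created by `huggingface-cli download ... --local-dir ...`.
    """
    with open(config_path) as f:
        cfg = json.load(f)
    # Extend the context window and raise the RoPE base frequency.
    cfg["max_position_embeddings"] = max_position_embeddings
    cfg["rope_theta"] = rope_theta
    with open(config_path, "w") as f:
        json.dump(cfg, f, indent=2)
    return cfg
```

After running this on the downloaded folder, point `model_name_or_path` in `config.yaml` at that folder and launch the SFT recipe as usual.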