Different max_position_embeddings and rope_theta in OpenR1-Qwen-7B-SFT and its base Qwen2.5-Math-7B-Instruct?

#3
by zhuzhuyue - opened

How can I use the open-r1 project to reproduce this model?
I found that "max_position_embeddings" in Qwen2.5-Math-7B-Instruct is 4096 and "rope_theta" is 10000.0, but in OpenR1-Qwen-7B, "max_position_embeddings" is 32768 and "rope_theta" is 300000.0. Why are these values different, and how were they obtained?
If I use the demo config recipes/openr1-qwen-7b/sft/config.yaml, the values stay the same as the base model's.

Hello, you need to:

  1. download the model locally (for instance with huggingface-cli download Qwen/Qwen2.5-Math-7B-Instruct --local-dir <your-local-directory>)
  2. change rope_theta and max_position_embeddings in the config.json file inside your local folder
  3. replace model_name_or_path in the config.yaml with the path to your local model (see the sketch after this list)
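
A minimal sketch of those three steps using the huggingface_hub Python client (the local directory name is just an example; adapt it to your setup):

```python
import json
from pathlib import Path

from huggingface_hub import snapshot_download

# Example local directory; use any path you like.
local_dir = Path("qwen2.5-math-7b-instruct-32k")

# 1. Download the base model locally.
snapshot_download(repo_id="Qwen/Qwen2.5-Math-7B-Instruct", local_dir=local_dir)

# 2. Patch max_position_embeddings and rope_theta in config.json
#    to the values used by OpenR1-Qwen-7B.
config_path = local_dir / "config.json"
config = json.loads(config_path.read_text())
config["max_position_embeddings"] = 32768
config["rope_theta"] = 300000.0
config_path.write_text(json.dumps(config, indent=2))

# 3. Point model_name_or_path in recipes/openr1-qwen-7b/sft/config.yaml
#    at the patched folder, e.g.:
#    model_name_or_path: ./qwen2.5-math-7b-instruct-32k
```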
