Different max_position_embeddings and rope_theta in OpenR1-Qwen-7B-SFT and its base Qwen2.5-Math-7B-Instruct?

#3
by zhuzhuyue - opened

How can I use the open-r1 project to reproduce this model?
I found that "max_position_embeddings" in Qwen2.5-Math-7B-Instruct is 4096 and "rope_theta" is 10000.0, but in OpenR1-Qwen-7B, "max_position_embeddings" is 32768 and "rope_theta" is 300000.0. Why are these values different, and how were they obtained?
If I use the demo config recipes/openr1-qwen-7b/sft/config.yaml, the values stay the same as the base model's.

Hello, you need to:

  1. download the model locally (for instance with huggingface-cli download Qwen/Qwen2.5-Math-7B-Instruct --local-dir <your-local-directory>)
  2. change rope_theta and max_position_embeddings in the config.json file inside your local folder
  3. replace model_name_or_path in the config.yaml with the path to your local model (see the sketch after this list)
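
A minimal sketch of those three steps using the huggingface_hub Python client (the local directory name is just an example; adapt it to your setup):

```python
import json
from pathlib import Path

from huggingface_hub import snapshot_download

# Example local directory; use any path you like.
local_dir = Path("qwen2.5-math-7b-instruct-32k")

# 1. Download the base model locally.
snapshot_download(repo_id="Qwen/Qwen2.5-Math-7B-Instruct", local_dir=local_dir)

# 2. Patch max_position_embeddings and rope_theta in config.json
#    to the values used by OpenR1-Qwen-7B.
config_path = local_dir / "config.json"
config = json.loads(config_path.read_text())
config["max_position_embeddings"] = 32768
config["rope_theta"] = 300000.0
config_path.write_text(json.dumps(config, indent=2))

# 3. Point model_name_or_path in recipes/openr1-qwen-7b/sft/config.yaml
#    at the patched folder, e.g.:
#    model_name_or_path: ./qwen2.5-math-7b-instruct-32k
```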
