Does two-stage training use the same hyperparameters?
In the model card there is this description:
First, we apply the foundational dataset Infinity-Instruct-3M to improve the foundational abilities (math & code) of Qwen2-7B, and get the foundational instruct model Infinity-Instruct-3M-Qwen2-7B. Then we finetune Infinity-Instruct-3M-Qwen2-7B to get the stronger chat model Infinity-Instruct-3M-0625-Qwen2-7B. Here are the training hyperparameters.
Question: there are two stages but only one set of training hyperparameters is listed. Do both SFT stages use the same hyperparameters?
Yes, you can use the same set of hyperparameters for both training stages.
Hello, which template do you use to fine-tune from the pretrained model to the foundational instruct model? I assume you used the chat template to finetune the final chat model, but what about the intermediate stage? Do you also use the chat template with a system prompt there?
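For context on what "chat template with system prompt" means here, below is a minimal sketch of how a single training sample is rendered under a ChatML-style template (the template family Qwen2 uses). The system prompt text and the example messages are assumptions for illustration, not confirmed details of the Infinity-Instruct training runs.

```python
def format_chatml(messages):
    """Render a list of {role, content} dicts as a ChatML-style string."""
    parts = []
    for m in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    return "".join(parts)

# Hypothetical sample; the system prompt is assumed, not taken from the repo.
sample = format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "2 + 2 = 4."},
])
print(sample)
```

If the intermediate foundational stage uses the same template, the only open question is whether a system turn is prepended there as well, which is what the question above asks.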