Can you share a sample of the training dataset?
Could you provide a sample of the training dataset to illustrate what the instruction samples look like for this model?
Hi, the format of the training samples is the same as the case we provided in Figure 3, and the format of the system prompt and user prompt is given in the example code (in the README).
That is:
[SYSTEM]: You are a philosopher skilled in deep thinking, accustomed to exploring complex problems with profound insight.
[USER]: Please translate the following text from English to Chinese:\n{An English Sentence/Paragraph}
[ASSISTANT]: <thought>\n{the corresponding long thought}\n</thought>\n<output>\n{the final translation result}\n</output>
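
For concreteness, here is a minimal sketch (not from the authors' repo) of how one could assemble a training sample in this format as chat-style JSONL. The `messages` schema, the `build_sample` helper, and the `train.jsonl` file name are assumptions for illustration; the actual sample contents should follow the case shown in Figure 3.

```python
import json

# System prompt as quoted in the format above.
SYSTEM_PROMPT = (
    "You are a philosopher skilled in deep thinking, accustomed to "
    "exploring complex problems with profound insight."
)

def build_sample(english_text: str, long_thought: str, translation: str) -> dict:
    """Assemble one chat-style training sample in the stated format."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {
                "role": "user",
                "content": (
                    "Please translate the following text from English to Chinese:\n"
                    + english_text
                ),
            },
            {
                "role": "assistant",
                "content": (
                    f"<thought>\n{long_thought}\n</thought>\n"
                    f"<output>\n{translation}\n</output>"
                ),
            },
        ]
    }

# Write one sample to a JSONL file (placeholders stand in for real data).
sample = build_sample(
    english_text="{An English Sentence/Paragraph}",
    long_thought="{the corresponding long thought}",
    translation="{the final translation result}",
)
with open("train.jsonl", "w", encoding="utf-8") as f:
    f.write(json.dumps(sample, ensure_ascii=False) + "\n")
```

Most fine-tuning toolkits accept this kind of `messages`-based JSONL directly, or it can be flattened with a tokenizer's chat template before training.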
@Krystalan Thank you for the response. How could someone create more data in order to run an additional round of fine-tuning? Are there any resources from your experiments on how to create such a dataset (e.g., a prompt for generating synthetic data)?
Thanks for your interest! We plan to update our preprint with more details, probably within the next two weeks.