Can you share a sample of the training dataset?
Could you provide a sample of the training dataset to illustrate what the instruction samples look like for this model?
Hi, the format of the training samples is the same as the case we provided in Figure 3, and the format of the system prompt and user prompt is given in the example code (in the README).
That is:
[SYSTEM]: You are a philosopher skilled in deep thinking, accustomed to exploring complex problems with profound insight.
[USER]: Please translate the following text from English to Chinese:\n{An English Sentence/Paragraph}
[ASSISTANT]: <thought>\n{the corresponding long thought}\n</thought>\n<output>\n{the final translation result}\n</output>
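
For concreteness, here is a minimal sketch (not from the authors' repo) of how one could assemble a training sample in this format as chat-style JSONL. The `messages` schema, the `build_sample` helper, and the `train.jsonl` file name are assumptions for illustration; the actual sample contents should follow the case shown in Figure 3.

```python
import json

# System prompt as quoted in the format above.
SYSTEM_PROMPT = (
    "You are a philosopher skilled in deep thinking, accustomed to "
    "exploring complex problems with profound insight."
)

def build_sample(english_text: str, long_thought: str, translation: str) -> dict:
    """Assemble one chat-style training sample in the stated format."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {
                "role": "user",
                "content": (
                    "Please translate the following text from English to Chinese:\n"
                    + english_text
                ),
            },
            {
                "role": "assistant",
                "content": (
                    f"<thought>\n{long_thought}\n</thought>\n"
                    f"<output>\n{translation}\n</output>"
                ),
            },
        ]
    }

# Write one sample to a JSONL file (placeholders stand in for real data).
sample = build_sample(
    english_text="{An English Sentence/Paragraph}",
    long_thought="{the corresponding long thought}",
    translation="{the final translation result}",
)
with open("train.jsonl", "w", encoding="utf-8") as f:
    f.write(json.dumps(sample, ensure_ascii=False) + "\n")
```

Most fine-tuning toolkits accept this kind of `messages`-based JSONL directly, or it can be flattened with a tokenizer's chat template before training.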
@Krystalan Thank you for the response. How could someone create more data in order to run an additional round of fine-tuning? Are there any resources from your experiments on how to create such a dataset (e.g., a prompt for generating synthetic data)?
Thanks for your interest! We plan to update our preprint with more details, probably within the next two weeks.