Prompt format for fine-tuning

#51
by skevja - opened

I have a custom dataset of question-answer pairs that I want to use for fine-tuning, but I'm not sure how to format it for command-r.
Should I wrap each question in "<|START_OF_TURN_TOKEN|><|USER_TOKEN|> {QUESTION}<|END_OF_TURN_TOKEN|>" and each answer in "<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|> {ANSWER}<|END_OF_TURN_TOKEN|>"?
Will this be enough?
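
For concreteness, here is a minimal sketch of the wrapping described above, in plain Python. The special-token strings are copied from the question. The helper name `format_pair`, the `sep` parameter, and the sample pair are hypothetical and only for illustration. Whether a space belongs after the role token, and whether a BOS token has to be prepended, are assumptions not confirmed in this thread, so the output is worth comparing against the chat template shipped with the model's tokenizer.

```python
# Special-token strings copied verbatim from the question above.
START = "<|START_OF_TURN_TOKEN|>"
END = "<|END_OF_TURN_TOKEN|>"
USER = "<|USER_TOKEN|>"
CHATBOT = "<|CHATBOT_TOKEN|>"


def format_pair(question: str, answer: str, sep: str = " ") -> str:
    """Hypothetical helper: wrap one question-answer pair as two turns.

    `sep` is the text placed between the role token and the content.
    It defaults to a single space to mirror the question; pass sep=""
    if the model's own template turns out not to use one (assumption).
    """
    user_turn = f"{START}{USER}{sep}{question.strip()}{END}"
    bot_turn = f"{START}{CHATBOT}{sep}{answer.strip()}{END}"
    return user_turn + bot_turn


# Made-up sample data, only to show the call.
pairs = [("What is the capital of France?", "Paris.")]
examples = [format_pair(q, a) for q, a in pairs]
print(examples[0])
```

If the Transformers tokenizer for the model is available, rendering the same pair through `tokenizer.apply_chat_template(messages, tokenize=False)` and diffing it against this string is one way to check the whitespace and BOS questions.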
