SFT model produces unstoppable \n in its output
#10 by lucasjin - opened
I found that no matter how long I train, using a 4k max length, the output always contains many \n characters and never stops. Why?
Which framework were you using to fine-tune this? Does it pad with '\n'? We have provided a fine-tuning example at https://github.com/QwenLM/Qwen. Please take a look.
For SFT/Chat models, a dialogue pattern needs to be designed. Qwen-Chat uses ChatML (<|im_start|>user\n...<|im_end|>\n<|im_start|>assistant\n...<|im_end|>\n), and some models use plain text, e.g., \n\nHuman: ...\n\nAssistant: ...
Generation is ended by detecting turn boundaries, e.g., <|im_start|> and <|im_end|>, or \n\nHuman: and \n\nAssistant:. If the fine-tuning data or the inference code does not use the same boundaries, generation may not stop where you expect.
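To make the turn-boundary idea concrete, here is a minimal sketch in plain Python. It is not the Qwen implementation: the helper names `build_chatml_prompt` and `truncate_at_stop` are hypothetical, and it assumes stopping is done by string matching on the decoded text rather than on token ids.

```python
# Hypothetical helpers for illustration only; not part of the Qwen codebase.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(messages):
    """Format (role, content) pairs as ChatML and open the assistant turn."""
    parts = [f"{IM_START}{role}\n{content}{IM_END}\n" for role, content in messages]
    # Leave the assistant turn open so the model writes the reply.
    parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


def truncate_at_stop(text, stop_strings=(IM_END, IM_START)):
    """Cut generated text at the earliest turn-boundary marker, if any."""
    cut = len(text)
    for stop in stop_strings:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


prompt = build_chatml_prompt([("user", "Hello")])
reply = truncate_at_stop("Hi there!<|im_end|>\n\n\n\n")
print(repr(prompt))
print(repr(reply))
```

For a plain-text pattern, the same function could be called with `stop_strings=("\n\nHuman:", "\n\nAssistant:")`. In SFT data, each assistant reply would likewise need to end with the boundary marker so the model learns to emit it.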
jklj077 changed discussion status to closed