SFT model outputs endless \n that never stops

#10
by lucasjin - opened

No matter how long I train, with a max length of 4k, the output always contains many \n characters and generation never stops.
Why?

Which framework did you use to fine-tune this? Does it pad with '\n'? We provide a fine-tuning example at https://github.com/QwenLM/Qwen. Please take a look.

For SFT/Chat models, a dialogue pattern needs to be designed. Qwen-Chat uses ChatML (<|im_start|>user\n...<|im_end|>\n<|im_start|>assistant\n...<|im_end|>\n), while some other models use plain text, e.g., \n\nHuman: ...\n\nAssistant: ...
Generation is ended by detecting turn boundaries, e.g., <|im_start|> and <|im_end|> for ChatML, or \n\nHuman: and \n\nAssistant: for the plain-text pattern.
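
To make the pattern above concrete, here is a minimal sketch in plain Python. It is not the code from the Qwen repository: the helper names (build_chatml_prompt, build_training_text, truncate_at_boundary) are hypothetical and only illustrate the idea. The assumption is that each assistant turn in the training text ends with <|im_end|>, so the model has a boundary to learn, and that the decoded output is cut at the first boundary string.

```python
# Hypothetical helpers for illustration only; not part of any Qwen API.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(messages):
    """Render (role, content) pairs as ChatML and open an assistant turn."""
    parts = [f"{IM_START}{role}\n{content}{IM_END}\n" for role, content in messages]
    parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


def build_training_text(messages, answer):
    """Prompt plus the target answer, closed with <|im_end|> (assumed SFT target)."""
    return build_chatml_prompt(messages) + f"{answer}{IM_END}\n"


def truncate_at_boundary(text, stop_strings=(IM_END, IM_START)):
    """Cut decoded text at the earliest turn-boundary string, if one occurs."""
    cut = len(text)
    for stop in stop_strings:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


prompt = build_chatml_prompt([("user", "Hello")])
sample = build_training_text([("user", "Hello")], "Hi there!")

# Simulated raw decode that runs past the turn boundary into trailing newlines.
raw = "Hi there!<|im_end|>\n<|im_start|>user\n\n\n\n"
reply = truncate_at_boundary(raw)

# The same helper works for the plain-text pattern with different stop strings.
plain_reply = truncate_at_boundary(
    "Sure.\n\nHuman: next question",
    stop_strings=("\n\nHuman:", "\n\nAssistant:"),
)
```

If the fine-tuning data omits the closing boundary or the framework pads with '\n', the model may never see a stop signal, which would be consistent with the endless newlines described in the question.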

jklj077 changed discussion status to closed
