Multiple eos tokens in chat template.
#95, opened by BobbertWobbert
I noticed that a question about multiple EOS tokens in the chat template was not fully answered in this post.
The chat template adds an EOS token after every user-assistant message pair. I just want to confirm that this is intended and not a bug in the chat template. Thanks.
Example:
print(tokenizer.apply_chat_template([{'role':'user','content':'inst1'},{'role':'assistant','content':'aswer1'}, {'role':'user','content':'inst2'},{'role':'assistant','content':'answ2'}],tokenize=False,add_generation_prompt=False))
<s>[INST] inst1 [/INST]aswer1</s> [INST] inst2 [/INST]answ2</s>
My understanding is that </s> marks the end of a response; without it, we would not know when to stop during inference.
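To make the question concrete, here is a rough sketch in plain Python of the formatting the example output implies. It is not the model's actual Jinja chat template: `render_chat` and `cut_at_eos` are hypothetical helpers written only to reproduce the string printed above, and the spacing rules are inferred from that one example. It shows one `</s>` per assistant turn, and how cutting at the first `</s>` would serve as the stop signal for a generated reply.

```python
BOS, EOS = "<s>", "</s>"  # special-token strings as they appear in the output above


def render_chat(messages):
    """Hypothetical stand-in for the chat template, inferred from the example.

    User turns are wrapped in [INST] ... [/INST]; every assistant turn is
    followed by EOS, so a multi-turn conversation contains several EOS tokens.
    """
    out = BOS
    for i, msg in enumerate(messages):
        if msg["role"] == "user":
            if i > 0:
                out += " "  # single space between a finished turn and the next [INST]
            out += f"[INST] {msg['content']} [/INST]"
        elif msg["role"] == "assistant":
            out += msg["content"] + EOS  # EOS closes each assistant response
        else:
            raise ValueError(f"unsupported role: {msg['role']}")
    return out


def cut_at_eos(generated):
    """Keep only the text before the first EOS, i.e. where decoding would stop."""
    return generated.split(EOS, 1)[0]


messages = [
    {"role": "user", "content": "inst1"},
    {"role": "assistant", "content": "aswer1"},
    {"role": "user", "content": "inst2"},
    {"role": "assistant", "content": "answ2"},
]
rendered = render_chat(messages)
print(rendered)
print(rendered.count(EOS))  # one EOS per assistant turn
```

Under this reading, the repeated `</s>` is just each assistant turn carrying its own end marker, which is the same marker that inference relies on to stop after a single reply.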