mistralai/Mistral-7B-Instruct-v0.1 · apply_chat_template result of Mistral is not restrictly align to the template on its website

(This is a crossposting of this issue)


from transformers import AutoTokenizer
chat = [
  {"role": "user", "content": "USER_INSTRUCTION_1"},
  {"role": "assistant", "content": "RESPONSE_1"},
  {"role": "user", "content": "USER_INSTRUCTION_2"},
  {"role": "assistant", "content": "RESPONSE_2"},
]
res_apply_chat_template = tokenizer.apply_chat_template(chat, tokenize=False)
res_mistral_website = '<s>[INST] USER_INSTRUCTION_1 [/INST] RESPONSE_1</s>[INST] USER_INSTRUCTION_2 [/INST] RESPONSE_2</s>'
print(res_apply_chat_template)   
print(res_mistral_website)

The result is:

'<s>[INST] USER_INSTRUCTION_1 [/INST]RESPONSE_1</s> [INST] USER_INSTRUCTION_2 [/INST]RESPONSE_2</s> '
'<s>[INST] USER_INSTRUCTION_1 [/INST] RESPONSE_1</s>[INST] USER_INSTRUCTION_2 [/INST] RESPONSE_2</s>'

There are two main difference:

According to the Mistral 7B website: https://docs.mistral.ai/usage/guardrailing/#appendix
There is always a blank after [INST] and [/INST], but result of apply_chat_template seems not following it.
In res_apply_chat_template, There is an additional blank at the end of a turn.

I also encode the two sentences and decode them back. The results show that the word will be tokenized into different tokens because of the blank after [/INST]:

decoded_apply_chat_template = []
for a in ids_apply_chat_template:
    decoded_apply_chat_template.append(tokenizer.decode(a))

decoded_mistral_website = []
for b in ids_mistral_website:
    decoded_mistral_website.append(tokenizer.decode(b))

#decoded_apply_chat_template
['<s>', '[', 'INST', ']', 'US', 'ER', '_', 'IN', 'STRU', 'CTION', '_', '1', '[', '/', 'INST', ']', 'RE', 'SP', 'ON', 'SE', '_', '1', '</s>', '', '[', 'INST', ']', 'US', 'ER', '_', 'IN', 'STRU', 'CTION', '_', '2', '[', '/', 'INST', ']', 'RE', 'SP', 'ON', 'SE', '_', '2', '</s>', ' ']

#decoded_mistral_website
['<s>', '[', 'INST', ']', 'US', 'ER', '_', 'IN', 'STRU', 'CTION', '_', '1', '[', '/', 'INST', ']', 'RES', 'P', 'ON', 'SE', '_', '1', '</s>', '[', 'INST', ']', 'US', 'ER', '_', 'IN', 'STRU', 'CTION', '_', '2', '[', '/', 'INST', ']', 'RES', 'P', 'ON', 'SE', '_', '2', '</s>']

I guess it's okay to do either way, but shall we better to align with how it was been done during finetuning?