chat_template seems like it was converted incorrectly.

#10
by xzuyn - opened

The newlines and whitespace don't seem to be converted to a string correctly.

I assume it should be something like the template below. But even with this change the system turn isn't handled correctly: with system + user it looks to be formatted correctly, but with system + user + assistant the system turn disappears.

"chat_template": "{%- if messages[0]['role'] == 'system' %}{%- set system_message = messages[0]['content'] %}{%- set loop_messages = messages[1:] %}{%- else %}{%- set loop_messages = messages %}{%- endif %}{{- bos_token }}{%- for message in loop_messages %}{%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{- raise_exception('After the optional system message, conversation roles must alternate user/assistant/user/assistant/...') }}{%- endif %}{%- if message['role'] == 'user' %}{%- if loop.last and system_message is defined %}{{- '[INST] ' + system_message + '\n\n' + message['content'] + '[/INST]' }}{%- else %}{{- '[INST] ' + message['content'] + '[/INST]' }}{%- endif %}{%- elif message['role'] == 'assistant' %}{{- ' ' + message['content'] + eos_token}}{%- else %}{{- raise_exception('Only user and assistant roles are supported, with the exception of an initial optional system message!') }}{%- endif %}{%- endfor %}"

Without this change it formats like this:
image.png

With this change it formats like this:
Screenshot from 2024-07-18 21-22-54.png

These are the messages used for the screenshots, rendered in Xenova's Jinja Playground HF Space:

[
  {
    "role": "system",
    "content": "You are a friendly chatbot who always responds in the style of a pirate"
  },
  {
    "role": "user",
    "content": "How many helicopters can a human eat in one sitting?"
  },
  {
    "role": "assistant",
    "content": "Arrg. What be a helicopter?"
  }
]
Mistral AI_ org

Hi there! 👋 I believe the problem is that you copied the escaped string into the left panel (including the backslashes, which won’t be there when unescaped)
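As a rough illustration of the escaped vs. unescaped difference: the value stored in the config file is a JSON string, so it has to be JSON-decoded before it is pasted anywhere as a Jinja template. The sketch below uses a shortened, hypothetical fragment of the config (not the real file), and `load_chat_template` is just a helper name made up for this example.

```python
import json


def load_chat_template(config_text: str) -> str:
    """Return the unescaped chat_template from tokenizer_config.json-style text."""
    return json.loads(config_text)["chat_template"]


# Hypothetical, shortened config fragment as it would appear on disk.
# The raw string keeps the backslashes exactly as they are in the file.
raw_config = r'''{"chat_template": "{{- '[INST] ' + system_message + '\n\n' + message['content'] + '[/INST]' }}"}'''

template = load_chat_template(raw_config)

# After decoding, the two-character sequence backslash + n has become a real
# newline, and no backslashes remain in the template text.
print(template)
```

Pasting `template` (rather than the raw text between the quotes in the file) into the playground should avoid the stray backslashes.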

I have it copied over correctly this time. There is still this issue though:

> When you have system + user it looks to be formatted correctly, but when you have system + user + assistant the system turn disappears.

Screenshot from 2024-07-25 21-40-08.png
Screenshot from 2024-07-25 21-39-54.png

> But even with this change the system turn isn't handled correctly. When you have system + user it looks to be formatted correctly, but when you have system + user + assistant the system turn disappears.

This can be fixed by changing {%- if loop.last and system_message is defined %}, for example to {%- if loop.index<=1 and system_message is defined %}, so that the system prompt is included before the user message whenever that user message is the very first message in the loop_messages list.
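To make the difference concrete, here is a minimal sketch that re-implements the template's logic in plain Python (it is not Jinja itself, and not the tokenizer's actual code). The `render` function, its `system_with` parameter, and the `<s>` / `</s>` token strings are assumptions made for this illustration; `"last"` mimics `loop.last` and `"first"` mimics `loop.index<=1`.

```python
BOS, EOS = "<s>", "</s>"  # assumed values for bos_token / eos_token


def render(messages, system_with="last"):
    """Plain-Python imitation of the chat template posted in this thread.

    system_with="last"  -> attach the system prompt to the last loop message (loop.last)
    system_with="first" -> attach it to the first loop message (loop.index <= 1)
    """
    system_message = None
    if messages[0]["role"] == "system":
        system_message = messages[0]["content"]
        loop_messages = messages[1:]
    else:
        loop_messages = messages

    out = BOS
    for i, message in enumerate(loop_messages):
        # Same alternation check as the template: user on even, assistant on odd positions.
        if (message["role"] == "user") != (i % 2 == 0):
            raise ValueError("conversation roles must alternate user/assistant")
        if message["role"] == "user":
            is_last = i == len(loop_messages) - 1
            is_first = i == 0
            attach = is_last if system_with == "last" else is_first
            if attach and system_message is not None:
                out += "[INST] " + system_message + "\n\n" + message["content"] + "[/INST]"
            else:
                out += "[INST] " + message["content"] + "[/INST]"
        elif message["role"] == "assistant":
            out += " " + message["content"] + EOS
        else:
            raise ValueError("only user and assistant roles are supported")
    return out


messages = [
    {"role": "system", "content": "You are a friendly chatbot who always responds in the style of a pirate"},
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
    {"role": "assistant", "content": "Arrg. What be a helicopter?"},
]

print(repr(render(messages, system_with="last")))   # system prompt is missing
print(repr(render(messages, system_with="first")))  # system prompt precedes the first user message
```

With only system + user, both variants give the same output, since the single user message is both first and last. Note that the two variants also differ for longer conversations: `"last"` puts the system prompt in front of the final user message, while `"first"` always puts it in front of the first one.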

Mistral AI_ org

It seems that this issue will never come up in practice. Since the model was trained on system/user/assistant/user/assistant conversations, the last message you send to the model will always be a user message, not an assistant message. When the last message is a user message, there is no issue. The solution provided by @jackzhang should work, but in practice the original template should be good enough!

Is there a circumstance where you would want to send the model a completion with an assistant message as the last message? @xzuyn

> But even with this change the system turn isn't handled correctly. When you have system + user it looks to be formatted correctly, but when you have system + user + assistant the system turn disappears.
>
> This can be fixed by changing {%- if loop.last and system_message is defined %}, for example to {%- if loop.index<=1 and system_message is defined %}, so that the system prompt is included before the user message whenever that user message is the very first message in the loop_messages list.

Yea that seems to fix it.

> Is there a circumstance where you would want to send the model a completion with an assistant message as the last message?

One is using the chat_template to format and tokenize training samples, which end with an assistant turn. From what I can see, @jackzhang's edit would handle that correctly.
