Inference servers fail with multiple chat templates

#5 opened by stelterlab

Hi!

I tried to serve the model with inference servers such as vLLM and llama.cpp (as GGUF), and both fail when they detect the multiple chat templates for the different languages.

```
vllm        | ERROR 11-27 06:51:09 serving_chat.py:170] ValueError: This model has multiple chat templates with no default specified! Please either pass a chat template or the name of the template you wish to use to the `chat_template` argument. Available template names are ['BG', 'CS', 'DA', 'DE', 'EL', 'EN', 'ES', 'ET', 'FI', 'FR', 'GA', 'HR', 'HU', 'IT', 'LT', 'LV', 'MT', 'NL', 'PL', 'PT', 'RO', 'SK', 'SL', 'SV']
```

With vLLM you cannot choose between the template names; you can only pass a complete chat template as an alternative.
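
For comparison, plain transformers does let you select one of the named templates, so the limitation is on the server side. A minimal sketch (the message content is just an example; transformers resolves the name when the tokenizer carries a dict of named templates):

```python
from transformers import AutoTokenizer

# trust_remote_code: the repo ships a custom tokenizer class (gptx_tokenizer.py).
tokenizer = AutoTokenizer.from_pretrained(
    "openGPT-X/Teuken-7B-instruct-research-v0.4", trust_remote_code=True
)

# Role casing follows the model card examples.
messages = [{"role": "User", "content": "Wie geht es dir?"}]

# With a dict of named templates, apply_chat_template accepts the template
# name -- exactly the selection the vLLM endpoint does not expose.
prompt = tokenizer.apply_chat_template(
    messages, chat_template="DE", tokenize=False, add_generation_prompt=True
)
print(prompt)
```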

A more generic chat template, as used by Mistral & Co., would solve this.

My approach:

```jinja
{%- if messages[0]["role"] == "system" %}
{{- messages[0]['role']|capitalize + ': ' + messages[0]['content'] + '\n' }}
{%- set loop_messages = messages[1:] %}
{%- else %}
{#- Fallback system prompt. English: "A conversation between a human and an artificial intelligence assistant. The assistant gives helpful and polite answers to the human's questions." #}
System: Ein Gespräch zwischen einem Menschen und einem Assistenten mit künstlicher Intelligenz. Der Assistent gibt hilfreiche und höfliche Antworten auf die Fragen des Menschen.{{- '\n'}}
{%- set loop_messages = messages %}
{%- endif %}
{%- for message in loop_messages %}
{%- if (message['role']|lower == 'user') != (loop.index0 % 2 == 0) %}
{{- raise_exception('Roles must alternate User/Assistant/User/Assistant/...') }}
{%- endif %}
{%- if message['role']|lower == 'user' %}
{{- message['role']|capitalize + ': ' + message['content'] + '\n' }}
{%- elif message['role']|lower == 'assistant' %}
{{- message['role']|capitalize + ': ' + message['content'] + eos_token + '\n' }}
{%- else %}
{{- raise_exception('Only user and assistant roles are supported!') }}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
{{- 'Assistant: '}}
{%- endif %}
```

This worked for me. As far as I know, the fixed default system prompt I added as a fallback is not standard in the original language-specific templates.
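
If you want to try this yourself before wiring it into a server, you can override the tokenizer's template dict locally. A quick sketch, assuming the template above is saved as a local file named template.jinja:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "openGPT-X/Teuken-7B-instruct-research-v0.4", trust_remote_code=True
)

# Replace the per-language template dict with the single generic template.
with open("template.jinja") as f:
    tokenizer.chat_template = f.read()

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
# Expected output:
# System: You are a helpful assistant.
# User: Hello!
# Assistant:
```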

Kind regards,

Christian Stelter

The problem is that the tokenizer config of this model does not even specify a chat template. https://huggingface.co/openGPT-X/Teuken-7B-instruct-research-v0.4/blob/main/tokenizer_config.json

It would be great if the authors could specify what instruction template was used during training.

Hi @stelterlab, thanks for your investigations. We have now added a default system prompt, which is used if no chat_template language is provided: https://huggingface.co/openGPT-X/Teuken-7B-instruct-research-v0.4/commit/0aaf3bb89362d880fe0495aa142d10cfb61f7419

@mbrack thanks for pointing this out. We have specified the chat template directly within the tokenizer file: https://huggingface.co/openGPT-X/Teuken-7B-instruct-research-v0.4/blob/main/gptx_tokenizer.py#L458

We also added a sample for usage with the vLLM server to the README: https://huggingface.co/openGPT-X/Teuken-7B-instruct-research-v0.4#usage-with-vllm-server
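
For reference, once the vLLM server is running, it can be queried with the standard OpenAI-compatible client. A minimal sketch, assuming the default host/port and the capitalized role names from the model card:

```python
from openai import OpenAI

# vLLM exposes an OpenAI-compatible endpoint; the API key is a dummy value.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="openGPT-X/Teuken-7B-instruct-research-v0.4",
    messages=[{"role": "User", "content": "Hallo"}],
)
print(completion.choices[0].message.content)
```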
