Add chat_template to tokenizer_config.json?

#1
by leonardlin - opened

I was trying to test out the model (to run JA MT-Bench) and was not getting very good results. One issue may be that my chat formatting is wrong - it appears to be Vicuna-like, but the model card does not specify what prompts should look like. One thing that would help with formatting is adding a chat_template to tokenizer_config.json.

Barring that, any parameters you recommend (a recommended system prompt, repetition penalty, other sampling parameters, etc.) would be useful.

OrionStarAI org

Thanks! I'll add it today!

OrionStarAI org

done

@DachengZhang - is there support for a system prompt? Maybe it'd look something like:

  "chat_template": "{% for message in messages %}{% if loop.first %}{{ bos_token }}{% endif %}{% if message['role'] == 'system' %}{{ message['content'] + '\n\n ' }}{% elif message['role'] == 'user' %}{{ 'Human: ' + message['content'] + '\n\nAssistant: ' + eos_token }}{% elif message['role'] == 'assistant' %}{{ message['content'] + eos_token }}{% endif %}{% endfor %}",
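For what it's worth, a proposed template like this can be sanity-checked locally by rendering it with Jinja2 (the same engine `transformers` uses for chat templates) before putting it in tokenizer_config.json. The message list and the `<s>`/`</s>` token values below are just illustrative placeholders, not necessarily this model's actual special tokens:

```python
from jinja2 import Template

# The proposed template string; the '\n' escapes become real newlines,
# matching what JSON decoding of tokenizer_config.json would produce.
chat_template = (
    "{% for message in messages %}{% if loop.first %}{{ bos_token }}{% endif %}"
    "{% if message['role'] == 'system' %}{{ message['content'] + '\n\n ' }}"
    "{% elif message['role'] == 'user' %}"
    "{{ 'Human: ' + message['content'] + '\n\nAssistant: ' + eos_token }}"
    "{% elif message['role'] == 'assistant' %}"
    "{{ message['content'] + eos_token }}{% endif %}{% endfor %}"
)

# Illustrative conversation and token values (placeholders, not verified
# against this model's config).
messages = [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there!"},
]

rendered = Template(chat_template).render(
    messages=messages, bos_token="<s>", eos_token="</s>"
)
print(rendered)
```

Once the field is in tokenizer_config.json, `tokenizer.apply_chat_template(messages, tokenize=False)` should produce the same string.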

Or is it something the model is completely untrained for? (If so, that's fine; it might be better to train off the Base in any case.)

OrionStarAI org

Yes, this model was not trained with a system role. We will add system-role support in the next version.
