Add chat template to tokenizer_config

#1

Seems to work in ooba.

What is that |safe doing?

Can someone confirm this? I don't run LLMs on Ooba. Is it good to go ahead and merge?

What is that |safe doing?

The |safe filter is used in the provided template to ensure that the content of the messages is rendered as plain text, without any potential HTML or script tags being interpreted or executed.
I did this quickly with claude at 2 am. But we may not way it in this context, since it may sanitize chars we actually want to send the LLM.

Can someone confirm this? I don't run LLMs on Ooba. Is it good to go ahead and merge?

When we figure out a nice way to add the chat template, I think it would be a good practice to include it for further releses. It's becoming standard, and used by many backends. Personally, I run models with an aphrodite engine docker, and the engine go look for this config to manage the messages I send it through api.

Here's a revised version. We should figure out if we need to explicitly add the bos token. The exception is probably useless, since people use libraries and apis that will handle this.
Maybe @ehartford can advise us?

Ready to merge
This branch is ready to get merged automatically.

Sign up or log in to comment