migtissera/Tess-7B-v2.0 · Add chat template to tokenizer

Handgun1773

Mar 26

Seems to work in ooba.

Add chat template to tokenizer_configfc059f4a

Suparious

Mar 26

What is that |safe doing?

migtissera

Owner Mar 26

Can someone confirm this? I don't run LLMs on Ooba. Is it good to go ahead and merge?

Handgun1773

Mar 26

•

edited Mar 26

What is that |safe doing?

The |safe filter is used in the provided template to ensure that the content of the messages is rendered as plain text, without any potential HTML or script tags being interpreted or executed.
I did this quickly with claude at 2 am. But we may not way it in this context, since it may sanitize chars we actually want to send the LLM.

Can someone confirm this? I don't run LLMs on Ooba. Is it good to go ahead and merge?

When we figure out a nice way to add the chat template, I think it would be a good practice to include it for further releses. It's becoming standard, and used by many backends. Personally, I run models with an aphrodite engine docker, and the engine go look for this config to manage the messages I send it through api.

Update tokenizer config with inspiration from eric's dolphin.5d643fea

Handgun1773

Mar 26

•

edited Mar 26

Here's a revised version. We should figure out if we need to explicitly add the bos token. The exception is probably useless, since people use libraries and apis that will handle this.
Maybe @ehartford can advise us?