Text Generation
Transformers
PyTorch
mpt
Composer
MosaicML
llm-foundry
custom_code
text-generation-inference

Add chat_template to tokenizer_config.json

#40
Mosaic ML, Inc. org
edited Jan 10

Manually tested with

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('mosaicml/mpt-7b-chat', revision='ed874721')

chat = [
    {"role": "system", "content": "This is a system prompt!"},
   {"role": "user", "content": "Hello, how are you?"},
   {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
   {"role": "user", "content": "I'd like to show off how chat templating works!"},
]

print(tokenizer.apply_chat_template(chat, tokenize=False))

# Remove system prompt
chat = chat[1:]

print("\nUsing default system prompt!\n")

print(tokenizer.apply_chat_template(chat, tokenize=False))

output:

<|im_start|>system
This is a system prompt!
<|im_start|>user
Hello, how are you?<|im_end|>
<|im_start|>assistant
I'm doing great. How can I help you today?<|im_end|><|endoftext|>
<|im_start|>user
I'd like to show off how chat templating works!<|im_end|>

Using default system prompt!

<|im_start|>system
A conversation between a user and an LLM-based AI assistant. The assistant gives helpful and honest answers.
<|im_start|>user
Hello, how are you?<|im_end|>
<|im_start|>assistant
I'm doing great. How can I help you today?<|im_end|><|endoftext|>
<|im_start|>user
I'd like to show off how chat templating works!<|im_end|>
Mosaic ML, Inc. org

LGTM, please get @sam-mosaic to sign off as well.

Mosaic ML, Inc. org

The default system prompt should be one of the two it saw during training (which is different than the default for the 7b-8k and 30b models), either

You are Assistant. You were made to answer questions and be helpful.
- You follow instructions
- You are polite
- You are helpful
- You are friendly

or

- You are a helpful assistant chatbot trained by MosaicML.
- You answer questions.
- You are excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- You are more than just an information source, you are also able to write poetry, short stories, and make jokes.
Ready to merge
This branch is ready to get merged automatically.

Sign up or log in to comment