Missing Chat Template

#1
by dfrank - opened

I noticed that the tokenizer does not have a chat_template defined, so I created one based on the details available here:

from transformers import AutoTokenizer

# Tokenizer for the base model (repo id shown is illustrative; use this model's id)
tokenizer = AutoTokenizer.from_pretrained("unsloth/llama-3-8b")

# Llama 3 style template: role headers per message, <|eot_id|> as the turn terminator
tokenizer.chat_template = (
    "{{bos_token}}"
    "{% for message in messages %}"
        "<|start_header_id|>{{message['role']}}<|end_header_id|>\n\n{{message['content']}}<|eot_id|>"
    "{% endfor %}"
    "{% if add_generation_prompt %}<|start_header_id|>assistant<|end_header_id|>\n\n"
    "{% else %}{{eos_token}}"
    "{% endif %}"
)

chat = [
    {"role": "system", "content": "You are a very useful assistant"},
    {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
    {"role": "user", "content": "I'd like to show off how chat templating works!"},
]

print(tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True))

This prints:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a very useful assistant<|eot_id|><|start_header_id|>assistant<|end_header_id|>

I'm doing great. How can I help you today?<|eot_id|><|start_header_id|>user<|end_header_id|>

I'd like to show off how chat templating works!<|eot_id|><|start_header_id|>assistant<|end_header_id|>
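
To make the template stick across sessions, it can be saved back alongside the tokenizer. A minimal sketch; the output directory name is arbitrary:

# Writes tokenizer_config.json with the chat_template field included
tokenizer.save_pretrained("llama-3-base-with-chat-template")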

Noob question: the sample Colab sheets use the Alpaca prompt. Is this update necessary, and what extra benefits does it provide? (Since that already works.)

Thanks.

Bro, this is the base model, not the instruct one.

I know, but I wanted to train the base model to follow a set of instructions, and I found the template useful. That said, I don't think there are any significant differences between this template and the Alpaca one; perhaps the only difference is that this template takes advantage of special tokens while the Alpaca one does not (see the sketch below).
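
For comparison, a minimal sketch of the Alpaca-style prompt; this follows the standard Alpaca format, and the exact wording in the Colab sheets may differ:

# Plain-text markers instead of special tokens like <|eot_id|>
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

# Leave the response slot empty at inference time
print(alpaca_prompt.format("Translate to French.", "Hello, world!", ""))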

Unsloth AI org

Oh, we have chat template support in Unsloth if that works. I'll release a conversational notebook today, but for Mistral: https://colab.research.google.com/drive/1Aau3lgPzeZKQ-98h69CCu1UJcvIBLmy2?usp=sharing See also https://github.com/unslothai/unsloth/wiki#chat-templates
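
A minimal sketch of that support, based on the wiki page above; the "llama-3" template name and loading arguments are assumptions to verify against the docs:

from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template

# Load the base model (repo id is illustrative)
model, tokenizer = FastLanguageModel.from_pretrained("unsloth/llama-3-8b")

# Attach a ready-made template instead of hand-writing the Jinja above
tokenizer = get_chat_template(tokenizer, chat_template = "llama-3")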

@ewre324 Alpaca is fine as is, no need to change.

OK, I actually need to correct myself: using the original Llama 3 template on the base model does lead to problems when training with LoRA. I think this is because the special token embeddings are untrained in the base model, as Daniel explains in this post.

If I understand correctly, the solution is either to use a different template (like Alpaca) or to also train the lm_head and embed_tokens layers, as in the sketch below.
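
A minimal sketch of the second option with Unsloth; the target module names beyond the usual attention/MLP projections are assumptions based on the wiki, so verify against the current docs:

from unsloth import FastLanguageModel

# Load the base model (repo id is illustrative)
model, tokenizer = FastLanguageModel.from_pretrained("unsloth/llama-3-8b")

# Include embed_tokens and lm_head so the special tokens used by the
# chat template (<|start_header_id|>, <|eot_id|>, ...) actually get trained
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    lora_alpha = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",
                      "embed_tokens", "lm_head"],
)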
