Update chat template to match the prompt format stated in the model card.

#176

While switching backends, we encountered a quite severe degradation in our Mixtral model's generation results.
Digging deeper into this issue, we found that the tokenizer relied on the HF model config and used the chat_template from there as well.
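
For reference, the template that gets applied can be inspected directly on the tokenizer. A minimal sketch, assuming the Mixtral instruct checkpoint is loaded from the hub:

from transformers import AutoTokenizer

# Load the tokenizer; the Jinja chat template ships with the tokenizer config
# in the model repo, and this is what apply_chat_template() uses.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x7B-Instruct-v0.1")

# Print the raw Jinja template pulled from the repo config.
print(tokenizer.chat_template)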

For the following input:

messages = [
    {"role": "user", "content":"Hello, how are you?"},
    {"role": "assistant", "content":"Good, how are you?"},
    {"role": "user", "content":"Very good!"}
]

conversation_string = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

print(conversation_string)
# -> "<s>[INST] Hello, how are you? [/INST]Good, how are you?</s>[INST] Very good! [/INST]"

The model card states that this should be

<s> [INST] Hello, how are you? [/INST] Good, how are you?</s> [INST] Very good! [/INST]

Although the difference is very limited (only 2 spaces for a 3-message conversation), the difference in generation results is very large.
With the current implementation we often saw the model predict the <eos> token in unexpected places; since we are generating structured output, the returned structure was therefore frequently invalid. This change has resolved that issue, and the generated output is also of noticeably better quality in terms of matching the expected structure.
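
For illustration, here is a rough sketch of a Jinja template that reproduces the model-card format for the example above (reusing the tokenizer and messages defined earlier). The template actually shipped in this change may differ in details such as role-alternation checks:

# Hypothetical template for illustration only; the PR's template may differ.
MODEL_CARD_STYLE_TEMPLATE = (
    "{{ bos_token }}"
    "{% for message in messages %}"
    "{% if message['role'] == 'user' %}"
    "{{ ' [INST] ' + message['content'] + ' [/INST]' }}"
    "{% elif message['role'] == 'assistant' %}"
    "{{ ' ' + message['content'] + eos_token }}"
    "{% endif %}"
    "{% endfor %}"
)

tokenizer.chat_template = MODEL_CARD_STYLE_TEMPLATE
conversation_string = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
print(conversation_string)
# -> "<s> [INST] Hello, how are you? [/INST] Good, how are you?</s> [INST] Very good! [/INST]"

The whitespace placement around [INST], [/INST], and </s> is exactly what differs between the old output and the model-card format shown above.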

Related to the following issue: https://github.com/vllm-project/vllm/issues/2464

Just wanted to say that this seems to have fixed a lot of the issues I was having with my code.

