Prompt format

#12
by Thireus - opened

Proof of concept to construct valid Mixtral prompts using Python:

Set up a Python environment:

virtualenv-3.10 --python=python3.10 ~/test
cd ~/test
source bin/activate
pip install mistral-common transformers jinja2

Create a file named modified_script.py (a lightly modified version of the example script from https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1):

from mistral_common.protocol.instruct.messages import (
    AssistantMessage,
    UserMessage,
)
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.tokens.instruct.normalize import ChatCompletionRequest

from transformers import AutoTokenizer

# Not used below, but mistral-common can also tokenize the request
# directly via tokenizer_v3.encode_chat_completion(mistral_query)
tokenizer_v3 = MistralTokenizer.v3()

mistral_query = ChatCompletionRequest(
    messages=[
        UserMessage(content="How many experts ?"),
        AssistantMessage(content="8"),
        UserMessage(content="How big ?"),
        AssistantMessage(content="22B"),
        UserMessage(content="Noice 🎉 !"),
    ],
    model="test",
)
hf_messages = mistral_query.model_dump()['messages']

tokenizer_hf = AutoTokenizer.from_pretrained('mistralai/Mixtral-8x22B-Instruct-v0.1')

print(tokenizer_hf.apply_chat_template(hf_messages, tokenize=False))

Execute the script:

python modified_script.py

Output of the above command:

<s> [INST] How many experts ? [/INST] 8 </s> [INST] How big ? [/INST] 22B </s> [INST] Noice 🎉 ! [/INST]
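For reference, the template in that output can be reproduced in a few lines of plain Python. This is only an illustrative sketch of the pattern (it assumes strictly alternating user/assistant turns); apply_chat_template or mistral-common remain the authoritative sources:

```python
def build_mixtral_prompt(messages):
    """Illustrative sketch of the Mixtral instruct format shown above.

    Not authoritative: use apply_chat_template or mistral-common in
    practice. Assumes strictly alternating user/assistant turns.
    """
    prompt = "<s>"
    for message in messages:
        if message["role"] == "user":
            # Each user turn is wrapped in [INST] ... [/INST]
            prompt += f" [INST] {message['content']} [/INST]"
        else:
            # Each assistant turn is closed with </s>
            prompt += f" {message['content']} </s>"
    return prompt

messages = [
    {"role": "user", "content": "How many experts ?"},
    {"role": "assistant", "content": "8"},
    {"role": "user", "content": "How big ?"},
    {"role": "assistant", "content": "22B"},
    {"role": "user", "content": "Noice 🎉 !"},
]
print(build_mixtral_prompt(messages))  # matches the output shown above
```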

Hope this helps.

Just realized it's also documented here:

https://huggingface.co/docs/transformers/main/en/chat_templating

Thireus changed discussion status to closed
