Text Generation
Transformers
PyTorch
mistral
openchat
C-RLFT
conversational
Inference Endpoints
text-generation-inference

How to setup system message

#5
by fernandofernandes - opened

Hi! Congratulations for your awesome work. In the prompt template, how do we setup the system message?

Agreed, and, what is the prompt template in general?

I see this in a random space that claims to host openchat, although the space is erroring: https://huggingface.co/spaces/rishiraj/OpenChat/blob/main/app.py#L9

deleted

I also found this: https://huggingface.co/spaces/TogetherAI/EinfachLlaMistral/blob/main/app.py#L8

def format_prompt(message, history):
    prompt = "<s>"
    prompt += ("[IDENTITY] You are Ailex, a clone and close collaborator of Einfach.Alex. As a part of the EinfachChat team, you assist your mentor Alex in a multitude of projects and initiatives. Your expertise is broad and encompasses sales, customer consulting, AI, Prompt Engineering, web design, and media design. Your life motto is 'Simply.Do!'. You communicate exclusively in German. [/IDENTITY]")
    for user_prompt, bot_response in history:
        prompt += f"[INST] {user_prompt} [/INST]"
        prompt += f" {bot_response}</s> "
    prompt += f"[INST] {message} [/INST]"
    return prompt

It'd be nice if confirmed by the authors though.

deleted

Hmm, their github says this, but, leaves it to you to infer the correct template:

import transformers
tokenizer = transformers.AutoTokenizer.from_pretrained("openchat/openchat_3.5")

# Single-turn
tokens = tokenizer("GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant:").input_ids
assert tokens == [1, 420, 6316, 28781, 3198, 3123, 1247, 28747, 22557, 32000, 420, 6316, 28781, 3198, 3123, 21631, 28747]

# Multi-turn
tokens = tokenizer("GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant: Hi<|end_of_turn|>GPT4 Correct User: How are you today?<|end_of_turn|>GPT4 Correct Assistant:").input_ids
assert tokens == [1, 420, 6316, 28781, 3198, 3123, 1247, 28747, 22557, 32000, 420, 6316, 28781, 3198, 3123, 21631, 28747, 15359, 32000, 420, 6316, 28781, 3198, 3123, 1247, 28747, 1602, 460, 368, 3154, 28804, 32000, 420, 6316, 28781, 3198, 3123, 21631, 28747]

# Coding Mode
tokens = tokenizer("Code User: Implement quicksort using C++<|end_of_turn|>Code Assistant:").input_ids
assert tokens == [1, 7596, 1247, 28747, 26256, 2936, 7653, 1413, 334, 1680, 32000, 7596, 21631, 28747]
OpenChat org

@fernandofernandes The system prompt should be appended before the conversation and ended with <|end_of_turn|> as shown below

You are a helpful assistant.<|end_of_turn|>GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant:

@imone Do insert GPT4 Correct User: and GPT4 Correct Assistant : before query and response? Can we add Turbo also ?

OpenChat org

Hmm, their github says this, but, leaves it to you to infer the correct template:

import transformers
tokenizer = transformers.AutoTokenizer.from_pretrained("openchat/openchat_3.5")

# Single-turn
tokens = tokenizer("GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant:").input_ids
assert tokens == [1, 420, 6316, 28781, 3198, 3123, 1247, 28747, 22557, 32000, 420, 6316, 28781, 3198, 3123, 21631, 28747]

# Multi-turn
tokens = tokenizer("GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant: Hi<|end_of_turn|>GPT4 Correct User: How are you today?<|end_of_turn|>GPT4 Correct Assistant:").input_ids
assert tokens == [1, 420, 6316, 28781, 3198, 3123, 1247, 28747, 22557, 32000, 420, 6316, 28781, 3198, 3123, 21631, 28747, 15359, 32000, 420, 6316, 28781, 3198, 3123, 1247, 28747, 1602, 460, 368, 3154, 28804, 32000, 420, 6316, 28781, 3198, 3123, 21631, 28747]

# Coding Mode
tokens = tokenizer("Code User: Implement quicksort using C++<|end_of_turn|>Code Assistant:").input_ids
assert tokens == [1, 7596, 1247, 28747, 26256, 2936, 7653, 1413, 334, 1680, 32000, 7596, 21631, 28747]

@Saugatkafley Yes, as shown in the example above. BTW, what do you mean Turbo?

Oh, thanks . I was asking , like GPT4 , GPT3.5-turbo name works or not while prompting . Anyway . Thank you!VM

If my

@fernandofernandes The system prompt should be appended before the conversation and ended with <|end_of_turn|> as shown below

You are a helpful assistant.<|end_of_turn|>GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant:

I tried to write up a fastchat conversation. It is as follows for Multi-turn Chat

# https://huggingface.co/openchat/openchat_3.5/discussions/5#65448109b4a3f3a2f486fd9d
register_conv_template(
    Conversation(
        name="openchat-v3.5",
        system_template="""{system_message}""",
        roles=("GPT4 Correct User", "GPT4 Correct Assistant"),
        messages=(),
        offset=0,
        sep_style=SeparatorStyle.FALCON_CHAT,
        sep="<|end_of_turn|>",
        stop_str=["</s>", "<|end_of_turn|>"],
    )
)

@fernandofernandes The system prompt should be appended before the conversation and ended with <|end_of_turn|> as shown below

You are a helpful assistant.<|end_of_turn|>GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant:

should we always use the system prompt when we chat with the openchat models?

deleted

This would be nice if it were added into tokenizers, EG tokenizer.apply_chat_template( ... )

This would be nice if it were added into tokenizers, EG tokenizer.apply_chat_template( ... )

I believe the template for Chat is: "{% for message in messages %}{% if message['role'] == 'user' %}{{ 'GPT4 Correct User: ' + message['content'] + eos_token }}{% elif message['role'] == 'system' %}{{ message['content'] + eos_token }}{% elif message['role'] == 'assistant' %}{{ 'GPT4 Correct Assistant: ' + message['content'] + eos_token }}{% endif %}{% if loop.last and add_generation_prompt %}{{ 'GPT4 Correct Assistant: ' }}{% endif %}{% endfor %}" and for Code Assistant, Just replace GPT4 Correct by Code.
You can add your chat_template by simple code.

custom_template = "some_template"
tokenizer.chat_template = custom_template
This comment has been hidden

Hmm, their github says this, but, leaves it to you to infer the correct template:

import transformers
tokenizer = transformers.AutoTokenizer.from_pretrained("openchat/openchat_3.5")

# Single-turn
tokens = tokenizer("GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant:").input_ids
assert tokens == [1, 420, 6316, 28781, 3198, 3123, 1247, 28747, 22557, 32000, 420, 6316, 28781, 3198, 3123, 21631, 28747]

# Multi-turn
tokens = tokenizer("GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant: Hi<|end_of_turn|>GPT4 Correct User: How are you today?<|end_of_turn|>GPT4 Correct Assistant:").input_ids
assert tokens == [1, 420, 6316, 28781, 3198, 3123, 1247, 28747, 22557, 32000, 420, 6316, 28781, 3198, 3123, 21631, 28747, 15359, 32000, 420, 6316, 28781, 3198, 3123, 1247, 28747, 1602, 460, 368, 3154, 28804, 32000, 420, 6316, 28781, 3198, 3123, 21631, 28747]

# Coding Mode
tokens = tokenizer("Code User: Implement quicksort using C++<|end_of_turn|>Code Assistant:").input_ids
assert tokens == [1, 7596, 1247, 28747, 26256, 2936, 7653, 1413, 334, 1680, 32000, 7596, 21631, 28747]

@Saugatkafley Yes, as shown in the example above. BTW, what do you mean Turbo?

Thank you for your work!
But one question does not let go not only me - why formatted prompt has the following structure:
GPT4 User: Question<|end of turn|>GPT4 Assistant:

instead of:
GPT4 User: Question<|end of turn|>
GPT4 Assistant:

Sign up or log in to comment