Prompt template for question answering

#49
by gxxxz - opened

How can I use this model for question answering? I want to pass some context along with a question, and the model should pull the answer from the context. Is there a prompt format for this?

Pass this to the LLM: <s>[INST] Using this information : {context} answer the Question : {query} [/INST]. You can also look into prompt templating, e.g. through LangChain, if you haven't already.
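For instance, a minimal sketch with the transformers text-generation pipeline (the context, question, and generation settings below are just placeholders):

from transformers import pipeline

# Illustrative context/question pair
context = "The Eiffel Tower was completed in 1889 and is about 330 metres tall."
query = "When was the Eiffel Tower completed?"

# The tokenizer adds the <s> BOS token itself, so only the [INST] wrapper is written here
prompt = f"[INST] Using this information : {context} answer the Question : {query} [/INST]"

generator = pipeline("text-generation", model="mistralai/Mistral-7B-Instruct-v0.1")
output = generator(prompt, max_new_tokens=128, return_full_text=False)
print(output[0]["generated_text"])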

Btw for future peeps, the format below also sets the context inside the tags:

{"role": "system", "content": "You are 8 years old"},
{"role": "user", "content": "How old are you?"},

You can try something like this:

messages = [
    {"role": "system", "content": "You are a helpful bot who reads texts and answers questions about them."},
    {"role": "user", "content": "[text] QUESTION: [question]"},
]
# return_tensors="pt" so the output can be passed straight to generate()
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt")
answer = model.generate(input_ids.to(model.device))

In general, there are lots of ways to do this and no single right answer - try using some of the tips from OpenAI's prompt engineering handbook, which also apply to other instruction-following models like Mistral-Instruct. Finally, you may have better luck with Zephyr, which is also based on Mistral-7B but was trained to follow instructions with more advanced methods.
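If you do go the Zephyr route, its chat template does accept a "system" role, so a sketch like this should work (assuming HuggingFaceH4/zephyr-7b-beta; the messages and generation settings are illustrative):

from transformers import AutoModelForCausalLM, AutoTokenizer

# Zephyr's chat template accepts a "system" role, unlike Mistral-Instruct v0.1's
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
model = AutoModelForCausalLM.from_pretrained("HuggingFaceH4/zephyr-7b-beta")

messages = [
    {"role": "system", "content": "You are a helpful bot who reads texts and answers questions about them."},
    {"role": "user", "content": "[text] QUESTION: [question]"},
]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
output_ids = model.generate(input_ids.to(model.device), max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))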

@Rocketknight1 I tried your suggestion and got the following error: jinja2.exceptions.TemplateError: Conversation roles must alternate user/assistant/user/assistant/...

I've added my Python code below in case you can spot anything out of the ordinary with it.

from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cpu" # Use GPU if available, otherwise use CPU

model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

messages = [
{"role": "system", "content": "You are a hunan who loves to dance."},
{"role": "user", "content": "What do you like to do in your spare time?"}
]

encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

model_inputs = encodeds.to(device)
model.to(device)

generated_ids = model.generate(model_inputs)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])

Hi @adam12104 this is a known issue caused by your version of jinja being out of date. Try pip install --upgrade jinja2. We'll be adding version checks in the next update to transformers so that this stops happening!
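If you're not sure which jinja2 version your environment is actually using, a quick check:

import jinja2
# Chat templates need a fairly recent jinja2; if this prints an old 2.x version, upgrade as above
print(jinja2.__version__)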

Hmm, still getting that error about how conversation roles must alternate, and it seems my jinja2 is up-to-date

You can follow this format in your chat template (user first, followed by assistant):

messages = [
{"role": "user", "content": "Hey there!"},
{"role": "assistant", "content": "Nice to meet you!"}
]
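For example, rendering the template without tokenizing is a quick way to check that the alternation is accepted and to see the [INST] structure it produces:

messages = [
    {"role": "user", "content": "Hey there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
    {"role": "user", "content": "What do you like to do in your spare time?"},
]
# tokenize=False returns the rendered prompt string instead of token ids
print(tokenizer.apply_chat_template(messages, tokenize=False))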

My prompt is:
human_input = B_INST + f"""
Reminder: {reminder}
Process steps: {process_steps}
Query: {query}
Answer:
""" + E_INST
And my ChatPromptTemplate in LangChain is:
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.schema import SystemMessage

prompt = ChatPromptTemplate.from_messages([
    SystemMessage(content=system_template),              # The persistent system prompt
    MessagesPlaceholder(variable_name="chat_history"),    # Where the memory will be stored
    ChatPromptTemplate.from_template("{human_input}"),    # Where the human input will be injected
])

The response it generates has a prefix of "AI: AI:" when in a conversation session. Say I have passed in 5 queries: the answer to the 5th query will have a prefix of "AI: AI: AI: AI: AI:".

Is this a prompt problem?

I ran into the same problem as well. The issue is the system message. When I remove the system message, I no longer get the jinja2.exceptions.TemplateError: Conversation roles must alternate user/assistant/user/assistant/... error.

[UPDATE]
Okay, according to the tokenizer config here, the chat template does not support system messages. So I'm wondering how the model was trained to support system messages, and how we should use them?
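You can inspect the template straight from the tokenizer to confirm this:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
# The Jinja template only handles alternating "user"/"assistant" roles and raises on anything else
print(tokenizer.chat_template)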

@navidmadani Did you find an answer to your question?

this worked for me:

messages = [
    {
        "role": "user",
        "content": "You are a chatbot who always responds in Portuguese",
    },
    {
        "role": "assistant",
        "content": "Entendido! Responderei sempre na lΓ­ngua portuguesa!",
    },
    {
        "role": "user",
        "content": "Qual Γ© maior: sol ou a terra?",
    }
]

print(tokenizer.apply_chat_template(messages, tokenize=False))

This is what HuggingChat is using:

{
    "name" : "mistralai/Mixtral-8x7B-Instruct-v0.1",
    "description" : "The latest MoE model from Mistral AI! 8x7B and outperforms Llama 2 70B in most benchmarks.",
    "websiteUrl" : "https://mistral.ai/news/mixtral-of-experts/",
    "preprompt" : "",
    "chatPromptTemplate": "<s> {{#each messages}}{{#ifUser}}[INST]{{#if 

@first

	}}{{#if 

@root

	.preprompt}}{{@root.preprompt}}\n{{/if}}{{/if}} {{content}} [/INST]{{/ifUser}}{{#ifAssistant}} {{content}}</s> {{/ifAssistant}}{{/each}}",
    "parameters" : {
      "temperature" : 0.6,
      "top_p" : 0.95,
      "repetition_penalty" : 1.2,
      "top_k" : 50,
      "truncate" : 24576,
      "max_new_tokens" : 8192,
      "stop" : ["</s>"]
    }

So they treat the system prompt as part of the first user message, separated by \n:

<s> [INST]Previous conversation context or system prompt
 How's the weather today? [/INST] It's sunny and warm.</s> [INST] What about tomorrow? [/INST] Tomorrow is expected to be cloudy.</s>
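With the stock tokenizer you can mimic this by folding the system prompt into the first user message before calling apply_chat_template. A small sketch (the system text is just the example above):

system_prompt = "Previous conversation context or system prompt"
messages = [
    {"role": "user", "content": "How's the weather today?"},
    {"role": "assistant", "content": "It's sunny and warm."},
    {"role": "user", "content": "What about tomorrow?"},
]

# Prepend the system prompt to the first user turn, separated by a newline,
# since Mistral-Instruct v0.1's template rejects a "system" role
messages[0]["content"] = system_prompt + "\n" + messages[0]["content"]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt")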

This is how it's done in text-generation-webui, which reuses the same format for Mistral and Mixtral:

.*(mistral|mixtral).*instruct:
  instruction_template: 'Mistral'
instruction_template: |-
  {%- for message in messages %}
      {%- if message['role'] == 'system' -%}
          {{- message['content'] -}}
      {%- else -%}
          {%- if message['role'] == 'user' -%}
              {{-' [INST] ' + message['content'].rstrip() + ' [/INST] '-}}
          {%- else -%}
              {{-'' + message['content'] + '</s>' -}}
          {%- endif -%}
      {%- endif -%}
  {%- endfor -%}
  {%- if add_generation_prompt -%}
      {{-''-}}
  {%- endif -%}

and some users in SillyTavern are using the same Llama-2 format:

[INST] <<SYS>>
Write character's next reply.
<</SYS>>

Character card
</s><s>[INST] {prompt} [/INST] {response} </s><s>[INST] {prompt} [/INST] etc.

I personally tried all 3; in some cases I got better results with the Llama-2 format, for some reason! I wish we had a good evaluation just for the system prompt to see which format does a better job.
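If you want to try a system-aware format through the HF tokenizer itself, you can also override its template. An unofficial sketch, adapted from the text-generation-webui template above:

# Unofficial: replace the built-in template with one that passes "system" content through verbatim
tokenizer.chat_template = (
    "{{ bos_token }}"
    "{% for message in messages %}"
    "{% if message['role'] == 'system' %}{{ message['content'] }}"
    "{% elif message['role'] == 'user' %}{{ ' [INST] ' + message['content'] + ' [/INST] ' }}"
    "{% else %}{{ message['content'] + eos_token }}"
    "{% endif %}"
    "{% endfor %}"
)

messages = [
    {"role": "system", "content": "Write character's next reply."},
    {"role": "user", "content": "How's the weather today?"},
]
print(tokenizer.apply_chat_template(messages, tokenize=False))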

Has anyone gotten any clarity on this, how to properly include a system prompt with tokenizer / apply_chat_template?

You can use something like this

messages = [
{"role": "system", "content": "You are 8 years old"},
]
input = tokenizer.apply_chat_template(messages)

apply_chat_template() does not work with the "system" role for Mistral's tokenizer, as pointed out above.

The way we are getting around this is to put two messages at the start of the conversation that mimic a system prompt: a user message introducing the roles, and an assistant message acknowledging them. We keep these two messages at the start of the message history even after the context length is exceeded and we start omitting earlier user/assistant messages, just as a system prompt is supposed to persist.

Example:

[
{"role": "user", "content": "Hi, ChatbotName! I'm ExampleUser."},
{"role": "assistant", "content": "Hi, ExampleUser! I'm ChatbotName, here to <introduce role and purpose of the chatbot!>."},
{"role": "user", "content": "<first prompt here!>"}
]
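A sketch of how we wire that up (the helper and the max_turns cutoff are just placeholders):

pseudo_system = [
    {"role": "user", "content": "Hi, ChatbotName! I'm ExampleUser."},
    {"role": "assistant", "content": "Hi, ExampleUser! I'm ChatbotName, here to <introduce role and purpose of the chatbot!>."},
]
history = []  # real user/assistant turns, appended as the conversation goes on

def build_messages(history, max_turns=20):
    # Always keep the two pseudo-system messages; drop the oldest real turns once the history gets long,
    # making sure the remaining turns still start with a user message so the roles alternate
    trimmed = history[-max_turns:]
    if trimmed and trimmed[0]["role"] == "assistant":
        trimmed = trimmed[1:]
    return pseudo_system + trimmed

history.append({"role": "user", "content": "<first prompt here!>"})
input_ids = tokenizer.apply_chat_template(build_messages(history), return_tensors="pt")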

Same for me with the repeated "AI:" prefix from the LangChain setup above - did you find a way to prevent it?
