Incomplete Answers

#59
by samparksoftwares - opened

The model is giving incomplete answers, or rather stops generating at an arbitrary point, despite increasing max_new_tokens and decreasing the prompt's context length.

I've had this behaviour as well, until I realised I wasn't following the prompt format properly. Make sure you use the whitespace and the BOS/EOS tokens appropriately; both the EOS tokens and the whitespace are very important.

Prompt:

<s> [INST] What is 1+1? [/INST] The sum of 1 + 1 is 2. Is there anything else you would like to know about mathematics or another topic? I'm here to help with any questions you have to the best of my ability.</s> [INST] What is 1+2? [/INST]

Prompt answered:

<s> [INST] What is 1+1? [/INST] The sum of 1 + 1 is 2. Is there anything else you would like to know about mathematics or another topic? I'm here to help with any questions you have to the best of my ability.</s> [INST] What is 1+2? [/INST] The sum of 1 + 2 is 3. In arithmetic, addition is the operation of combining two numbers to produce a third number. For example, in the expression "2 + 3 = 5," 2 and 3 are the addends, and 5 is the sum. Addition can be represented using symbols, such as the plus sign "+", or by the word "and." It is one of the basic operations in mathematics, along with subtraction, multiplication, and division. Is there anything else you would like to know about mathematics or another topic? I'm here to help with any questions you have.

Very important:

  • After the BOS (<s>), there is a whitespace. There must be only one BOS in your prompt.
  • Before any EOS (</s>), there is no whitespace.
  • All instructions must go between [INST] and [/INST]; note the whitespace after [INST] and before [/INST].
  • At the end of your prompt, there must be no whitespace after the last [/INST]. The completion will add it for you.
  • Check whether the software you use already adds the BOS/EOS tokens to the prompt. If it does, decide who adds them (either you or the software), and make sure they are added exactly once, in the right places, with the whitespace present.

https://github.com/huggingface/blog/blob/main/mixtral.md
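
To make these rules concrete, here is a minimal sketch (my own helper, not an official API) that assembles a multi-turn Mixtral prompt following the whitespace and BOS/EOS rules above:

# build_mixtral_prompt is a hypothetical helper illustrating the format rules above.
# turns is a list of (instruction, answer) pairs; the last turn may have answer=None.
def build_mixtral_prompt(turns):
    prompt = "<s>"  # exactly one BOS at the very start
    for instruction, answer in turns:
        # whitespace after [INST] and before [/INST]
        prompt += f" [INST] {instruction} [/INST]"
        if answer is not None:
            # no whitespace before the EOS that closes a completed answer
            prompt += f" {answer}</s>"
    return prompt  # no trailing whitespace after the final [/INST]

print(build_mixtral_prompt([
    ("What is 1+1?", "The sum of 1 + 1 is 2."),
    ("What is 1+2?", None),
]))
# <s> [INST] What is 1+1? [/INST] The sum of 1 + 1 is 2.</s> [INST] What is 1+2? [/INST]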

I have attributes like system prompt, chat history, context, and question. Can you help me customize the prompt so I get complete answers?

I don't believe this model supports a system prompt, but you can certainly include it inside the [INST] [/INST] block. Chat history is just a succession of [INST] instructions and answers, as I've demonstrated in my previous post.
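
For example, a rough sketch (my own naming, building on the format rules above) that folds the system prompt into the first instruction and appends the chat history:

# Hypothetical helper: system prompt is prepended to the first user turn,
# history is rendered as alternating [INST] blocks and answers.
def build_prompt_with_system(system_prompt, history, question):
    prompt = "<s>"
    first = True
    for user_msg, answer in history:
        content = f"{system_prompt}\n\n{user_msg}" if first else user_msg
        prompt += f" [INST] {content} [/INST] {answer}</s>"
        first = False
    content = f"{system_prompt}\n\n{question}" if first else question
    prompt += f" [INST] {content} [/INST]"
    return prompt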

I have been using this format:
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"  # Llama-2-style system markers
SYSTEM_PROMPT = B_SYS + prompt + E_SYS  # prompt holds the system prompt text
instruction = """
Context:{chat_history} \n {context}
User:{question}"""

prompt_template = B_INST + SYSTEM_PROMPT + instruction + E_INST

This format is not giving complete answers; if you can help me correct it, that would be very helpful.

Same here; even with the correct prompt format, it stops generating in the middle of some code or a command line.

Hi everyone,
After some trial and error I was able to get the whole answer from the Mixtral model using the same prompt I mentioned above, just by calling the LLM through HuggingFaceInferenceAPI from LlamaIndex or LangChain.
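
For reference, a minimal sketch of that call via LlamaIndex (the model name and token are placeholders, and the import path may differ across llama-index versions):

from llama_index.llms.huggingface import HuggingFaceInferenceAPI

# Point the inference API at the hosted Mixtral instruct model
llm = HuggingFaceInferenceAPI(
    model_name="mistralai/Mixtral-8x7B-Instruct-v0.1",
    token="hf_...",  # your Hugging Face API token
)

# Check whether the client adds <s> itself (see the format notes above)
response = llm.complete("[INST] What is 1+2? [/INST]")
print(response.text)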

Proof of concept to construct valid Mixtral prompts using Python:

Set up a Python environment:

virtualenv-3.10 --python=python3.10 ~/test
cd ~/test
source bin/activate
pip install mistral-common transformers jinja2

Create a file named modified_script.py (modified Python script from https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1):

from mistral_common.protocol.instruct.messages import (
    AssistantMessage,
    UserMessage,
)
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.request import ChatCompletionRequest

from transformers import AutoTokenizer

# Official Mistral tokenizer (v3 is the one used by Mixtral-8x22B-Instruct)
tokenizer_v3 = MistralTokenizer.v3()

# A small multi-turn conversation to render as a prompt
mistral_query = ChatCompletionRequest(
    messages=[
        UserMessage(content="How many experts ?"),
        AssistantMessage(content="8"),
        UserMessage(content="How big ?"),
        AssistantMessage(content="22B"),
        UserMessage(content="Noice 🎉 !"),
    ],
    model="test",
)
# Convert the request into the plain list-of-dicts format expected by transformers
hf_messages = mistral_query.model_dump()['messages']

tokenizer_hf = AutoTokenizer.from_pretrained('mistralai/Mixtral-8x22B-Instruct-v0.1')

# Render the conversation with the Hugging Face chat template, without tokenizing
print(tokenizer_hf.apply_chat_template(hf_messages, tokenize=False))

Execute the script:

python modified_script.py

Output of the above command:

<s> [INST] How many experts ? [/INST] 8 </s> [INST] How big ? [/INST] 22B </s> [INST] Noice πŸŽ‰ ! [/INST]
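
As an optional sanity check (my own addition, assuming the mistral-common API), you can also encode the same request with the official tokenizer and compare the token IDs against the Hugging Face chat template:

# Append this to modified_script.py to compare both tokenizations
tokenized = tokenizer_v3.encode_chat_completion(mistral_query)
hf_ids = tokenizer_hf.apply_chat_template(hf_messages)  # tokenize=True by default
print(tokenized.text)              # the prompt as mistral-common renders it
print(hf_ids == tokenized.tokens)  # True if both tokenizations match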

Hope this helps.
