Proper prompt?

#3
by damarges - opened

What is the proper prompt to use this model?

Looking at the tokenizer used for the qlora:

https://huggingface.co/ShinojiResearch/Senku-70B/blob/main/tokenizer_config.json

It looks to be the same as the original Mistral/Mixtral/Miqu prompt template:

https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2

e.g.:

Instruction format
In order to leverage instruction fine-tuning, your prompt should be surrounded by [INST] and [/INST] tokens. The very first instruction should begin with a begin of sentence id. The next instructions should not. The assistant generation will be ended by the end-of-sentence token id.

text = "<s>[INST] What is your favourite condiment? [/INST]"
"Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!</s> "
"[INST] Do you have mayonnaise recipes? [/INST]"
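For anyone assembling this by hand, here is a minimal sketch of a helper that reproduces the layout quoted above. The function name `build_mistral_prompt` and the `add_bos` flag are hypothetical, not part of any library; set `add_bos=False` if your backend already prepends `<s>` itself.

```python
def build_mistral_prompt(turns, add_bos=True):
    """Build a Mistral-style instruct prompt.

    turns: list of (user_message, assistant_reply) pairs; use None as the
    reply for the final turn that the model should answer.
    """
    # Only the very first instruction is preceded by the BOS token.
    prompt = "<s>" if add_bos else ""
    for user, assistant in turns:
        prompt += f"[INST] {user} [/INST]"
        if assistant is not None:
            # A completed reply is closed with </s>, followed by one space,
            # matching the example quoted from the model card.
            prompt += f"{assistant}</s> "
    return prompt


if __name__ == "__main__":
    print(build_mistral_prompt([
        ("What is your favourite condiment?", "Fresh lemon juice."),
        ("Do you have mayonnaise recipes?", None),
    ]))
```

Note that the prompt ends directly on `[/INST]` with no trailing space.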

Or in Ollama template format:

TEMPLATE """[INST] {{ if and .First .System }}{{ .System }} {{ end }}{{ .Prompt }} [/INST]{{ .Response }}"""

AFAIK the wrapped llama.cpp server adds the leading <s> token itself, and from experiments with Mixtral it's important not to add a space after the closing [/INST] tag.

Actually, it might not be this template, judging by this discussion on the fp16 dequant that this model was fine-tuned from:

https://huggingface.co/152334H/miqu-1-70b-sf/discussions/11

Although @jackboot does mention not adding a space after the closing [/INST] tag, which makes me think it is the same as the original Mistral/Mixtral/Miqu prompt template:

I have been fighting with it being " [/INST] \n" or just " [/INST]\n" and now "[/INST]REPLY</.s>" The model will reply about the same to all of them, but using the more "correct" one leads to less commentary or NOTES:

Shinoji Research org
• edited Feb 9, 2024

No intentional change was made. I have found that ChatML also works reasonably well and, interestingly, when asked the "who are you" questions, the model forgets it is a Mistral model.

Most of the testing I did was via the default template recognized by Oobabooga (which should be Mistral).

Possibly a bit more finetuning could fix any weirdness with the prompting?

Edit: Mixed results, but I am recommending ChatML from this point on.

ChatML avoids the bug that Miqu SF has and scores slightly higher on EQ-Bench, though lower on GSM8K. I will consider a v2 where this issue is corrected.

Shinoji Research org

If anyone has any suggestions on further finetuning that could improve or correct this, I would be happy to attempt it.

I used chatml for the original mixtral-instruct.

Shinoji Research org

ChatML seems to perform better on EQ-Bench; I confirmed this in the reproduced experiments as well.
You can reproduce the training by using chatml as the template and adding the start and end tokens:

tokens: # these are delimiters
- "<|im_start|>"
- "<|im_end|>"
chat_template: chatml
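For reference, a rough sketch of how a ChatML prompt is usually laid out with those two delimiter tokens at inference time. The helper name `build_chatml_prompt` is hypothetical, and it assumes the fine-tune follows the common ChatML convention (role on the first line, content after a newline, each message closed by `<|im_end|>`).

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Format a list of {"role": ..., "content": ...} dicts as ChatML."""
    parts = []
    for message in messages:
        # Each message: <|im_start|>role, newline, content, <|im_end|>, newline.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    print(build_chatml_prompt([
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who are you?"},
    ]))
```

Generation should then stop on `<|im_end|>`.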
