Update prompt format
Based on discussion #3 and this gist https://gist.github.com/the-crypt-keeper/8d781a12ee515903edc89ef69383570f, update the chat template to match https://github.com/facebookresearch/llama/blob/main/llama/generation.py#L213
Hi @mike-ravkine, did you see this: https://github.com/ggerganov/llama.cpp/issues/2262#issuecomment-1641323686 ?
The difference I'm getting is the <s> and </s> tokens.
Thanks Mike. I've already updated them with the general format, but not with the EOS/BOS tokens.
You PR'd this to a GGML repo specifically - does that mean this is needed for GGML only, not GPTQ? Or all, do you think?
Have you confirmed that makes a practical difference?
This should apply equally to GPTQ. The things that look like special tokens here are not actually special tokens; it's just been finetuned out the wazoo with this very specific template:
System: Respond with the main characters of the given movie. User: Guardians of the Galaxy Assistant: *Star-Lord (Peter Quill) *Gamora *Drax the Destroyer *Rocket Raccoon *Groot *Baby Groot Please let me know if you need any additional information or clarification. [end of text]
vs
<s>[INST]<<SYS>>Respond with the main characters of the given movie<</SYS>> Guardians of the Galaxy[/INST] Sure! Here are the main characters of the movie Guardians of the Galaxy:
1. Peter Quill / Star-Lord (played by Chris Pratt): A human who was abducted by aliens as a child and grew up to be a skilled thief and smuggler. He is the leader of the Guardians.
2. Gamora (played by Zoe Saldana): A green-skinned alien who is a skilled warrior and assassin. She is the adopted daughter of Thanos, the mad Titan who seeks to conquer the universe.
3. Drax the Destroyer (played by Dave Bautista): A powerful warrior who seeks revenge against Ronan the Accuser for killing his family. He is a humanoid with cybernetic implants and enhanced strength.
4. Rocket Raccoon (voiced by Bradley Cooper): A genetically engineered raccoon who is a skilled fighter and weapons expert. He is sarcastic and has a troubled past.
5. Groot (voiced by Vin Diesel): A tree-like humanoid who can control plants and has superhuman strength. He is able to say only three words: "I am Groot."
6. Mantis (played by Pom Klementieff): A sensitive and empathetic alien who can sense the emotions of others. She serves as a companion and confidant to Peter Quill. [end of text]
The leading <s> doesn't seem to make any difference in the few single-turn prompts I've run, but I have no idea if those </s><s> pairs get more relevant once you get deeper into a conversation.
@mike-ravkine When you mention <s> and </s>, do you really mean the literal strings <s> and </s>, or are they just aliases for BOS and EOS?
I'm doing some tests with the original tokenizer and it seems that BOS and EOS are not "printable" characters. Indeed, the tokenizer.decode method outputs an empty string.
@viniciusarruda these are the encoder versions of BOS and EOS, token ids 1 and 2. They decode to nothing, yes, but the model is still trained on them.
+1
@TheBloke I've removed the leading <s> based on https://huggingface.co/spaces/huggingface-projects/llama-2-13b-chat/blob/main/model.py#L24 not having it. Note that it does however have the </s><s> pair in between conversation turns - I expect this is required for the 'chat' to work correctly.
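
For anyone following along, here is a rough sketch of the template as discussed in this thread: the system prompt folded into the first user turn inside <<SYS>> tags, each turn wrapped in [INST] ... [/INST], no leading <s> by default, and a </s><s> pair between turns. The function name `build_prompt`, its parameters, and the exact whitespace and newlines around the tags are my own assumptions for illustration, so check them against the linked generation.py before relying on them. Whether the literal strings "<s>" and "</s>" get mapped to token ids 1 and 2 depends on the tokenizer or frontend in use.

```python
# Hypothetical helper, not taken from any of the linked repos.
# Whitespace around the tags is an assumption; verify against the reference code.
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
BOS, EOS = "<s>", "</s>"  # literal strings standing in for token ids 1 and 2


def build_prompt(system, turns, user_message, leading_bos=False):
    """Build a chat prompt string.

    system:       system prompt, or "" for none
    turns:        list of (user, assistant) pairs for completed exchanges
    user_message: the new user message awaiting a reply
    leading_bos:  prepend "<s>" (off by default, as in this PR)
    """
    messages = [user for user, _ in turns] + [user_message]
    if system:
        # The system prompt is folded into the first user message.
        messages[0] = f"{B_SYS}{system.strip()}{E_SYS}{messages[0]}"

    prompt = BOS if leading_bos else ""
    # Completed exchanges: each one is closed with the </s><s> pair.
    for (_, answer), user in zip(turns, messages):
        prompt += f"{B_INST} {user.strip()} {E_INST} {answer.strip()} {EOS}{BOS}"
    # The final user message is left open for the model to answer.
    prompt += f"{B_INST} {messages[-1].strip()} {E_INST}"
    return prompt


if __name__ == "__main__":
    print(build_prompt(
        "Respond with the main characters of the given movie",
        [],
        "Guardians of the Galaxy",
    ))
```

With an empty `turns` list this produces a single [INST] block; the </s><s> pairs only appear once there is at least one completed exchange, which is the multi-turn case I couldn't judge from single-turn tests.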
Thank you, merged