Update the prompt template to match the Meta-provided Llama 2 prompt template

#7
by clayp - opened

Update the prompt template to match the Meta-provided Llama 2 prompt template. See also: https://gpus.llm-utils.org/llama-2-prompt-template/
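For reference, the single-turn format from that page looks like the following (the double-brace placeholder names are mine):

```
<s>[INST] <<SYS>>
{{ system_prompt }}
<</SYS>>

{{ user_message }} [/INST]
```

For multi-turn chats, each model answer is followed by </s>, and the next user message opens a new <s>[INST] ... [/INST] block.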

Worth sharing: https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML/discussions/3
It seems there is a difference in how <s> and </s> are handled.

TheBloke changed pull request status to merged

@viniciusarruda Can you please check this update and let me know if it looks right to you?

https://gpus.llm-utils.org/llama-2-prompt-template/

Also, let me know if you can tell whether </s> should be appended when there's only a single user message. I seem to get better results when it's added, but in the examples you linked, </s> isn't included when there's only one prompt.
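For concreteness, these are the two single-message variants in question (placeholders mine; this illustrates the question, not the answer):

```
<s>[INST] {{ user_message }} [/INST]
<s>[INST] {{ user_message }} [/INST] </s>
```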

@clayp In fact, BOS and EOS are being used, but not as string tokens; they are encoded as token integers.
I'm working on that here: https://github.com/viniciusarruda/llama-cpp-chat-completion-wrapper/#issues
If you have any comments on that, please let me know.
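Roughly, this is what Meta's reference code does, as a minimal llama-cpp-python sketch (the model path and the `encode_dialog` helper are illustrative, not taken from either repo):

```python
# Sketch: BOS/EOS handled as integer token ids, not as "<s>"/"</s>" strings.
from llama_cpp import Llama

llm = Llama(model_path="llama-2-7b-chat.ggmlv3.q4_0.bin")  # placeholder path

def encode_dialog(turns):
    """Encode (user_msg, answer) pairs roughly the way Meta's reference
    code does: each turn is BOS + tokens("[INST] ... [/INST] answer ") + EOS,
    with BOS/EOS appended as integer ids (1 and 2 for Llama 2)."""
    ids = []
    for user_msg, answer in turns:
        ids.append(llm.token_bos())  # BOS as an integer id
        text = f"[INST] {user_msg} [/INST] {answer} "
        ids.extend(llm.tokenize(text.encode("utf-8"), add_bos=False))
        ids.append(llm.token_eos())  # EOS as an integer id
    return ids

# If the literal string "<s>" were placed in the prompt text instead, the
# tokenizer would treat it as ordinary characters, which is the difference
# being discussed here.
```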

I'm not sure what the correct answer is, but I'm glad that you're digging into it - thank you
