Prompt Format
Does this use the same prompt format as the original Llama 3 Instruct model?
Hello, bumping this question because I am also interested in the answer.
Same here. I'm seeing periodic garbage coming from the model and wondering if it's the prompt format.
I've had this question so many times with various models. I finally realized that when llama.cpp loads a GGUF model, it prints the prompt format / chat template in the logs. Digging around, I also found that many (maybe most) models ship it in the tokenizer_config.json of the base model repo, e.g.:
"chat_template": "{% set loop_messages = messages %}{% for message in loop_messages %}{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %}{% if loop.index0 == 0 %}{% set content = bos_token + content %}{% endif %}{{ content }}{% endfor %}{% if add_generation_prompt %}{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}{% endif %}",
This appears to match the "normal Llama-3-70B" template that I've seen and used successfully on other models that use this one as a base:
```
<|start_header_id|>system<|end_header_id|>\n\nYou are a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHello<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nHi there<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHow are you?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n
```
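Rather than hand-assembling that string, you can render the conversation through the template itself with transformers' apply_chat_template and compare. A minimal sketch (the repo id below is just an example; substitute whichever base model you're actually using). Note the rendered output will additionally start with the bos_token, <|begin_of_text|>, which the template prepends and which most loaders add for you anyway:

```python
from transformers import AutoTokenizer

# Example repo id -- substitute the base model you're actually using.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-70B-Instruct")

messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there"},
    {"role": "user", "content": "How are you?"},
]

# Render via the repo's own chat_template and append the assistant
# header so the model knows it's its turn to speak.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```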
I've tested this on mradermacher/Llama-3-70b-Arimas-story-RP-V1.5-i1-GGUF/Llama-3-70b-Arimas-story-RP-V1.5.i1-IQ3_XXS.gguf, which uses this model as a base. Without this prompt format it generates a lot of continuations, pretends to be the user role, and inserts a lot of --- breaks. With the correct prompt format shown above, it stops rambling and no longer acts out of turn or generates garbage.
Hopefully this helps someone. Good luck!
Yes, we use the same prompt format as the base Llama 3 Instruct models; see https://github.com/meta-llama/llama-recipes?tab=readme-ov-file#llama-recipes-examples-to-get-started-using-the-llama-models-from-meta