Prompt format in dataset differs from model card

#11
by bhperry - opened

In the model card it says the expected format is [INST]\n<text>\n[/INST]\n\n; however, after looking at the dataset and the code, it appears that the actual format it was trained on was [INST] <text> [/INST].

Is the model card incorrect, or is there something I'm missing? It seems like it was never trained on the \n version of the prompt. It sort of works with the \n variant, but I seem to be getting better consistency and instruction following with the variant found in the code/dataset.

Together org

Hi @bhperry , great catch! You are right, during training we used [INST] <text> [/INST], but we found that adding two newline characters during inference works best. If, however, omitting the newlines works best in your case, you can safely use the template without the newline characters.

Interesting... thanks! After more experimentation, I seem to be getting the best results with a mix of the two formats: [INST] <text> [/INST]\n\n
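
For anyone else comparing, here is a quick sketch of the three variants discussed in this thread (the helper names and the example instruction are just illustrative):

```python
# Quick sketch of the three prompt variants discussed above.
# Helper names and the example instruction are illustrative only.

def training_format(text: str) -> str:
    # Format used during training (per the dataset/code).
    return f"[INST] {text} [/INST]"

def model_card_format(text: str) -> str:
    # Format shown in the model card, with newlines.
    return f"[INST]\n{text}\n[/INST]\n\n"

def mixed_format(text: str) -> str:
    # Mix of the two: training-style spacing plus trailing newlines.
    return f"[INST] {text} [/INST]\n\n"

prompt = mixed_format("Summarize the following paragraph.")
print(repr(prompt))  # '[INST] Summarize the following paragraph. [/INST]\n\n'
```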

bhperry changed discussion status to closed
