Prompt template with llama.cpp in interactive mode

#1
by rayyd - opened

Can you explain how to use the given prompt template with llama.cpp in interactive mode? I cannot find information about this anywhere. Usually I just run a model with -i -ins, but many models, including this one, are not very coherent, and I'm guessing it's the prompt format. Any idea how to make llama.cpp follow this in interactive mode?:

A chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user's input.
USER: prompt
ASSISTANT: 

There's --in-prefix and --in-suffix that might be what you want? https://github.com/ggerganov/llama.cpp/discussions/1980#discussioncomment-6265342

I've not tried it myself but it looks like that's what it's for
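Based on that discussion, something like the following might work. This is an untested sketch: the model path is a placeholder, and the exact flag behavior may differ across llama.cpp versions (note that --in-prefix/--in-suffix are meant to replace -ins, not be combined with it):

```shell
# Hypothetical invocation; adjust the model path to your setup.
# -p sets the system prompt, --in-prefix/--in-suffix wrap each user turn,
# and -r stops generation when the model starts a new "USER:" turn.
./main -m ./models/model.gguf --color -i \
  -p "A chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user's input." \
  --in-prefix "USER: " \
  --in-suffix "ASSISTANT: " \
  -r "USER:"
```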

I started using it, and it definitely gives better results with models like guanaco and airoboros, and more coherent chat.
I'm not sure why there isn't more information about it, though. Also, some templates have three prompt turns, like user/input/output; I'm not sure how that works with llama.cpp.
