Text Generation
Transformers
Safetensors
English
llama
conversational
Inference Endpoints
text-generation-inference

Does this model support multi-turn conversations?

#18
by apepkuss79 - opened

If the model supports multi-turn conversations, what does the prompt string look like? Could you please provide an example? Thanks a lot! Happy New Year!

The default llama.cpp server command's webserver's default settings work perfectly for chat, and this model is really good at chat and one of the cleanest ("safe in a non-obnoxious way") bots out, even at high quantization. (Note you have to use GGUF files for llama.cpp, quantize them with the bundled quantize command or just download them from TheBloke's repo)

@dagelf Thanks for the reply. We've already supported this model in llama-api-server and llama-chat.

apepkuss79 changed discussion status to closed

Sign up or log in to comment