Please add the prompt template to the README for the GGUF.

Thank you for this model!

I was wondering what the prompt template is?

Unfortunately, running with -p "Hello" only puts it into completion mode.

01-ai org

Check the tokenizer config. It uses the standard ChatML format.
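For readers who haven't used ChatML before, here is a minimal sketch of how a generic ChatML prompt is usually assembled. The helper name `build_chatml_prompt` is made up for illustration, and this shows the common ChatML layout only. The chat template in this model's tokenizer config is the authority and may treat the system prompt differently.

```python
def build_chatml_prompt(messages):
    """Render a list of {"role", "content"} dicts as a generic ChatML prompt.

    Hypothetical helper: it follows the common ChatML layout, not necessarily
    the exact template shipped in this model's tokenizer config.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>role\ncontent<|im_end|>\n
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    # Leave the assistant turn open so the model writes the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful, polite AI assistant."},
    {"role": "user", "content": "What is the meaning of life?"},
])
print(prompt)
```

The resulting string could then be passed as the prompt instead of a bare "Hello", assuming the front end does not escape the special tokens.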

It should be ChatML, but judging from the config, the handling of the system prompt looks awkward. Maybe they meant something like this:

<|startoftext|>You are a helpful, polite AI assistant.<|im_end|>
What is the meaning of life?<|im_end|>

Something might be wrong with either the tokenizer or llama.cpp: "<|im_end|> " is displayed as literal text during the chat. Steps to reproduce:


  1. --outtype f16 ..\Yi-1.5-9B-Chat\ --outfile Yi-1.5-9B-Chat-F16.gguf
  2. quantize Yi-1.5-9B-Chat-F16.gguf Yi-1.5-9B-Chat-Q6_K.gguf Q6_K
  3. server -v -ngl 99 -m Yi-1.5-9B-Chat-Q6_K.gguf -c 4096
  4. Opened http://localhost:8080/ and changed the user name to "user", the bot name to "assistant", and the prompt to "You're a helpful assistant.".


GGUF and test made using current-ish llama.cpp (b2859).
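A possible workaround while this is unresolved is to have the server cut generation at the marker instead of printing it. The sketch below only builds the JSON body for such a request. It assumes the server from step 3 is listening on localhost:8080 and that its `/completion` endpoint accepts the `prompt`, `n_predict` and `stop` fields. The helper name `build_completion_request` is made up, and none of this was verified against b2859.

```python
import json


def build_completion_request(prompt, n_predict=256):
    """Build a JSON body for the llama.cpp server /completion endpoint.

    Hypothetical helper. The "stop" entry asks the server to end generation
    when it produces the text "<|im_end|>", so the marker is not shown.
    """
    payload = {
        "prompt": prompt,
        "n_predict": n_predict,      # assumed upper bound on generated tokens
        "stop": ["<|im_end|>"],      # stop string matched against the output text
    }
    return json.dumps(payload).encode("utf-8")


body = build_completion_request(
    "<|im_start|>user\nHello<|im_end|>\n<|im_start|>assistant\n"
)
print(body.decode("utf-8"))
```

The body would be sent as a POST to http://localhost:8080/completion with a `Content-Type: application/json` header. This hides the symptom only and does not explain why the token is rendered as text in the first place.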

UPDATE: using a bot name other than "assistant" doesn't cause this problem.


