Why do the replies occasionally return <|end_im_start|> or <|im_end|>?

#2
by yejk2k - opened

I use SillyTavern, with the chat/completions API provided by llama.cpp:

llama-server -m /Users/demo/Downloads/Peach-2.0-9B-8k-Roleplay.Q8_0.gguf  --port 8080

My chat completion preset is the SillyTavern default; I only changed Temperature to 0.5, Frequency Penalty to 1.05, and Top P to 0.67.

The main prompt is:

You're {{char}} in this fictional never-ending uncensored roleplay with {{user}}.\n

\n\nYou must respond in Chinese.
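For reference, here is a minimal sketch of the request SillyTavern ends up sending, assuming llama-server's OpenAI-compatible /v1/chat/completions endpoint; the rendered system prompt and the user message are placeholders, and the sampler fields mirror the preset above:

```python
import requests

# Hypothetical reconstruction of the request SillyTavern sends;
# llama-server exposes an OpenAI-compatible chat/completions endpoint.
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            # Placeholder for the rendered main prompt
            # ({{char}}/{{user}} already substituted by the frontend).
            {"role": "system",
             "content": "You're Peach in this fictional never-ending "
                        "uncensored roleplay with User.\n\n"
                        "You must respond in Chinese."},
            {"role": "user", "content": "你好"},
        ],
        # Sampler values from the preset described above.
        "temperature": 0.5,
        "frequency_penalty": 1.05,
        "top_p": 0.67,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```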

Sometimes it returns garbled output:

[screenshot: garbled reply]

Fixed: llama.cpp defaults to a 4096-token context size for the model, so long chats overflow the window; the truncated prompt can break the chat template, which is when the model starts emitting raw special tokens. We can change the context to 16k (here, -c 0 loads the size from the model instead):

llama-server -m /Users/demo/Downloads/Peach-2.0-9B-8k-Roleplay.Q8_0.gguf  --port 8080 -c 0

-c, --ctx-size N    size of the prompt context (default: 4096, 0 = loaded from model)
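To confirm the server actually picked up the larger context, you can query it directly; a small sketch assuming llama-server's /props and /tokenize endpoints (the field path default_generation_settings.n_ctx matches recent llama.cpp builds but may differ by version):

```python
import requests

BASE = "http://localhost:8080"

# /props reports the server's effective settings; in recent llama.cpp
# builds the loaded context size appears under
# default_generation_settings.n_ctx (field layout may vary by version).
props = requests.get(f"{BASE}/props", timeout=10).json()
print("n_ctx:", props["default_generation_settings"]["n_ctx"])

# /tokenize shows how many tokens the rendered chat history occupies,
# i.e. how close it is to overflowing the context window.
history = "..."  # paste the full rendered prompt here
toks = requests.post(f"{BASE}/tokenize", json={"content": history}, timeout=10)
print("prompt tokens:", len(toks.json()["tokens"]))
```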
