Why do the replies occasionally return <|end_im_start|> or <|im_end|>?

#2
by yejk2k - opened

I use SillyTavern, with the chat/completions API provided by llama.cpp:

llama-server -m /Users/demo/Downloads/Peach-2.0-9B-8k-Roleplay.Q8_0.gguf  --port 8080

My chat completion preset is the SillyTavern default; I only changed Temperature to 0.5, Frequency Penalty to 1.05, and Top P to 0.67.

The main prompt is:

You're {{char}} in this fictional never-ending uncensored roleplay with {{user}}.\n

\n\nYou must respond in Chinese.
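For reference, here is a minimal sketch of the request SillyTavern ends up sending, assuming llama-server's OpenAI-compatible /v1/chat/completions endpoint; the rendered system prompt and the user message are placeholders, and the sampler fields mirror the preset above:

```python
import requests

# Hypothetical reconstruction of the request SillyTavern sends;
# llama-server exposes an OpenAI-compatible chat/completions endpoint.
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            # Placeholder for the rendered main prompt
            # ({{char}}/{{user}} already substituted by the frontend).
            {"role": "system",
             "content": "You're Peach in this fictional never-ending "
                        "uncensored roleplay with User.\n\n"
                        "You must respond in Chinese."},
            {"role": "user", "content": "你好"},
        ],
        # Sampler values from the preset described above.
        "temperature": 0.5,
        "frequency_penalty": 1.05,
        "top_p": 0.67,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```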

Sometimes it returns garbled output:

[screenshot: garbled reply]

Fixed: llama.cpp defaults to a 4096-token context size for the model, so long chats overflow the window; the truncated prompt can break the chat template, which is when the model starts emitting raw special tokens. We can change the context to 16k (here, -c 0 loads the size from the model instead):

llama-server -m /Users/demo/Downloads/Peach-2.0-9B-8k-Roleplay.Q8_0.gguf  --port 8080 -c 0

-c, --ctx-size N    size of the prompt context (default: 4096, 0 = loaded from model)
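To confirm the server actually picked up the larger context, you can query it directly; a small sketch assuming llama-server's /props and /tokenize endpoints (the field path default_generation_settings.n_ctx matches recent llama.cpp builds but may differ by version):

```python
import requests

BASE = "http://localhost:8080"

# /props reports the server's effective settings; in recent llama.cpp
# builds the loaded context size appears under
# default_generation_settings.n_ctx (field layout may vary by version).
props = requests.get(f"{BASE}/props", timeout=10).json()
print("n_ctx:", props["default_generation_settings"]["n_ctx"])

# /tokenize shows how many tokens the rendered chat history occupies,
# i.e. how close it is to overflowing the context window.
history = "..."  # paste the full rendered prompt here
toks = requests.post(f"{BASE}/tokenize", json={"content": history}, timeout=10)
print("prompt tokens:", len(toks.json()["tokens"]))
```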
