It keeps spamming "assistant" at the end of sentences and then makes long rambling replies to itself

#2
by 6346y9uey - opened

Stopping string issue

Owner

I've also seen Llama 3 quantized exl2 models emit "assistant" and then repeat themselves under tabbyAPI.
You'll have to wait for downstream bug fixes in the other UIs.

Results with the latest main:
main: build = 2709 (40f74e4d)

F16

== Running in interactive mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to LLaMa.
 - To return control without starting a new line, end your input with '/'.
 - If you want to submit another line, end your input with '\'.

<|begin_of_text|>
> repeat this: I live. 
I live.<|eot_id|>

> again.
I live.<|eot_id|>

>

IQ4_XS

<|begin_of_text|>
> repeat this once: I like soccer.
I like soccer.<|eot_id|>

> 

From my experience with some 70B Llama 3 GGUF quants, such issues disappear when you use the full prompt format with the user/assistant headers, which Meta states is important for good results:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{{ system_prompt }}<|eot_id|><|start_header_id|>user<|end_header_id|>

{{ user_message }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

The presence or absence of newlines also matters. Double newlines aren't necessary in my experience, but that's how Meta specified it.
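For anyone building the prompt by hand, here is a minimal Python sketch of the template above. The function names `build_llama3_prompt` and `truncate_at_stop` are made up for illustration and are not part of any library. The second helper is just a client-side workaround, assuming your frontend hands back the raw text with `<|eot_id|>` still in it rather than stopping on it.

```python
def build_llama3_prompt(system_prompt: str, user_message: str) -> str:
    """Assemble a single-turn prompt in the format quoted above.

    Hypothetical helper: it keeps the double newline after each
    header, as in the template, and ends on the assistant header
    so the model writes the reply next.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )


def truncate_at_stop(text: str, stop: str = "<|eot_id|>") -> str:
    """Cut generated text at the first stop string, if one is present.

    Hypothetical workaround for UIs that do not yet stop on <|eot_id|>.
    """
    idx = text.find(stop)
    return text if idx == -1 else text[:idx]


if __name__ == "__main__":
    prompt = build_llama3_prompt("You are a helpful assistant.", "repeat this: I live.")
    print(prompt)
    # Simulated runaway output of the kind described in this thread.
    print(truncate_at_stop("I live.<|eot_id|>assistant\n\nI live."))
```

Trimming on the client only hides the symptom. The real fix is for the backend to treat `<|eot_id|>` as an end-of-generation token.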
