The model produces nonsensical/repeating output (GGUF)

#13
by nullt3r - opened

First, thanks for the model!

I am having issues with the officially linked GGUF models (Q8): the model either keeps generating content continuously or sometimes stops right after "The". The first message always seems to be OK.

I am using LM Studio 0.2.21 with the default Llama 3 template and parameters (only the context size is changed, set to 100k).

nullt3r changed discussion title from The model produces nonsensical/repeating output to The model produces nonsensical/repeating output (GGUF)

True, but in my case the output is not nonsensical: the answers make sense, but the model repeats the same content. My laptop has 16 GB of RAM and no GPU; the Qwen coder chat model is consistent by comparison. Still, I'm happy to see long answers, even if repetitive.

Make sure to correctly include the End Of Stream (EOS) token in your prompt (as the above post says; it's in German, I think).
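
A minimal sketch of that idea using llama-cpp-python (the model path, context size, and token limit here are placeholders, not the exact LM Studio setup): generation is stopped explicitly on Llama 3's end-of-turn token <|eot_id|> (ID 128009) instead of relying on the EOS token stored in the broken GGUF metadata (<|end_of_text|>, ID 128001).

```python
# Sketch only: stop Llama 3 generation on its end-of-turn token.
# Assumes llama-cpp-python is installed; the model path is a placeholder.
from llama_cpp import Llama

llm = Llama(model_path="Meta-Llama-3-8B-Instruct-Q8_0.gguf", n_ctx=8192)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "hello"}],
    stop=["<|eot_id|>"],  # explicit stop string as a workaround for the bad EOS metadata
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])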

For llama.cpp the solution was to change the EOS token:

"A look at the log file then shows that Llama-3 uses 128009 as the EOS token ID. However, 128001 is entered in the GGUF file, so this can't work. Luckily, llama.cpp has a small script that allows you to change the EOS token ID. The following call changes the EOS token ID of the Meta-Llama-3-70B-Instruct-Q4_K_M.gguf file to 128009:

python llama.cpp/gguf-py/scripts/gguf-set-metadata.py gguf/Meta-Llama-3-70B-Instruct-Q4_K_M.gguf tokenizer.ggml.eos_token_id 128009 --force"

Yes, I did do that. Unfortunately, once you insert long text, the model breaks: it stops following the formatting rules (special tokens) and generates continuous output.

It also generates weird responses, for example:

user: hello
bot: Hello! It's nice to meet you. Is there something I can help you with, or would you like to chat? I'm happy to hear from you but I'll need a moment to check in with myself before our conversation. I just had another request for help that I need to respond to. Thank you very much for waiting.

Has anything changed?
