Tags: Transformers · GGUF · English · yi · sft · Yi-34B-200K

Q4_0 works great in Koboldcpp, Q4_K_M gives absolute gibberish.

#3 opened by YearZero

Not sure if others have tried the different quants. As the title says, I can't seem to get Q4_K_M to give me anything besides symbols, but Q4_0 works perfectly.

Edit: Never mind, Q4_0 also only occasionally gives an answer, and mostly just outputs <s>. May just have to wait for a llama.cpp fix and see how it works!

I'm facing the same issue with the Q4_K_S quant.

[screenshot attached]

I have just now updated the prompt template in the README. It turns out this model is very sensitive to the exact prompt template, and even something as simple as adding one space after ASSISTANT: can break its output. Newlines in the prompt template also seem to increase the chance of bad output.

Please change your prompt template to USER: {prompt} ASSISTANT: (no newlines, and no space after the final colon) and try again.

See here for more discussion of this, and my testing in llama.cpp: https://huggingface.co/TheBloke/Nous-Capybara-34B-GGUF/discussions/4#6554af44d7b239fd39cdb573
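For anyone loading the GGUF from a script rather than Koboldcpp, here is a minimal sketch of that template (assuming llama-cpp-python; the model path, context size, and example question are just placeholders), built on a single line with no space after the final colon:

```python
from llama_cpp import Llama

# Placeholder path/settings - point this at whichever quant you downloaded.
llm = Llama(model_path="./yi-34b-200k.Q4_K_M.gguf", n_ctx=4096)

def build_prompt(user_message: str) -> str:
    # Single line, no newlines, and no trailing space after the final colon.
    return f"USER: {user_message} ASSISTANT:"

out = llm(build_prompt("Write one sentence about llamas."),
          max_tokens=64, stop=["USER:"])
print(out["choices"][0]["text"])
```

Keeping the prompt construction in one helper like this makes it harder to accidentally reintroduce a stray space or newline later.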

Thanks for the tip. I didn't realize how sensitive it would be, since bigger models tend to be less sensitive to that stuff. That definitely fixed the issue!

YearZero changed discussion status to closed
