Sometimes getting "<dummy00019>" in the output

#7
by MoonRide - opened

Seems to be a pretty flexible model, but sometimes I get "<dummy00019>" in the generated text, out of the blue. It happened to me once when using this model via Ollama, and another time via koboldcpp. Has anyone else experienced that? I am using the Q6_K variant.

I've occasionally seen that too, using llama.cpp.

Same here, "<dummy00022>" shows up randomly with the 7B Q8_0 from NousResearch.

That's usually a sign of the vocab getting padded incorrectly during quanting. Can you try mine and see if you get the same behaviour?

https://huggingface.co/bartowski/Hermes-2-Pro-Mistral-7B-GGUF

That said, based on a PR being opened to pad the vocab, it seems more likely that we're missing added tokens on the main model.
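For anyone curious where the `<dummyNNNNN>` strings come from: a minimal sketch (the function name and the example vocab are assumptions, not llama.cpp's actual code) of how a conversion script pads a tokenizer vocab up to the model's embedding-table size. If the real added tokens are missing from the tokenizer files, those vocab slots get placeholder names instead, and the model can then sample them at inference time.

```python
# Sketch (assumed, illustrative): pad a vocab with placeholder tokens so
# its length matches the model's embedding table. When real added tokens
# (e.g. chat-template specials) are missing, their ids end up as
# "<dummyNNNNN>" entries, which is exactly what leaks into generated text.

def pad_vocab(vocab: list[str], target_size: int) -> list[str]:
    """Return vocab extended with <dummyNNNNN> placeholders to target_size."""
    padded = list(vocab)
    for i in range(len(vocab), target_size):
        padded.append(f"<dummy{i:05d}>")
    return padded

# Hypothetical example: a vocab missing its added tokens, while the
# embedding table expects 6 rows.
base = ["<s>", "</s>", "hello"]
print(pad_vocab(base, 6)[3:])  # ['<dummy00003>', '<dummy00004>', '<dummy00005>']
```

The fix in the linked PRs is effectively to supply the real added tokens so these slots are never filled with placeholders.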

@bartowski I tried converting the version from the PR myself, and I didn't observe this problem then. If you were doing GGUFs for the SOLAR-based variant, there is a similar fix for it here: https://huggingface.co/NousResearch/Nous-Hermes-2-SOLAR-10.7B/discussions/7.

That's usually a sign of the vocab getting padded incorrectly during quanting. Can you try mine and see if you get the same behaviour?

@bartowski Yes, I tried Q4_K_S and had the same issue. Actually, I think I've tried every Hermes 2 Pro 7B GGUF on Hugging Face over the past couple of weeks. I like this model, but I dislike dummy tokens.
