[Bug Report] <0x0A> is output instead of a newline

#1 by Maxxim69

The character sequence <0x0A> is output in place of every newline.
A redditor suggests there may be a problem with the tokenizer in the original non-GGUF model that carried over.
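
For context, <0x0A> is how SentencePiece-style tokenizers label their byte-fallback token for the ASCII newline byte, which fits the tokenizer theory above. A quick illustration in Python:

```python
# 0x0A is the byte value of the ASCII newline character, which is why
# a byte-fallback token printed as <0x0A> stands in for "\n".
newline_byte = "\n".encode("utf-8")[0]   # -> 10 (0x0A)
print(f"<0x{newline_byte:02X}>")         # prints: <0x0A>
```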

Are you using LM Studio? I saw something similar, but I bet it has to do with the Preset you've chosen.

It happens with Ollama as well -- the model is outputting the token for a newline, but it's not being interpreted as such.
It could be worked around programmatically if you're using this model in a server environment with its outputs passed to a program; otherwise, I'm not sure of another solution.
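
For anyone who needs that programmatic workaround, here is a minimal Python sketch; the function name and the exact <0xNN> pattern are my assumptions, so adjust to whatever your server actually emits:

```python
import re

def decode_byte_tokens(text: str) -> str:
    """Replace literal byte-token strings such as <0x0A> with the bytes
    they represent (sufficient for ASCII bytes like the newline; a full
    fix would collect consecutive bytes and decode them as UTF-8)."""
    return re.sub(
        r"<0x([0-9A-Fa-f]{2})>",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )

# Example using the broken output described in this thread:
raw = "Hello!<0x0A><0x0A>How can I help you today?"
print(decode_byte_tokens(raw))  # prints "Hello!", a blank line, then the rest
```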

I have the same bug. I tried a few versions of KoboldCpp, their KoboldLite front-end, and SillyTavern with different chat templates, but the bug persists. For me this is the best Mixtral model I've tried, even better than the 8x MoE models. It's really good at staying in character, speech style, etc.

Same as above.

I have the same issue. I think this problem is caused by the original model (Mixtral 7Bx2 MoE) missing its tokenizer.model file.
Here is how I fixed it:

  1. git clone https://huggingface.co/cloudyu/Mixtral_7Bx2_MoE
  2. cd Mixtral_7Bx2_MoE && curl -L -O https://huggingface.co/mistralai/Mixtral-8x7B-v0.1/resolve/main/tokenizer.model
  3. re-convert the model with llama.cpp: python convert.py ../Mixtral_7Bx2_MoE
  4. ./quantize ../Mixtral_7Bx2_MoE/ggml-model-f16.gguf ../Mixtral_7Bx2_MoE/ggml-model-q4_K_M.gguf q4_K_M
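
To check that the re-quantized file actually fixes the bug, here is a minimal sketch using llama-cpp-python (an assumption on my part -- install with pip install llama-cpp-python; the prompt is arbitrary). If the tokenizer fix worked, the output should contain real line breaks instead of the literal <0x0A> text:

```python
from llama_cpp import Llama

# Load the re-quantized GGUF produced in step 4 above.
llm = Llama(model_path="../Mixtral_7Bx2_MoE/ggml-model-q4_K_M.gguf")

out = llm("List three fruits, one per line:", max_tokens=64)
text = out["choices"][0]["text"]

print(text)
# With a working tokenizer there should be no literal byte tokens left:
assert "<0x0A>" not in text, "newline tokens are still emitted verbatim"
```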

I can't load this model with ctransformers.
