Mixtral 8x22B mixing up syllables

#35
by Stefanvarunix - opened

Has anyone experienced this:

I get weird typos in German, e.g. vowels are typed twice ("ii" instead "i", or "aa" instead "a"), mixing up syllables up to writing gibberish, mainly mixing syllables and letters. The longer the chat (e.g. the context), the worse it gets. In the beginning (e.g. first chat responses), it seems ok.

I downloaded and tested models from https://huggingface.co/MaziyarPanahi (different quants) and https://huggingface.co/mradermacher/Mixtral-8x22B-Instruct-v0.1-GGUF, did GGUF generation and quantization myself etc.
Nothing helped.

The inferencing itself works and is quite fast (Apple M1 Ultra 128 GB 64C GPU).

I use the latest llama.cpp server and API (./server -m ...), chat via Panel ChatInterface, accessing the llama.cpp HTTP API with OpenAI's python library.

Is this maybe a problem with tokenizers or chat templates?

Yes, apparently the template is very sensitive (as most models coming out of mistralai). Once space here and there, you will get those vowels. (I was told on Twitter, and then they fixed the prompt and said it got much better)

Sign up or log in to comment