Gibberish / Sliding Window

#1
by zappa2005 - opened

@senseable
I really liked seeing an RP-focused, smaller model, so I tried the 8bpw quant on a 4080. I noticed that after 8k context, it goes off the rails and only produces gibberish, although the config.json shows 32k position embeddings.

I remember reading somewhere that the initial 7B Mistral models based on v0.1 had some sliding-window attention issues, and you could spot them by checking whether their config.json has a sliding window defined (also the case here, with 4096).
Mistral-7B-v0.2 changed something in that regard, and you can see that its config.json sets `sliding_window` to null instead.
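A quick way to check a downloaded model for this is to read its config.json directly. This is a minimal sketch (the path is a placeholder for wherever your model lives); it just reports the two fields discussed above:

```python
import json

def sliding_window_info(config_path):
    """Report the sliding-window and context-length settings from a
    HuggingFace-style config.json (sliding_window of 4096 suggests the
    v0.1-style config; null/None matches the v0.2-style config)."""
    with open(config_path) as f:
        cfg = json.load(f)
    return {
        "sliding_window": cfg.get("sliding_window"),
        "max_position_embeddings": cfg.get("max_position_embeddings"),
    }

# Example (placeholder path):
# print(sliding_window_info("models/my-model/config.json"))
```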

Can you reproduce, or is this something on my end? I'm using oobabooga with the ExLlamav2_HF loader.

Yup, after 8k-10k context the model goes off the rails. I'm using exui with raw exllamav2 and am seeing the same thing. I see the same behavior in the base, unquantized fp16 model as well, so the issue is inherent to the model itself.
