The model is great but then suddenly writes weirdly

#1
by Flickaboo - opened

Hello,
I had some honest fun with this model, it got my characters quite right at the start for about the first 30 messages. It write coherent and continues the story while describing it's thoughts, actions etc. But then after the first 30 messages something happens and it chooses random words and becomes suddenly all flowery with its descriptions and some sentences are continuously repeated in each message. From beautiful coherent writing to suddenly writing weird, as if the entire style changed mid chat, even though the previous chat history did not show any such writing. It messed up my story and I am not sure what to do... I tried multiple changes to temperature, rep penalty, min_p etc. But it never recovers after that and even the dialogue becomes less because of the flowery description talk. (Edit: it's like it suddenly changes style and writes all poetic and lyrical, without any reason...)
Any idea, how to fix this issue? Now, I do use ooba with 12.288 context with 4 bit cache.

This is probably best addressed by the original model author, as I only generate quants for these models. At a guess, the model is being pushed beyond its trained context size, and output quality suffers as a result. I've seen catastrophic failures like this when models go beyond their trained limits. I'm not sure what fixes this model supports; try playing with the RoPE or alpha settings normally used for extending context. The 10.7B models are Solar-based, so search for fixes for that class of models.
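
For context on what that alpha knob does: ExLlama-style loaders use it for NTK-aware RoPE scaling, stretching the rotary base so positions beyond the trained window still land on familiar rotation frequencies. Here is a minimal sketch of the math, assuming the usual base of 10000 and a head dimension of 128 (both are assumptions, not confirmed for this particular model):

```python
# NTK-aware "alpha" scaling as applied by ExLlama-style loaders:
# the RoPE base is stretched so positions past the trained context
# still map to in-distribution rotation frequencies.
def scaled_rope_base(alpha: float, base: float = 10000.0, head_dim: int = 128) -> float:
    """Adjusted RoPE base for a given alpha value."""
    return base * alpha ** (head_dim / (head_dim - 2))

# Example: alpha 5.5 stretches the base from 10000 to roughly 56,500.
print(scaled_rope_base(5.5))
```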

Hi LoneStriker! Thank you for your answer. Do you by any chance know the trained context size for this model? You might be correct that it happens when the total context is used up: Silly Tavern's prompt shows about 11k tokens in total context at that point, and before that the model seems to behave normally. I set alpha to 5.5 in ooba, though I'm not sure that is the correct number. If the actual trained context is lower, I might need to lower my context setting as well, even though I was hoping the EXL2 quant would finally let me use more context, since this model is fast enough on my 12 GB GPU.
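
For my own reference, here is a rough check of my numbers, using the same sketch as above (assuming SOLAR-10.7B's trained context is 4096 tokens, per the base model's config, and the same assumed head dimension of 128):

```python
# Rough sanity check of the settings in this thread. Assumptions:
# Solar's trained context is 4096 tokens and head_dim is 128.
trained_ctx = 4096
target_ctx = 12288
alpha = 5.5

scale = target_ctx / trained_ctx            # 3.0x extension
rope_base = 10000.0 * alpha ** (128 / 126)  # ~56,500

print(f"extension factor: {scale:.1f}x")
print(f"effective RoPE base: {rope_base:,.0f}")
# Community rule of thumb: set alpha a bit above the raw scale factor,
# so 5.5 for a 3x extension is in a plausible range; if quality still
# collapses near 11k tokens, the model may just not extend that far.
```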
