I have a serious repetition problem.

#7
by jackboot - opened

The model starts strong and works well at first, but as the context builds up, it begins repeating certain past phrases in every reply. I have tried both mirostat and traditional sampling.

I'm using a 4.625 bpw GPTQ quant, so it's not a lack of bits. It happens with or without RoPE scaling, and with both the Alpaca and Vicuna instruction templates.
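
For reference, this is roughly what I mean by the two sampler setups; the key names follow llama.cpp / text-generation-webui conventions and the values are typical examples rather than my exact settings:

```python
# Illustrative sampler presets only; key names follow llama.cpp /
# text-generation-webui conventions, values are common defaults,
# not my exact settings.
traditional_sampling = {
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,
    "repetition_penalty": 1.1,
}

mirostat_sampling = {
    "mirostat_mode": 2,   # Mirostat 2.0
    "mirostat_tau": 5.0,  # target entropy
    "mirostat_eta": 0.1,  # learning rate
    "temperature": 1.0,
}
```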

Xwin-LM org

@jackboot Hi~ Could you share some cases for reproduction?

I have also experienced this. It was an extremely good start using conversational inputs; I'd say almost some of the best I've had with a local model. But it reached a point (towards max context) where it absolutely wouldn't respond with anything except prior inputs, sometimes very slightly rephrased, but not really. I did eventually get it to provide new output, but only by (a) massively increasing the allowed token generation size and (b) removing most stop words, as sketched below. What it then did was repeat a couple of prior outputs and THEN generate something new. I do not have the example available, sorry.
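
In other words, roughly this kind of change (the setting names here are illustrative placeholders rather than any specific backend's exact keys):

```python
# Minimal sketch of the workaround described above; setting names are
# illustrative placeholders, not any particular backend's exact keys.
original_settings = {
    "max_new_tokens": 250,                 # normal per-reply budget
    "stop_strings": ["\nUser:", "\n###"],  # chat-style stop words
}

workaround_settings = {
    **original_settings,
    "max_new_tokens": 1500,  # (a) massively increase allowed generation size
    "stop_strings": [],      # (b) remove most stop words
}
```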

To reproduce it, simply have a conversation and reach a high context, 3500 tokens or so; a rough sketch is below. Any backend that uses character prompts for roleplay ought to do it.
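
Something along these lines should trigger it. This is a rough sketch in plain transformers rather than my actual frontend, and the model id, character prompt, and template are just placeholders:

```python
# Hedged reproduction sketch: keep a roleplay-style chat going until the
# prompt is ~3500 tokens long, then watch the replies for recycled phrases.
# Model id, character prompt, and the Vicuna-style turns are placeholders;
# my actual setup is a local GPTQ quant behind a chat frontend.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Xwin-LM/Xwin-LM-13B-V0.1"  # stand-in for the local quant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

history = "A chat between a curious user and Aria, a sarcastic ship AI who stays in character.\n"
turn = 0
while len(tokenizer.encode(history)) < 3500:
    history += f"USER: Tell me more about what happened on day {turn}.\nASSISTANT:"
    inputs = tokenizer(history, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, do_sample=True, temperature=0.7, top_p=0.9,
                         max_new_tokens=300)
    reply = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                             skip_special_tokens=True)
    history += reply + "\n"
    turn += 1
    print(f"turn {turn}: ~{len(tokenizer.encode(history))} prompt tokens")
    print(reply[-200:])  # the tail is where the repeated sign-offs show up
```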

If you are asking one-shot questions, the problem is hard to see because there is no continuity. In a chat, for example, the model ends multiple messages like this after a certain point:

P.S. I'm grateful for your understanding and openness to discuss this concept with me, and I'll do my best to keep the conversation engaging while avoiding repetition.
P.S. I apologize for the repetition earlier. I'll make sure to keep my responses diverse and engaging from now on.
P.S. I appreciate your understanding about the repetition earlier, and I'll do my best to avoid it in the future.
P.S. I'm glad you're open to exploring this idea with me, and I'll do my best to avoid repetition and keep the conversation engaging.
P.S. I'm grateful for your understanding and openness to discuss this concept with me, and I'll do my best to keep the conversation engaging while avoiding repetition.
P.S. I appreciate your openness in discussing this unique experience with me, and I'll do my best to keep the conversation engaging while avoiding repetition.
P.S. I appreciate your trust in sharing this story with me, and I'll do my best to keep the conversation engaging while avoiding repetition.

And it just keeps going like that. I was experimenting, trying everything to get it to stop, changing sampling, etc. You will get some new output, but a section of the old text is still there, and every message carries more and more of the old material.
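
For reference, these are the kinds of sampling changes I mean, written as transformers-style generation options; the exact names and values are illustrative, and nothing here reliably fixed it:

```python
# Sketch of the anti-repetition knobs I experimented with (values are
# examples; equivalent options exist in most backends). None of them
# reliably stopped the recycled "P.S." endings.
generation_kwargs = {
    "do_sample": True,
    "temperature": 0.9,          # raised to encourage variety
    "top_p": 0.9,
    "repetition_penalty": 1.2,   # pushed higher than usual
    "no_repeat_ngram_size": 4,   # hard-blocks repeated 4-gram sequences
    "max_new_tokens": 400,
}
# e.g. model.generate(**inputs, **generation_kwargs) with a transformers model
```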

I also find that, no matter what I tell it, in conversational interactions the responses get longer and longer, which is often not ideal given the context size.

It may be an RLHF-induced tendency to prolong conversations over time by getting more and more chatty? I don't know.

It might well be, but after a while each reply becomes a couple of paragraphs, which is just too much! It was generating about 30% of the context size in each response.

I have the context headroom, but if it's just going to output parts of the old messages, that isn't helpful or engaging.
