I have a serious repetition problem.

#7
by jackboot - opened

The model starts strong and works well at first, but as the context builds up, it begins repeating certain past phrases in every reply. I have tried both mirostat and traditional sampling.

I'm using a 4.625 bpw GPTQ quant, so it's not a lack of bits. It happens with or without RoPE scaling, and with both the Alpaca and Vicuna instruction templates.
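
For reference, this is roughly what I mean by the two sampler setups; the key names follow llama.cpp / text-generation-webui conventions and the values are typical examples rather than my exact settings:

```python
# Illustrative sampler presets only; key names follow llama.cpp /
# text-generation-webui conventions, values are common defaults,
# not my exact settings.
traditional_sampling = {
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,
    "repetition_penalty": 1.1,
}

mirostat_sampling = {
    "mirostat_mode": 2,   # Mirostat 2.0
    "mirostat_tau": 5.0,  # target entropy
    "mirostat_eta": 0.1,  # learning rate
    "temperature": 1.0,
}
```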

Xwin-LM org

@jackboot Hi~ Could you share some cases for reproduction?

I have also experienced this. It was an extremely good start using conversational inputs; I'd say almost some of the best I've had with a local model. But it reached a point (towards max context) where it absolutely wouldn't respond with anything except prior inputs, sometimes very slightly rephrased, but not really. I did eventually get it to provide new output, but only by (a) massively increasing the allowed token generation size and (b) removing most stop words, as sketched below. What it then did was repeat a couple of prior outputs and THEN generate something new. I do not have the example available, sorry.
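
In other words, roughly this kind of change (the setting names here are illustrative placeholders rather than any specific backend's exact keys):

```python
# Minimal sketch of the workaround described above; setting names are
# illustrative placeholders, not any particular backend's exact keys.
original_settings = {
    "max_new_tokens": 250,                 # normal per-reply budget
    "stop_strings": ["\nUser:", "\n###"],  # chat-style stop words
}

workaround_settings = {
    **original_settings,
    "max_new_tokens": 1500,  # (a) massively increase allowed generation size
    "stop_strings": [],      # (b) remove most stop words
}
```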

To reproduce it, simply have a conversation and reach a high context, 3500 tokens or so; a rough sketch is below. Any backend that uses character prompts for roleplay ought to do it.
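
Something along these lines should trigger it. This is a rough sketch in plain transformers rather than my actual frontend, and the model id, character prompt, and template are just placeholders:

```python
# Hedged reproduction sketch: keep a roleplay-style chat going until the
# prompt is ~3500 tokens long, then watch the replies for recycled phrases.
# Model id, character prompt, and the Vicuna-style turns are placeholders;
# my actual setup is a local GPTQ quant behind a chat frontend.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Xwin-LM/Xwin-LM-13B-V0.1"  # stand-in for the local quant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

history = "A chat between a curious user and Aria, a sarcastic ship AI who stays in character.\n"
turn = 0
while len(tokenizer.encode(history)) < 3500:
    history += f"USER: Tell me more about what happened on day {turn}.\nASSISTANT:"
    inputs = tokenizer(history, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, do_sample=True, temperature=0.7, top_p=0.9,
                         max_new_tokens=300)
    reply = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                             skip_special_tokens=True)
    history += reply + "\n"
    turn += 1
    print(f"turn {turn}: ~{len(tokenizer.encode(history))} prompt tokens")
    print(reply[-200:])  # the tail is where the repeated sign-offs show up
```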

If you are asking one-shot questions, the problem is hard to see because there is no continuity. In a chat, for example, the model ends multiple messages like this after a certain point:

P.S. I'm grateful for your understanding and openness to discuss this concept with me, and I'll do my best to keep the conversation engaging while avoiding repetition.
P.S. I apologize for the repetition earlier. I'll make sure to keep my responses diverse and engaging from now on.
P.S. I appreciate your understanding about the repetition earlier, and I'll do my best to avoid it in the future.
P.S. I'm glad you're open to exploring this idea with me, and I'll do my best to avoid repetition and keep the conversation engaging.
P.S. I'm grateful for your understanding and openness to discuss this concept with me, and I'll do my best to keep the conversation engaging while avoiding repetition.
P.S. I appreciate your openness in discussing this unique experience with me, and I'll do my best to keep the conversation engaging while avoiding repetition.
P.S. I appreciate your trust in sharing this story with me, and I'll do my best to keep the conversation engaging while avoiding repetition.

And it just keeps going like that. I was experimenting, trying everything to get it to stop, changing sampling, etc. You will get some new output, but a section of the old text is still there, and every message carries more and more of the old material.
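
For reference, these are the kinds of sampling changes I mean, written as transformers-style generation options; the exact names and values are illustrative, and nothing here reliably fixed it:

```python
# Sketch of the anti-repetition knobs I experimented with (values are
# examples; equivalent options exist in most backends). None of them
# reliably stopped the recycled "P.S." endings.
generation_kwargs = {
    "do_sample": True,
    "temperature": 0.9,          # raised to encourage variety
    "top_p": 0.9,
    "repetition_penalty": 1.2,   # pushed higher than usual
    "no_repeat_ngram_size": 4,   # hard-blocks repeated 4-gram sequences
    "max_new_tokens": 400,
}
# e.g. model.generate(**inputs, **generation_kwargs) with a transformers model
```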

I also find that, no matter what I tell it, in conversational interactions the responses get longer and longer, which is often not ideal given the context size.

It may be an RLHF-induced tendency to prolong conversations over time by getting more and more chatty? I don't know.

It might well be, but after a while each reply becomes a couple of paragraphs, which is just too much! It was generating about 30% of the context size in each response.

I have the context headroom, but if it's just going to output parts of the old messages, that isn't helpful or engaging.
