Repetition Issues

#1
by Luxxanna - opened

First off, I'm just getting started with messing around with the model, but the prose and reasoning so far have been fantastic! However, I'm almost certain there's something I'm doing wrong on my side. For whatever reason (and I've never had these issues with previous models outside of Gemma2 ones), the model will not regenerate properly. It cycles through the same 2-3 responses. No matter what settings I change regarding samplers OR templates, it always spits back the exact same sentence. It's acting like my temperature is set to 0 when it isn't.

It's currently late for me, so I haven't explored every avenue yet. I guess I'm curious whether you guys have stumbled upon anything like this with any other models? I've tried asking for help with this issue on Gemma2 models before but never got any adequate responses.

Thanks for the feedback! I'm glad you like the prose, that's one of the things we really focused on this time around. Reasoning actually wasn't too spectacular when I tried it, but I'm glad it's working for you.

Strangely enough, I haven't experienced the kind of repetitiveness you're describing here, and from the feedback we got while testing this model, regens were all over the place in terms of quality and variety. What you're saying does sound like a temp 0 thing, but since you said you've played around with samplers, I'm not sure what's going on. I'm currently using the static q8 GGUF through KoboldCPP with the following samplers:

[screenshot: sampler settings]

Everything else is disabled/neutralized.

Out of curiosity, do you experience a similar problem with Mullein v0? The datasets we used across v0 and v1 were pretty similar, but their training hyperparameters were a bit different.

Hey! Apologies for the radio silence. It's been an unexpectedly busy (and exhausting) week and a half or so. I haven't gotten around to re-testing any of the models due to lack of time. I'm hoping that this weekend I'll have time to mess around with the models again, but no promises!

To provide a little extra information that I didn't in my original comment: I used the static Q6_K quant with KoboldCPP as the backend and SillyTavern as the frontend. I'll see if I can run Q8 (I most likely can with nothing else open on my system). I did give Mullein v0 Instruct a try a little while back. The main issues I ran into revolved around how quickly it pushed into NSFW at the slightest "hint" of it (even something as simple as a kiss), and I think it had a tendency to just... keep going (take the criticism with a grain of salt considering the situation I'm dealing with).

For the time being, I'll send some screenshots of my SillyTavern settings, just to see if there's anything that's obviously out of place that I've somehow missed (it happens an embarrassingly high amount):

[screenshots: SillyTavern settings]

If I do any testing on the weekend, I might also give Ooba a try, just to see if it's a KoboldCPP issue related to my backend settings (I've admittedly been incredibly lazy and haven't touched them since installing the backend).

trashpanda org

Thanks for the reply, and no worries. I think I see the problem--you have Top_P set to 0.2, whereas I think you meant to set Top_A to 0.2 instead. Top_P at 0.2 cuts off a lot of tokens, way more than Top_A at 0.2 does--afaik it'll have almost the same effect as setting Temp to 0. So try Top_P at 1 and Top_A at 0.2--if responses get a bit too crazy or unhinged, you can throw a little Top_P back in, but definitely not 0.2 lol.
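To make the difference concrete, here's a toy sketch (not the actual KoboldCPP code) comparing how many candidate tokens survive Top_P 0.2 versus Top_A 0.2 on an example next-token distribution. The Top-A rule used here (discard tokens below `a * p_max^2`) is the commonly cited definition and is an assumption on my part:

```python
# Example sorted next-token probabilities (hypothetical values, sum to 1.0).
probs = [0.30, 0.20, 0.15, 0.10, 0.08, 0.07, 0.05, 0.05]

def top_p_keep(probs, p):
    """Nucleus (Top-P) sampling: keep the smallest prefix of the sorted
    distribution whose cumulative probability reaches p."""
    kept, total = [], 0.0
    for q in probs:
        kept.append(q)
        total += q
        if total >= p:
            break
    return kept

def top_a_keep(probs, a):
    """Top-A sampling (assumed definition): keep tokens whose probability
    is at least a * (max probability)^2."""
    threshold = a * max(probs) ** 2
    return [q for q in probs if q >= threshold]

print(len(top_p_keep(probs, 0.2)))  # 1 -- only the top token survives, so regens repeat
print(len(top_a_keep(probs, 0.2)))  # 8 -- threshold is 0.018, so every token survives
```

With Top_P at 0.2 the very first token already covers the 0.2 mass, so sampling is effectively greedy (hence the temp-0-like regens), while Top_A at 0.2 leaves the whole candidate pool intact here.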

The thing about Mullein-v0 being a little too NSFW-inclined also definitely isn't a you thing--a lot of testers noticed that.
