Feedback

#1
by MarinaraSpaghetti - opened

Feedback appreciated, thank you!

Added to the UGI-Leaderboard. It's definitely one of the best Nemo fine-tunes, though almost, but not quite, number 1. Currently, all the Nemo fine-tunes I've tested are actually more censored than the original instruct (more likely to give ethical disclaimers, even when told not to). Nemomix-v4.0 was second to the original instruct on UGI, and second to mini-magnum on writing.

Hey, thank you for adding it to the leaderboard and for the feedback! Just one more clarification — this is a merge of existing models, not a fine-tune.
I tried to combine „the best of both worlds”, so I assume it won’t excel at either, but I’ll try to bump up the weights of the Instruct to make it even smarter. I’ll possibly add different models to the merge, too. Once again, thank you!
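Since the distinction between a fine-tune and a merge came up: a merge combines the weights of existing models directly, with no training involved. Below is a toy sketch of the simplest variant, a linear (weighted-average) merge; the parameter names, values, and weights are purely illustrative, and real tools like mergekit implement this per-tensor with many more methods (SLERP, DARE, TIES, etc.):

```python
def linear_merge(state_dicts, weights):
    """Weighted average of matching parameters from several models.

    state_dicts: list of {param_name: value} mappings with identical keys
                 (real models hold tensors here; floats keep the sketch tiny).
    weights: one weight per model, normalized so they sum to 1.
    """
    total = sum(weights)
    norm = [w / total for w in weights]
    merged = {}
    for name in state_dicts[0]:
        merged[name] = sum(w * sd[name] for w, sd in zip(norm, state_dicts))
    return merged

# Two toy "models" with a single scalar parameter each.
base = {"layer.weight": 1.0}   # e.g. the instruct model
donor = {"layer.weight": 0.0}  # e.g. a creative fine-tune
merged = linear_merge([base, donor], weights=[0.6, 0.4])
print(merged["layer.weight"])  # 0.6 -- leans toward the higher-weighted model
```

Bumping up the Instruct's weight, as mentioned above, just means giving it a larger share in this average.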


Here's my personal review of the model using the standard Mistral instruct template, no system message and such:

Good

  • very good spatial awareness, characters move around the scene in a natural manner
  • consistent character states, such as clothes being on/off, things being visible or not
  • good character card following, almost never straying off the original personality

Bad

  • not very wordy, almost always responding in very short sentences, not 'talkative' enough
  • doesn't like to change the 'status quo', making longer roleplays somewhat repetitive story-wise

ps. I grew used to midnight miqu's romantic drama, so I might be biased on this one.

Thank you for the review, @Olafangensan !

Mistral is very instruct-sensitive, so the style of the messages you receive will heavily depend on the example and first message of your character, plus the prompt itself. I have no issue with receiving longer replies (sometimes they're even cut off) on my current settings, which can be shamelessly stolen from here (custom): https://huggingface.co/MarinaraSpaghetti/SillyTavern-Settings/tree/main. Example below.

Screenshot 2024-07-30 at 19.53.55.png
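For anyone unfamiliar with the template being discussed: the standard Mistral instruct format wraps each user turn in [INST] tags. A simplified formatter is sketched below (ignoring tokenizer-level BOS and whitespace details, which vary between front-ends, so treat the exact spacing as an approximation):

```python
def mistral_prompt(turns):
    """Build a Mistral-style instruct prompt.

    turns: list of (user, assistant) pairs; use None for the assistant
    slot of the final turn the model is about to answer.
    """
    out = "<s>"
    for user, assistant in turns:
        out += f"[INST] {user} [/INST]"
        if assistant is not None:
            out += f" {assistant}</s>"
    return out

prompt = mistral_prompt([
    ("Describe the scene.", "The tavern is dim and crowded."),
    ("What happens next?", None),
])
print(prompt)
```

Because everything (character card, examples, chat history) ends up inside these plain-text turns, the model mirrors whatever style those turns establish, which is why the example and first message matter so much.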

No issues with receiving long replies of 1000 tokens either in my test chats. You can also ban the EOS token to ensure the character ALWAYS writes long responses, though that might influence the quality and might result in the model talking for your character.

Screenshot 2024-07-31 at 11.35.10.png
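As an aside on what banning the end-of-sequence (EOS) token actually does: the front-end masks that token's logit so it can never be sampled, which is why the model cannot end its turn and keeps writing (and may eventually start speaking for your character). A minimal sketch, with a made-up four-token vocabulary and token id:

```python
import math

def ban_token(logits, token_id):
    """Return a copy of the logits with one token made unsampleable."""
    out = list(logits)
    out[token_id] = -math.inf  # probability becomes exactly 0 after softmax
    return out

def softmax(xs):
    m = max(x for x in xs if x != -math.inf)
    exps = [math.exp(x - m) if x != -math.inf else 0.0 for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

logits = [1.0, 2.0, 5.0, 0.5]  # pretend vocab of 4; id 2 stands in for EOS
probs = softmax(ban_token(logits, token_id=2))
print(probs[2])  # 0.0 -- generation can no longer stop on this token
```

The trade-off mentioned above follows directly: once the natural stopping point is forbidden, the model has to fill the remaining token budget with *something*, quality permitting or not.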

However, due to how good the Nemo Instruct is at following instructions, it will struggle to change the character unless you specify that you allow for dynamic character growth. Here's an example that should help.

Screenshot 2024-07-31 at 11.37.53.png

Hope this helps with bringing the best out of the merge!

I was about to edit my message to ask for tips and tricks, but I see it was unnecessary.

Bless!

MarinaraSpaghetti pinned discussion

Honestly, love it - this feels like a slightly toned-down, slightly more intelligent version of mini-magnum (which was possibly getting a bit too unhinged for me). This will probably be my go-to for the time being!

Edit: on further testing, mini-magnum maybe has a slightly more "fun" way of writing... perhaps it's the repetitiveness-of-content issue someone else mentioned. Guess I'll play around with the rep/presence penalty/DRY or switch between the two depending on what I'm using the LLM for! I do still like its general consistency!

Thank you for the feedback, and super happy you like it!

@DontPlanToEnd Can I get your discord handle

It's da same as my username, dontplantoend. God I wish I didn't choose this username lol. It's so much of a statement. It's just a random song lyric.

I did an EQ-Bench test run of this model and got the following results:

Tasks     Version  Filter  n-shot  Metric             Value     Stderr
eq_bench  2.1      none    0       eqbench             78.9709  ± 1.5866
                   none    0       percent_parseable  100.0000  ± 0.0000

That's better than instruct! :D Pretty cool.

Been running it locally and it's an interesting critter: it can follow instructions well enough, but it's also quite happy to introduce itself as a software developer looking for a job, like a base model would. You can feel the base/instruct hybrid. I like it.

Oh wow, I honestly did not expect it to be so smart, haha. Thank you, super glad you like it!

I added 0.05 to rep penalty and turned DRY to 0.8/2 with 0 penalty range, works like a charm now! For me, it shows a nice mix of intelligence and creativity. Noticeably, this model seems to follow the system prompt quite well and I was able to be pretty particular about that, yielding some good results.
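For anyone unfamiliar with DRY: it penalizes tokens that would extend a sequence already repeated in the context, with the penalty scaling as multiplier * base^(repeat_length - allowed_length). A toy sketch below, reading the "0.8/2" above as multiplier 0.8 and base 2 (that split is my assumption; allowed_length defaults to 2 in common implementations):

```python
def dry_penalty(repeat_length, multiplier=0.8, base=2.0, allowed_length=2):
    """Penalty subtracted from a token's logit when sampling it would
    extend a run of `repeat_length` tokens already seen in the context."""
    if repeat_length < allowed_length:
        return 0.0  # short repeats are allowed for free
    return multiplier * base ** (repeat_length - allowed_length)

for n in range(1, 6):
    print(n, dry_penalty(n))
# The penalty doubles with every extra repeated token, so verbatim loops
# get expensive fast while ordinary short phrases stay untouched.
```

That exponential growth is why DRY tends to curb the story-level repetitiveness mentioned earlier without flattening normal prose the way a blunt repetition penalty can.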

https://pastebin.com/SMyJ46Wt I use the instruct settings here - I know people say that telling an LLM "don't do these things" usually doesn't work, but it seems to work decently with this one for me.

Notably I also had a line in there for myself in the Don'ts section that I didn't include in the pastebin, because I'm not sure if this one works that well - "Answering or performing actions as {{user}} - only represent the other character(s) instead." It seems to work for me but I'm not sure if it's just my luck, so that can be tested too.

Edit: Honestly, yeah, the model does have its problems with the repetition issue and some recurrent GPT-isms, but its ability to stay on track and pull back little relevant details from a decently lengthy context is really impressive for a model of this size. Feels really smart for a 12B model.

Feels really nice for a 12B model, but I've got two questions.
I was testing your custom settings, and the Example Message appears twice in context: once after the Example Response section of the Story String, and a second time immediately after the chat starts. Is this a command-line visual bug, is it supposed to be like that, or is it a mistake on my side?
Could you please take a screenshot of your GGUF model loading settings in ooba and share it with me?
Thanks for the fun merge.

@KerDisren you need to set Example Messages Behavior to „Never include examples” on the User Settings tab so SillyTavern doesn't send the example messages twice.
As for the Ooba settings, I run the GGUF version, so it's just all default settings plus 64000 context and the flash attention flag marked, and that's it.
Also, thank you for the kind words!
