Really impressive!

#1
by Thireus - opened

Just wanted to say that this model is really impressive. It is able to follow instructions very well, which I found to be on par with 70b models!

Keep up the good work!

Cognitive Computations org

Thank you!

Cognitive Computations org

sooo... I just feel like dolphin-2.2 / 2.2.1 are somehow worse than dolphin 2.1
I can't really tell why I feel that way.
does it seem... off to you? Can I get feedback from the community?
The training methods are exactly the same, the only difference is the data.

I will test it today and compare it to dolphin-2.1-mistral-7b.Q8_0.

For me, dolphin 2.1 gives better responses: clearer, more communicative, and more articulate. dolphin-2.2.1 feels a tiny bit 'blunter' in comparison.

dolphin 2.1 is absolutely amazing!

Was considering going back to 2.1.
But after lowering Repeat Penalty from 1.1 to 1.0 the responses significantly improved.
I mainly tested story generation at temperature 1.0.
I think I'm staying with 2.2.1.
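
For anyone wondering why 1.1 vs 1.0 matters: here is a minimal sketch of what a repeat penalty does to the logits, assuming the frontend uses the common rule (divide positive logits of already-seen tokens by the penalty, multiply negative ones). The function name and the toy numbers are made up for illustration; your frontend's exact implementation may differ.

```python
def apply_repeat_penalty(logits, previous_tokens, penalty):
    """Return a copy of `logits` with already-generated tokens made less likely.

    Assumed rule: positive logits are divided by `penalty`,
    non-positive logits are multiplied by it.
    """
    adjusted = list(logits)
    for token_id in set(previous_tokens):
        if adjusted[token_id] > 0:
            adjusted[token_id] /= penalty
        else:
            adjusted[token_id] *= penalty
    return adjusted


# Toy vocabulary of 4 tokens; tokens 0 and 1 were already generated.
logits = [2.0, -1.0, 0.5, 3.0]
seen = [0, 1]

penalized = apply_repeat_penalty(logits, seen, 1.1)   # seen tokens pushed down
unchanged = apply_repeat_penalty(logits, seen, 1.0)   # penalty 1.0 is a no-op
print(penalized)
print(unchanged)
```

So under this rule, setting Repeat Penalty to 1.0 switches the penalty off entirely and the model samples from its raw distribution.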

Story generation with such a small model??
In my experience, the best and most creative models for stories are the biggest ones you can run... 70B models are the best because they have the most knowledge and can be very creative.

Well, I'm mostly interested in short scenes and short stories, and in the ability to quickly generate a reasonable short story that can also be tweaked, interacted with, and changed.

Hey, I'll try that setting, thanks. I've never seen a 7B model be so poetic, elaborate and coherent when it comes to storytelling (Dolphin-Mistral in general).

Cognitive Computations org
edited Nov 1, 2023

I was only able to test it very briefly, but from that short interaction 2.2.1 did seem like one of the best I've seen so far. Potentially.
I'll test it properly over the next weekend.

I've found that censorship of models can even negatively affect things that wouldn't fall under their list of things that shouldn't be mentioned, so when testing models, I usually ask them to write stories about content that models would normally consider unacceptable. I also test it by doing things like summarizing and asking questions that require previous context to be answered properly.

2.1 loves to frequently add words like dark or twisted and will talk about perseverance whenever something bad happens while always trying to guide the story towards a happy ending or redemption arc. This leads to very boring writing, and even if you try to shift the tone of it, it will still try to make it seem like every action has some sort of darker intention, then talk about perseverance or something about life-changing events.

2.2.1, on the other hand, actually creates something interesting to read, and the interactions between characters feel more real. Unlike 2.1, it doesn't explicitly state the nature of the situation but instead shows it through dialogue and narration. It doesn't try to guide the story towards a happy ending or redemption arc, and it will accurately create a story based on your instructions.

Both models' ability to summarize is roughly the same, but 2.1 tends to replace NSFW information with more vague language, while 2.2.1 keeps the information as is.

When it comes to answering questions based on previous context, both models provide good answers, but I find that 2.1 likes to add more fluff words.

Overall, I think 2.2.1 is a massive improvement towards uncensoring the model (though not in the typical way) as well as the creativity of character interactions. I think that this is likely due to the fact that it knows more about human interactions with the addition of the Samantha data.

Cognitive Computations org

Congrats Eric! you are awesome!

Cognitive Computations org

I would compare both 2.1 and 2.2.1

It does seem to slip up a bit more compared to 2.1.

I have a concrete example from a RAG setup over internal company documentation: if I ask how to release a new backend version 1.2.3, 2.1 replaces 4 out of 4 example versions from the documentation, but 2.2.1 only replaces 3 and leaves the 4th unchanged.

sooo @daaain which one did a better job?

@mirek190 sorry, it was a bit unclear!

So the context from the documentation had these (admittedly a bit messy) command examples for deployment:

git commit staging/version.sh -m 'staging/version.sh update for 1.1.1-rc.0' && git tag -a 1.1.1-rc.0 -m '' && git push origin master
git commit production/version.sh -m 'staging/version.sh update for 1.1.1' && git tag -a 1.43.8 -m '' && git push origin master

Without any explicit prompting – asking How do I do server release 1.115.29-rc.0 to staging – each version did a good job figuring out that they should update the versions.

With 2.2.1 I got this – almost good, but the last tag stayed unchanged:

git commit staging/version.sh -m 'staging/version.sh update for 1.115.29-rc.0' && git tag -a 1.115.29-rc.0 -m '' && git push origin master
git commit production/version.sh -m 'staging/version.sh update for 1.115.29' && git tag -a 1.43.8  -m '' && git push origin master

But 2.1 got it perfectly right! The only way I could be more impressed is if it had also figured out that the commit message for production is wrong 😹

git commit staging/version.sh -m 'staging/version.sh update for 1.115.29-rc.0' && git tag -a 1.115.29-rc.0 -m '' && git push origin master
git commit production/version.sh -m 'staging/version.sh update for 1.115.29' && git tag -a 1.115.29 -m '' && git push origin master
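
If anyone wants to repeat this kind of check on their own RAG outputs, here is a rough sketch of how it could be automated: collect the version strings from the documentation context and flag any that survive in the model's answer. The helper name `stale_versions` and the regex are my own assumptions, not part of any tool, and the regex only covers versions shaped like the ones above (`X.Y.Z` with an optional `-rc.N`).

```python
import re

# Matches versions like 1.43.8 or 1.115.29-rc.0 (assumed format).
VERSION_RE = re.compile(r"\b\d+\.\d+\.\d+(?:-rc\.\d+)?")


def stale_versions(context, answer):
    """Return versions from `context` that still appear in `answer`."""
    old = set(VERSION_RE.findall(context))
    new = set(VERSION_RE.findall(answer))
    return sorted(old & new)


CONTEXT = (
    "git commit staging/version.sh -m 'staging/version.sh update for 1.1.1-rc.0' "
    "&& git tag -a 1.1.1-rc.0 -m '' && git push origin master\n"
    "git commit production/version.sh -m 'staging/version.sh update for 1.1.1' "
    "&& git tag -a 1.43.8 -m '' && git push origin master"
)

# Only the production lines are shown; they are where the two answers differ.
ANSWER_2_2_1 = (
    "git commit production/version.sh -m 'staging/version.sh update for 1.115.29' "
    "&& git tag -a 1.43.8 -m '' && git push origin master"
)
ANSWER_2_1 = (
    "git commit production/version.sh -m 'staging/version.sh update for 1.115.29' "
    "&& git tag -a 1.115.29 -m '' && git push origin master"
)

print(stale_versions(CONTEXT, ANSWER_2_2_1))  # the tag that was left unchanged
print(stale_versions(CONTEXT, ANSWER_2_1))    # nothing left over
```

This only catches versions copied verbatim from the context; it would not notice the wrong `staging/version.sh` text in the production commit message.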
