Hestia vs harmonia

#1
by anmol989 - opened

I am testing it right now by generating stories and comparing the results. Will update this post soon!

But yes, I am biased: I expect Hestia to be much better, because merging the model again with Noromaid + Nethena gluid, which have very high-quality data, sounds very good in practice. @athirdpath, what are your thoughts?

I am using a 700-token, very high-quality instruction prompt to generate a very specific story in four parts, using instruct mode in Koboldcpp with min p and dynamic entropy as samplers.
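For anyone unfamiliar with the min p sampler mentioned above, here is a rough sketch of the idea as I understand it: keep only the tokens whose probability is at least `min_p` times the probability of the most likely token, renormalize, then sample. This is a plain-Python illustration, not Koboldcpp's actual code; the function names and the `min_p=0.1` default are my own assumptions, and the dynamic entropy sampler is not covered here.

```python
import math
import random

def softmax(logits):
    """Turn raw logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def min_p_filter(probs, min_p=0.1):
    """Zero out tokens below min_p * (top probability), then renormalize.

    The 0.1 default is an illustrative assumption, not a recommended setting.
    """
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]

def sample_min_p(logits, min_p=0.1, rng=random):
    """Sample one token index from the min-p-filtered distribution."""
    probs = min_p_filter(softmax(logits), min_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Because the cutoff scales with the top token's probability, the filter prunes hard when the model is confident and leaves more options open when the distribution is flat, which is presumably why it suits creative story generation.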

[1st story] : Hestia - 1 , Harmonia - 0 (Hestia nailed the entire story. Harmonia crafted parts 1 and 2 but failed to generate parts 3 and 4 according to the instructions that were given.)

[2nd story] : Hestia - 1 , Harmonia - 0 (Hestia WINS again, and I have to say the model is very creative. Harmonia, on the other hand, is very bad at following even some basic instructions.)

[3rd story] : Hestia - 1 , Harmonia - 0 (Hestia created a good story this time too. Harmonia did generate the story this time, but it was very short and simple.)

WINNER - Hestia

Thank you very much for the feedback!

This supports my hypothesis that Harmonia is a "sloppy" but malleable model that takes well to further transformation. I've got a 512-rank LoRA training on Harmonia right now using the norobots dataset, and I'm hoping it will also flex well into that role.

Would love to see new models built on top of Harmonia.

Also, do you know of any way to test a model's story-writing performance other than the Ayumi benchmark, which only tests roleplay capabilities? http://ayumi.m8geil.de/ayumi_bench_v3_results.html (Your Hestia model is not tested there yet, but I bet it will land in the top 50 when they add it.)

And no, I should be the one thanking you for creating this awesome model :)

Not anything I'm familiar with. I usually judge models off a mix of vibes, difficult questions/scenarios, and consistency; though I find the LLM leaderboard benchmarks are also good for comparing how two related models run "under the hood", so to speak.

If I'm not mistaken, I saw you link to some dataset resources, but didn't open the link yet. Do you still have that reentry on hand?

And you're welcome!

At first i thought it is just some random nsfw story database but no when i read what it is it turned out to be loli - https://rentry.org/ashh2 ( Warning - Moe , thats why i removed it earlier )

Should have some data there that would be highly effective at shattering alignment, thank you.

If you are interested, I have a new "flagship" model built from what I learned with Hestia, Iambe-20b-DARE. Any feedback would be appreciated.

Testing it
