Hestia vs Harmonia

by anmol989 - opened Nov 26, 2023

Nov 26, 2023

I am testing it right now by generating stories and comparing the results. Will update this post soon!

But yes, I am biased: I expect Hestia to be much better, because merging the model again with Noromaid + Nethena gluid, which have very high quality data, sounds very good in practice. @athirdpath, what are your thoughts?

anmol989

Nov 26, 2023

•

edited Nov 26, 2023

I am using a 700-token, very high quality instruction prompt to generate a very specific story in four parts, using instruct mode in Koboldcpp with min p and dynamic entropy as samplers.
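For anyone unfamiliar with min p, here is a rough sketch of the idea as it is commonly described: drop every token whose probability is below `min_p` times the probability of the most likely token, then renormalize and sample from what is left. This is not Koboldcpp's actual code; the function names and the 0.05 default are assumptions made up for illustration, and the dynamic entropy sampler is not covered.

```python
import math
import random


def min_p_filter(logits, min_p=0.05):
    """Return renormalized probabilities after min-p filtering.

    Illustrative only: the name and the default value are assumptions,
    not taken from Koboldcpp.
    """
    # Numerically stable softmax over the raw logits.
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep tokens at least min_p times as likely as the best token.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    # Renormalize so the surviving probabilities sum to 1.
    kept_total = sum(kept)
    return [p / kept_total for p in kept]


def sample_min_p(logits, min_p=0.05, rng=random):
    """Sample one token index from the min-p filtered distribution."""
    probs = min_p_filter(logits, min_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

With logits like `[2.0, 1.0, -5.0]`, the third token falls under the threshold and is never sampled, while the first two keep their relative odds.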

[1st story] : Hestia - 1 , Harmonia - 0 (Hestia nailed the entire story. Harmonia crafted parts 1 and 2 but failed to generate parts 3 and 4 of the story according to the instructions that were given.)

[2nd story] : Hestia - 1 , Harmonia - 0 (Hestia wins again, and I have to say the model is very creative. Harmonia, on the other hand, is very bad at following even some basic instructions.)

[3rd story] : Hestia - 1 , Harmonia - 0 (Hestia created a good story this time as well. Harmonia did generate the story this time, but it was very short and simple.)

WINNER - Hestia

athirdpath

Owner Nov 26, 2023

Thank you very much for the feedback!

This bears out my hypothesis that Harmonia is a "sloppy" but malleable model that takes well to further transformation. I've got a 512-rank LoRA training on Harmonia right now using the norobots dataset; I'm hoping it will also flex well into that role.

anmol989

Nov 26, 2023

•

edited Nov 26, 2023

Would love to see new models on top of Harmonia.

Also, do you know of any way to test model performance for story writing other than the Ayumi benchmark, which only tests a model's roleplay capabilities? http://ayumi.m8geil.de/ayumi_bench_v3_results.html (Your Hestia model is not yet tested, but I bet it will land in the top 50 when they add it.)

And no, I should be the one thanking you for creating this awesome model :)

athirdpath

Owner Nov 26, 2023

Not anything I'm familiar with. I usually judge models off a mix of vibes, difficult questions/scenarios, and consistency; though I find the LLM leaderboard benchmarks are also good for comparing how two related models run "under the hood", so to speak.

If I'm not mistaken, I saw you link to some dataset resources, but I hadn't opened the link yet. Do you still have that rentry on hand?

And you're welcome!

anmol989

Nov 26, 2023

At first i thought it is just some random nsfw story database but no when i read what it is it turned out to be loli - https://rentry.org/ashh2 ( Warning - Moe , thats why i removed it earlier )

athirdpath

Owner Nov 26, 2023

•

edited Nov 26, 2023

Should have some data there that would be highly effective at shattering alignment, thank you.

athirdpath

Owner Nov 29, 2023

If you are interested, I have a new "flagship" model built from what I learned with Hestia, Iambe-20b-DARE. Any feedback would be appreciated.

anmol989

Nov 29, 2023

Testing it
