Are you interested in doing a Kunoichi-esque merge by chance? (Also prompt format)

#1
by ParasiticRogue - opened

Hi! I was wondering if you wanted to bring your skills over to 34B and do something similar to the Maid or Kunoichi merges. There's a small but decent line-up of models to use in the Vicuna mold if you are. Also, is the preferred format for this model still that weird SUS template, or did you change it?

It's still the SUS prompting to keep it consistent. This train just came out of the oven so I haven't yet poked it at all - I uploaded it mostly to get it off of the RunPod it was trained on, heh.

I'm going to take a swing at some 34B models. It's the first time I've tried messing with them so it might take a few swings before I get something worthy :)

You could probably ask Brucethemoose if you need any experienced tips for these bigger models, though he seems to still be focused on the big conga-line method with merging. We've been exchanging ideas back-and-forth in a thread here if you want some more detailed info.

https://huggingface.co/brucethemoose/jondurbin_bagel-dpo-34b-v0.2-exl2-4bpw-fiction/discussions/1

The TL;DR I'd give you for models and methods for later is this:
Capybara is still the best, but if you have trouble using it in Mergekit then I'd recommend Bruce's fixed copy.
Nontoxic-Bagel is the least bugged Bagel flavor for RP/Story. If the merge stops midway through the process, I would recommend just using 2 copies of Nontoxic-Bagel and slerping them together, since that worked for me when doing bigger merges.
There are a lot of Tess derivatives, but from personal experience I just prefer using Tess 1.4 over the prior versions, and the Pallas finetunes of 1.4 are slightly weaker in creative writing skills. (I wouldn't recommend his laser versions, or any of the normal versions below 4 if you really want to have a go at them, since he wasn't aware of some bugs up to v3.)
Nyakura V2 is your go-to RP model for 34B.
Only other unique Vicuna model is Platypus, but it's just "ok" in my eyes.
And for the base model when doing your merge I'd recommend Chargoddard's Yi-34B-200K-Llama variant, just so you don't have to deal with any of the Yi stuff, which can sometimes be incompatible with other things.
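
In case the slerp suggestion above is unfamiliar, here's a rough sketch of what spherical linear interpolation does to a pair of flattened weight vectors. This is a generic, textbook-style illustration in plain Python, not Mergekit's actual code; the `slerp` function, its `eps` threshold, and the toy vectors are all made up for the example.

```python
import math

def slerp(a, b, t, eps=1e-8):
    """Spherically interpolate between two equal-length vectors (hypothetical helper)."""
    lerp = [(1 - t) * x + t * y for x, y in zip(a, b)]
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a < eps or norm_b < eps:
        return lerp  # a zero vector has no direction, so fall back to a plain lerp
    # Angle between the two vectors, with the cosine clamped for numerical safety.
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        return lerp  # (nearly) parallel vectors: slerp degenerates to a lerp
    w_a = math.sin((1 - t) * omega) / sin_omega
    w_b = math.sin(t * omega) / sin_omega
    return [w_a * x + w_b * y for x, y in zip(a, b)]

# Toy example: halfway between two orthogonal unit vectors.
halfway = slerp([1.0, 0.0], [0.0, 1.0], 0.5)

# Two identical "copies" interpolate back to the same values.
same = slerp([0.3, -1.2, 2.0], [0.3, -1.2, 2.0], 0.5)
```

One thing the sketch makes visible: mathematically, slerping two identical vectors just hands back the same vector, so the two-copies trick presumably helps by satisfying the tooling rather than by changing the weights.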

Anyway...I hope this gives you a good starting point, and once again I'd love to see you try your hand on this model size!
