Heralax/MistralMakise-Merged-13b

Oct 29, 2023

I am an amateur, so excuse me for asking: is this model basically a Llama or a Mistral model?

Owner Oct 29, 2023

Thank you for liking the model! As for your question: in a way, it's both. I created this model by finetuning the ReMM Mistral model made by Undi95 on one of my own datasets, and ReMM Mistral is a model created by "merging" different Llama2 and Mistral models. So... it's a hybrid, sorta. Parts of it are Llama, parts of it are Mistral.

Hope that clears things up a bit. Honestly, it's surprising that two differently pretrained models can be merged and still be coherent, but Undi still manages to pull it off time and again.

Alex5000505

Nov 25, 2023

Where can I read more about model merging and gluing?
The model is very interesting, and now I'm thinking about how to do it better, if possible.

Heralax

Owner Nov 28, 2023

@Alex5000505 You can use this tool to merge: https://github.com/cg123/mergekit. It's what everyone I know uses.
The command I use (as basic as you can possibly get) is something like:

mergekit-legacy ./MistralMakise-13b-Merged/  --base-model ./MistralMakise-13b/ --cuda --merge ./ReMM-Mistral-13B/ --weight 0.3 --density 0.5

I think the non-legacy version of it uses YAML files instead of command-line arguments.
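
For anyone wondering what `--weight` and `--density` roughly mean, here is a minimal toy sketch in plain Python. It is an assumption about the general idea (take the difference from the base model, keep only the largest-magnitude fraction of it, then add a scaled copy back to the base), not mergekit's actual code. The function names `trim_delta` and `merge_into_base` and the four-number "models" are hypothetical, made up for illustration.

```python
def trim_delta(delta, density):
    """Keep the largest-magnitude `density` fraction of entries; zero the rest."""
    k = round(density * len(delta))
    # Rank positions by absolute size of the difference, largest first.
    ranked = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(ranked[:k])
    return [d if i in keep else 0.0 for i, d in enumerate(delta)]


def merge_into_base(base, other, weight, density):
    """Toy merge: base + weight * (trimmed difference between other and base)."""
    delta = [o - b for b, o in zip(base, other)]
    trimmed = trim_delta(delta, density)
    return [b + weight * t for b, t in zip(base, trimmed)]


# Hypothetical stand-ins for two models' weights (real models have billions).
base = [1.0, 2.0, 3.0, 4.0]     # plays the role of --base-model
other = [1.1, 4.0, 2.0, 4.05]   # plays the role of --merge

merged = merge_into_base(base, other, weight=0.3, density=0.5)
print(merged)
```

With `density=0.5`, only the two biggest differences (the second and third entries) survive, and `weight=0.3` moves the base 30% of the way toward the other model on those entries. The small differences are dropped entirely, which is the part that is supposed to cut down on interference between models.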

I'm not an authority on merging. ReMM Mistral was made by Undi, one of the most experienced mergers in the community; he knows what he's doing with regard to merging, and I do not.
Sadly, I don't think there are any good blog posts that walk through the subject. Hardly anything's documented in this community; you just need to ask informally on Discord servers.

Note that if you want to one-up this, the real contribution of this model is the dataset; merging back in just stabilized it a bit (a lot) and made it viable for actual use while still allowing it to capitalize on ReMM's writing ability. Proper professional merges can combine dozens of models using very fancy methods. Most of the top-ranking models on Weicon's leaderboard are merges IIRC.

Alex5000505

Dec 11, 2023

Oh, thanks.
About datasets: have you tried finetuning some RP models with psychology datasets?
I suppose it could give better results, but I haven't tried it on my own yet.

Awesome model