How to merge models into an MoE?

#1
by Yhyu13 - opened

Hi,

Just curious, how do you create custom Mixtral-style models? Do the models all have to be Mistral derivatives of the same size?

Thanks!

@Yhyu13
Before you try this, make sure you have a lot of RAM or a big swap file (relying on swap will make it take forever).

I think you can do this with Llama models too, but all the models used have to be of the same architecture and size.
So you can't mix Llama models with Mistral models, or 7B with 13B.

How to use

git clone https://github.com/cg123/mergekit
cd mergekit
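# MoE merging lives on the mixtral branch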
git switch mixtral
git pull

# Use python venv or conda
pip install -e .

# If you want to use the --load-in-4bit or --load-in-8bit flags
pip install scipy bitsandbytes
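
Once that's installed, the merge itself is driven by a YAML config passed to the mergekit-moe script from that branch. The sketch below is only an illustration: the expert model names and prompts are placeholders, and keys like gate_mode should be double-checked against the branch's README, since they may differ between mergekit versions.

# Minimal sketch of an MoE merge config -- the expert models and prompts
# below are placeholders; swap in your own Mistral-7B finetunes
cat > moe-config.yaml <<'EOF'
base_model: mistralai/Mistral-7B-v0.1
gate_mode: hidden     # initialize routers from hidden states for the prompts (see the README for other modes)
dtype: bfloat16
experts:
  - source_model: your-org/mistral-7b-chat-finetune   # placeholder expert
    positive_prompts:
      - "general chat and questions"
  - source_model: your-org/mistral-7b-math-finetune   # placeholder expert
    positive_prompts:
      - "math and step-by-step reasoning"
EOF

# Run the merge; --load-in-4bit reduces RAM use (needs scipy + bitsandbytes)
mergekit-moe moe-config.yaml ./my-custom-moe --load-in-4bit

Each entry under experts becomes one expert in the resulting Mixtral-style model, so the output size grows roughly with the number of experts you list.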


Yes, I totally agree. Thanks for your reply.
