How to merge models into an MoE?

#1
by Yhyu13 - opened

Hi,

Just curious, how do you create custom Mixtral-style models? Do the models all have to be Mistral derivatives of the same size?

Thanks!

@Yhyu13
Before you try this, make sure you have a lot of RAM or a big swap file (relying on swap will make it take forever).

I think you can do this with Llama models too, but all the models used have to be of the same architecture and size.
So you can't mix Llama models with Mistral models, or 7B with 13B.

How to use

git clone https://github.com/cg123/mergekit
cd mergekit
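# MoE merging lives on the mixtral branch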
git switch mixtral
git pull

# Use python venv or conda
pip install -e .

# If you want to use the --load-in-4bit or --load-in-8bit flags
pip install scipy bitsandbytes
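
Once that's installed, the merge itself is driven by a YAML config passed to the mergekit-moe script from that branch. The sketch below is only an illustration: the expert model names and prompts are placeholders, and keys like gate_mode should be double-checked against the branch's README, since they may differ between mergekit versions.

# Minimal sketch of an MoE merge config -- the expert models and prompts
# below are placeholders; swap in your own Mistral-7B finetunes
cat > moe-config.yaml <<'EOF'
base_model: mistralai/Mistral-7B-v0.1
gate_mode: hidden     # initialize routers from hidden states for the prompts (see the README for other modes)
dtype: bfloat16
experts:
  - source_model: your-org/mistral-7b-chat-finetune   # placeholder expert
    positive_prompts:
      - "general chat and questions"
  - source_model: your-org/mistral-7b-math-finetune   # placeholder expert
    positive_prompts:
      - "math and step-by-step reasoning"
EOF

# Run the merge; --load-in-4bit reduces RAM use (needs scipy + bitsandbytes)
mergekit-moe moe-config.yaml ./my-custom-moe --load-in-4bit

Each entry under experts becomes one expert in the resulting Mixtral-style model, so the output size grows roughly with the number of experts you list.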


Yes, I totally agree. Thanks for your reply.
