Is it possible to make Midnight-Mixtral?

by ddh0 - opened


I think Midnight-Miqu is the absolute best RP model around. Would it be possible to make a similar model, but based on Mixtral 8x7B? This would be good because Mixtral is still very smart but also much faster and easier to run than a 70B dense model. Thoughts?

I don't think it's possible to make a Midnight-Mixtral via merging, because Llama 2 / Miqu 70B is a dense architecture while Mixtral 8x7B is a sparse mixture-of-experts one, so the weights don't line up tensor for tensor. Instead, it might be possible for someone to finetune Mixtral on a high-quality synthetic dataset produced with help from Midnight-Miqu, but no such dataset exists yet to my knowledge. I've been kicking around the idea of trying to produce such a dataset myself. If I manage to do it, I'll release the dataset on my HF page and maybe someone will run with it.
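To make the merging obstacle concrete, here is a rough sketch of the kind of check a weight merge implicitly depends on: both models need the same tensor names with the same shapes. The two dictionaries are a small hand-written subset of first-layer tensors, with names and shapes assumed from the publicly released Hugging Face configs for Llama 2 70B and Mixtral 8x7B; `merge_blockers` is a hypothetical helper for illustration, not part of any merging tool.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]

# Assumed, illustrative subset of layer-0 tensors for a Llama 2 70B style model
# (hidden size 8192, dense MLP with intermediate size 28672).
LLAMA2_70B_LIKE: Dict[str, Shape] = {
    "model.layers.0.self_attn.q_proj.weight": (8192, 8192),
    "model.layers.0.mlp.gate_proj.weight": (28672, 8192),
}

# Assumed, illustrative subset of layer-0 tensors for a Mixtral 8x7B style model
# (hidden size 4096, a router plus 8 expert MLPs with intermediate size 14336).
MIXTRAL_8X7B_LIKE: Dict[str, Shape] = {
    "model.layers.0.self_attn.q_proj.weight": (4096, 4096),
    "model.layers.0.block_sparse_moe.gate.weight": (8, 4096),
    "model.layers.0.block_sparse_moe.experts.0.w1.weight": (14336, 4096),
}


def merge_blockers(a: Dict[str, Shape], b: Dict[str, Shape]) -> List[str]:
    """List every tensor that cannot be averaged element-wise between a and b."""
    problems: List[str] = []
    for name in sorted(set(a) | set(b)):
        if name not in a or name not in b:
            # No counterpart tensor exists in the other model.
            problems.append(f"{name}: present in only one model")
        elif a[name] != b[name]:
            # Same name, but the shapes differ, so element-wise ops fail.
            problems.append(f"{name}: shape {a[name]} vs {b[name]}")
    return problems


if __name__ == "__main__":
    for line in merge_blockers(LLAMA2_70B_LIKE, MIXTRAL_8X7B_LIKE):
        print(line)
```

Even in this tiny subset, every tensor is either missing a counterpart or has a different shape, which is why a finetune on generated data looks like the more realistic route.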

That dataset could be amazing, looking forward to it.
