mistral-goliath-12b ?

#6
by edwardDali - opened

I heard Goliath 120B is at GPT-4 level on some benchmarks. Is it possible to use the same merge techniques to generate a merge of 2 Mistral models? It would be interesting to see whether the same capabilities are amplified as well. Maybe a merge of 3 models would be even stronger :)

There is no publicly available 70B Mistral yet, though.

The 7B models are quite strong. Maybe merging multiple small models will improve the overall result. I don't know... newbie here. But your Goliath experiment seems to indicate a valid path.

It doesn't work the way you think. Stacking, say, 3 copies of the same model would introduce a large amount of layer duplication, which would eventually lead to a garbage model if it isn't fine-tuned further.
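For anyone curious what merging two Mistral 7B checkpoints could look like in practice, here is a minimal sketch of a plain weight-average ("model soup") merge. Note this is a simpler technique than the layer-interleaving passthrough merge used for Goliath, and the second model ID below is a hypothetical placeholder; any two fine-tunes sharing the same base architecture and tokenizer would do.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model IDs: substitute any two fine-tunes that share the same
# base architecture and tokenizer (e.g. two Mistral-7B fine-tunes).
MODEL_A = "mistralai/Mistral-7B-Instruct-v0.1"
MODEL_B = "some-org/another-mistral-7b-finetune"  # hypothetical second model

# Load both checkpoints in half precision to keep memory manageable.
model_a = AutoModelForCausalLM.from_pretrained(MODEL_A, torch_dtype=torch.float16)
model_b = AutoModelForCausalLM.from_pretrained(MODEL_B, torch_dtype=torch.float16)

state_a = model_a.state_dict()
state_b = model_b.state_dict()

# Average every matching tensor 50/50. This only makes sense when both
# checkpoints have identical parameter names and shapes.
merged = {name: (state_a[name] + state_b[name]) / 2 for name in state_a}

model_a.load_state_dict(merged)
model_a.save_pretrained("mistral-7b-merged")
AutoTokenizer.from_pretrained(MODEL_A).save_pretrained("mistral-7b-merged")
```

Unlike stacking copies of one model, this keeps the layer count at 7B size; whether the merged weights are actually better than either parent would still need to be checked on benchmarks.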
