Merging Models of Different Parameter Size

by arhanovich

Do you have a successful method for merging models of different parameter sizes?

Kinda. I did some shenanigans with mergekit here in the breaking_math and weighting_interp branches, but I think it's just adding noise. I'm not sure it adds anything useful to the small model: on the Open LLM Leaderboard the merge degrades the model, and some merges tend to generate gibberish or weird words at times.
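For reference, the weight-interpolation side looks roughly like this minimal sketch: blend only tensors whose names and shapes match exactly and keep everything else from the base model. The `alpha` value and output path are made up, and this is an illustration of the general idea, not the exact code from those branches:

```python
import torch
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "teknium/OpenHermes-2-Mistral-7B", torch_dtype=torch.float16
)
other = AutoModelForCausalLM.from_pretrained(
    "KoboldAI/LLaMA2-13B-Tiefighter", torch_dtype=torch.float16
)

alpha = 0.3  # blend weight for the second model (hypothetical value)
base_sd = base.state_dict()
other_sd = other.state_dict()

merged = {}
for name, tensor in base_sd.items():
    if name in other_sd and other_sd[name].shape == tensor.shape:
        # Linear interpolation where the two architectures happen to line up.
        merged[name] = (1 - alpha) * tensor + alpha * other_sd[name]
    else:
        # Mismatched shapes (7B vs 13B hidden sizes) are left untouched,
        # which is part of why this kind of merge mostly adds noise.
        merged[name] = tensor

base.load_state_dict(merged)
base.save_pretrained("naive-interp-merge")
```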

Here I tried a bigger one: teknium/OpenHermes-2-Mistral-7B expanded to 40 layers + KoboldAI/LLaMA2-13B-Tiefighter, but I don't think it's working as intended.
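The 40-layer expansion part is essentially a passthrough-style frankenmerge: duplicate a middle slice of the 32-layer Mistral until you hit 40 layers. A rough sketch of that step (the slice boundaries here are just an example, not the ones I actually used):

```python
import copy
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "teknium/OpenHermes-2-Mistral-7B", torch_dtype=torch.float16
)
layers = model.model.layers  # ModuleList of 32 decoder layers

# Repeat layers 12-19 once: 20 + 8 + 12 = 40 layers total.
# deepcopy so the duplicated layers don't share weights with the originals.
expanded = (
    list(layers[:20])
    + [copy.deepcopy(layer) for layer in layers[12:20]]
    + list(layers[20:])
)
model.model.layers = torch.nn.ModuleList(expanded)
model.config.num_hidden_layers = len(expanded)

model.save_pretrained("openhermes-2-mistral-40layer")
```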
