What's the purpose of this?

#1
by xms991 - opened

Any reason for merging the base model with the instruct model?

edit: apologies if this is a stupid question, I am just curious :)

Yeah, I mostly wanted to try and see if mergekit was compatible with Llama 3 (it is). Now I'm just having fun and evaluating how it performs.
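For anyone wanting to try the same thing: a minimal mergekit SLERP config along these lines will blend the base and instruct weights. This is just a sketch — the model IDs, `layer_range`, and interpolation factor `t` are illustrative assumptions, not the exact settings used for this model.

```yaml
# Hypothetical mergekit SLERP config blending Llama 3 8B base with Instruct.
# Values below are examples, not the author's actual configuration.
slices:
  - sources:
      - model: meta-llama/Meta-Llama-3-8B
        layer_range: [0, 32]
      - model: meta-llama/Meta-Llama-3-8B-Instruct
        layer_range: [0, 32]
merge_method: slerp
base_model: meta-llama/Meta-Llama-3-8B
parameters:
  t: 0.5  # interpolation factor: 0.0 = pure base, 1.0 = pure instruct
dtype: bfloat16
```

You'd then run it with `mergekit-yaml config.yaml ./merged-model` to produce the merged checkpoint.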

Awesome, can’t wait to see what you and others cook up.

I am very curious about your results. Can you share any findings or conclusions? I will be merging Dolphin with Instruct once it finishes downloading. I saw that Dolphin performs significantly worse than Instruct on the "apple" test (writing 10 sentences that end with the word "apple"), and some people reported that Instruct performs better than Dolphin overall. I wonder if I could extract the best of both with mergekit.

@Whatever76474758585 Sure, you can find my results on the YALL Leaderboard (search for "llama").

Basically, I realized that Nous' benchmark suite doesn't correctly capture the performance of Llama 3 8B Instruct, which is slightly surprising. Dolphin looks very strong, I'm pretty sure that it's a good basis for high-quality merges. My early ChimeraLlama-3-8B is pretty good (~instruct level) and even manages to follow the ChatML template.