merge config

#1
by icoderzqliu - opened

Could you share the merge config of your new 124B model? I'm looking forward to it. Thank you!

Thank you for recognizing my work! It's actually a 120B recipe (I miscounted the layers), and the config is now in the repo of my new 120B model. In my 124B model card, I might have exaggerated a bit: I don't have any magic sauce that makes a mere 120B model compete with strong proprietary models like GPT-4 (though it might match some weaker ones). Overall, I feel the models created with this recipe are smarter than the 124B one.
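For readers unfamiliar with how such recipes are written down: below is a minimal sketch of a mergekit passthrough ("frankenmerge") config that stacks overlapping layer slices of a single donor model to build a larger one. The donor model name and layer ranges are illustrative assumptions, not the recipe from this repo; the actual config is in the 120B model's repository as noted above.

```yaml
# Hypothetical mergekit passthrough config (assumed donor model and ranges).
# Stacking overlapping windows of an 80-layer ~72B model yields 140 layers,
# roughly 1.75x the parameter count, i.e. a model in the ~120B range.
slices:
  - sources:
      - model: some-org/donor-72b-chat   # placeholder, not the real base
        layer_range: [0, 20]
  - sources:
      - model: some-org/donor-72b-chat
        layer_range: [10, 30]
  - sources:
      - model: some-org/donor-72b-chat
        layer_range: [20, 40]
  - sources:
      - model: some-org/donor-72b-chat
        layer_range: [30, 50]
  - sources:
      - model: some-org/donor-72b-chat
        layer_range: [40, 60]
  - sources:
      - model: some-org/donor-72b-chat
        layer_range: [50, 70]
  - sources:
      - model: some-org/donor-72b-chat
        layer_range: [60, 80]
merge_method: passthrough   # layers are copied as-is, not averaged
dtype: bfloat16
```

If this were the real config, it could be built with mergekit's CLI, e.g. `mergekit-yaml config.yaml ./merged-model`. Note how the final layer count (and thus the parameter count) depends on the slice ranges, which is how a miscount of the layers can turn an intended 124B into a 120B.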

DisOOM changed discussion status to closed