Here is my recipe (a rough, hedged sketch of each step follows below):

1. Expand the layers of NeuralBeagle to 10.7B à la frankenmerge.
2. DPO-tune the previous model with a high-quality preference dataset, argilla/distilabel-intel-orca-dpo-pairs.
3. Merge the previous model with CarbonVillain (needs --allow-crimes in mergekit! 🔪).
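Step 1 is a passthrough ("frankenmerge") run in mergekit. The sketch below assumes the base model is mlabonne/NeuralBeagle14-7B (the recipe only says "NeuralBeagle") and a SOLAR-style layer split of [0, 24] + [8, 32] to reach roughly 10.7B parameters; the exact slicing behind CarbonBeagle-11B may differ.

```python
# Step 1 (sketch): depth up-scale a 7B model to ~10.7B with a mergekit passthrough merge.
# Assumptions: mlabonne/NeuralBeagle14-7B as the base model and the SOLAR-style
# [0, 24] + [8, 32] layer split; both are illustrative, not confirmed by the post.
import subprocess
from pathlib import Path

FRANKENMERGE_CONFIG = """\
slices:
  - sources:
      - model: mlabonne/NeuralBeagle14-7B   # assumed base checkpoint
        layer_range: [0, 24]
  - sources:
      - model: mlabonne/NeuralBeagle14-7B
        layer_range: [8, 32]
merge_method: passthrough                   # stack the slices without mixing weights
dtype: bfloat16
"""

Path("frankenmerge.yaml").write_text(FRANKENMERGE_CONFIG)

# mergekit-yaml is mergekit's CLI entry point: config in, merged model directory out.
subprocess.run(
    ["mergekit-yaml", "frankenmerge.yaml", "./neuralbeagle-10.7b"],
    check=True,
)
```

The 24 + 24 layer stack mirrors the depth up-scaling used for SOLAR-10.7B, which is why the result lands at roughly 10.7B parameters.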
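Step 2 can be run with TRL's DPOTrainer. The sketch below assumes a recent trl release (where DPOConfig and the processing_class argument exist), a full fine-tune rather than LoRA, and illustrative hyperparameters; none of those details are given in the recipe. In this dataset the prompt lives in the input column, so it is renamed for the trainer, and the system column is simply dropped here.

```python
# Step 2 (sketch): DPO-tune the up-scaled model on the Orca DPO pairs with TRL.
# Assumptions: hyperparameters, full fine-tuning (no LoRA), and the column handling
# are illustrative; the original training setup is not documented in the post.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_path = "./neuralbeagle-10.7b"  # output of the frankenmerge step above
model = AutoModelForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# DPOTrainer expects "prompt"/"chosen"/"rejected"; this dataset stores the prompt as "input".
dataset = load_dataset("argilla/distilabel-intel-orca-dpo-pairs", split="train")
dataset = dataset.rename_column("input", "prompt")
dataset = dataset.select_columns(["prompt", "chosen", "rejected"])

training_args = DPOConfig(
    output_dir="./neuralbeagle-10.7b-dpo",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-6,
    num_train_epochs=1,
    beta=0.1,          # standard DPO temperature
    bf16=True,
)

trainer = DPOTrainer(
    model=model,                   # a frozen copy is used internally as the reference model
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,    # named `tokenizer` in older trl releases
)
trainer.train()
trainer.save_model()
```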
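Step 3 is another mergekit run, and --allow-crimes tells mergekit to proceed even though the two models' configurations don't match exactly. The sketch assumes CarbonVillain refers to jeonsworld/CarbonVillain-en-10.7B-v4 and uses an equal-weight linear merge purely for illustration; the actual checkpoint, merge method, and weights behind CarbonBeagle-11B aren't stated here.

```python
# Step 3 (sketch): merge the DPO-tuned model with CarbonVillain via mergekit.
# Assumptions: jeonsworld/CarbonVillain-en-10.7B-v4 as the CarbonVillain checkpoint
# and an equal-weight linear merge; only the --allow-crimes flag is confirmed above.
import subprocess
from pathlib import Path

FINAL_MERGE_CONFIG = """\
models:
  - model: ./neuralbeagle-10.7b-dpo              # DPO-tuned frankenmerge from step 2
    parameters:
      weight: 0.5
  - model: jeonsworld/CarbonVillain-en-10.7B-v4  # assumed CarbonVillain checkpoint
    parameters:
      weight: 0.5
merge_method: linear                             # illustrative choice of merge method
dtype: bfloat16
"""

Path("carbonbeagle.yaml").write_text(FINAL_MERGE_CONFIG)

# --allow-crimes lets mergekit continue despite mismatched model configs/architectures.
subprocess.run(
    ["mergekit-yaml", "carbonbeagle.yaml", "./CarbonBeagle-11B", "--allow-crimes"],
    check=True,
)
```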
And here is the resulting model, CarbonBeagle-11B, which ranked at the top of the leaderboard for its size class: vicgalle/CarbonBeagle-11B