Will there ever be a Wizard-2 8x22 version?

#5
by UniversalLove333 - opened

Thank you!!

Owner

@UniversalLove333 two things need to happen first. I need to pre-train this model to make it usable in production. Then, if the Wizard group decides to fine-tune it, yes. Since they haven’t shared their datasets in a year now, I think we’d just have to wait on them to fine-tune the models if we want a “Wizard” model from now on 🙁

Is it possible to merge the WizardLM-8x22B experts together using the techniques that you used here, and then heal the result, rather than healing Mistral-22B and then fine-tuning it with WizardLM-2 data?
