About MoE: combining a vocab-extended model with non-vocab-extended models

#3 · opened by ancv

Hi @mlabonne,
Thanks for your great model! By the way, I have a specific question regarding the MoE model. I have a vocab-extended Mistral 7B adapted to handle Vietnamese better, and I want to combine it in a MoE with chat, code, and math experts based on Mistral 7B to enhance the model's capabilities. Is this possible? If there are some differences in token IDs between the extended and non-extended models, and after merging I fine-tune on a moderate amount of data (about 1B tokens), will the model turn out better?

Hi @ancv, thanks! Yes, this should be possible. Fine-tuning should definitely help too, although it might be more cost-efficient to fine-tune your experts instead.
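
For anyone with the same question: before merging, it can help to check exactly how the two tokenizers differ. Below is a minimal sketch that compares the base tokenizer with the extended one; the extended model's Hub ID is a placeholder, and the assumption is that the vocab extension only appends new tokens after the original vocabulary rather than remapping existing IDs.

```python
# Sketch: compare the base Mistral tokenizer with a Vietnamese vocab-extended one
# to see where token IDs diverge before building the MoE.
# "your-org/vi-mistral-7b-extended" is a placeholder for the extended model.
from transformers import AutoTokenizer

base_tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
ext_tok = AutoTokenizer.from_pretrained("your-org/vi-mistral-7b-extended")

print(f"Base vocab size:     {len(base_tok)}")
print(f"Extended vocab size: {len(ext_tok)}")

base_vocab = base_tok.get_vocab()   # token string -> token ID
ext_vocab = ext_tok.get_vocab()

# Shared tokens that map to *different* IDs are the real problem;
# tokens merely appended at the end of the vocabulary are much easier to handle.
shared = set(base_vocab) & set(ext_vocab)
remapped = [t for t in shared if base_vocab[t] != ext_vocab[t]]
added = set(ext_vocab) - set(base_vocab)

print(f"Shared tokens with changed IDs: {len(remapped)}")
print(f"New tokens added by the extension: {len(added)}")
```

If the shared tokens keep their original IDs, the non-extended experts can typically be aligned by resizing their embeddings to the extended vocabulary before or during the merge; if existing IDs were remapped, the post-merge fine-tuning has much more to repair.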
