facebook/mms-1b-fl102 · how to reproduce the fine-tuning

Jun 30, 2023

Hi there,
I have reasonably large amounts of data for multiple low-resource languages, which I believe would further lower the WER on those languages after fine-tuning.

But I am unable to understand how to go about fine-tuning on multiple languages.

Could you please share links/resources on how to reproduce such multi-language fine-tuning?

StephennFernandes

Jul 19, 2023

gently pinging @sanchit-gandhi @patrickvonplaten

Hi guys,

Is there a way to reproduce this kind of multi-language fine-tuning in HF?

sanchit-gandhi

Jul 25, 2023

edited Jul 25, 2023

MMS fine-tuning only updates the weights for language-specific adapter layers (see MMS ASR blog post), so there's not really a notion of having a single adapter for multiple languages. Since you have large amounts of data, what you can do first is traditional CTC fine-tuning on the multiple languages, where you fine-tune the entire model and make predictions with a joint vocabulary output layer. This will improve the model on all the languages of interest. As a second step, the joint vocabulary output layer can be thrown away and language-specific adapter layers fine-tuned as per the MMS ASR blog post. In doing so, you should be able to leverage the linguistic knowledge of the base model across languages.

So in short:

1. Traditional CTC fine-tuning of the entire model on the multiple languages (joint vocabulary output)
2. MMS ASR fine-tuning of the adapter weights on your specific language (single language output)
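
For what it's worth, here is a rough sketch of how the two steps differ in practice. It is only an illustration under assumptions: the language codes `lang_a`/`lang_b`, the toy transcripts, and the helper names `build_vocab` and `load_model_for_stage` are all made up for the example. The adapter calls (`init_adapter_layers`, `freeze_base_model`, `_get_adapters`) are the ones I recall from the MMS ASR blog post, so please double-check them against your `transformers` version.

```python
# Hypothetical toy transcripts, keyed by made-up language codes.
transcripts = {
    "lang_a": ["ab ba"],
    "lang_b": ["bc cb"],
}


def build_vocab(sentences):
    """Map every character in `sentences` to an integer id for CTC."""
    chars = sorted(set("".join(sentences)) - {" "})
    vocab = {ch: i for i, ch in enumerate(chars)}
    vocab["|"] = len(vocab)      # word delimiter used instead of the space
    vocab["[UNK]"] = len(vocab)
    vocab["[PAD]"] = len(vocab)  # also serves as the CTC blank token
    return vocab


# Step 1: a single joint vocabulary over all languages (full-model CTC fine-tuning).
joint_vocab = build_vocab([s for sents in transcripts.values() for s in sents])

# Step 2: one vocabulary per language (adapter-only fine-tuning, one run per language).
per_lang_vocab = {lang: build_vocab(sents) for lang, sents in transcripts.items()}


def load_model_for_stage(checkpoint, vocab_size, adapters_only):
    """Sketch only: requires `transformers` and downloads/loads `checkpoint`.

    Step 1: checkpoint = the pretrained MMS model, adapters_only=False.
    Step 2: checkpoint = the output of step 1 (assumed to be a local path),
            adapters_only=True.
    """
    from transformers import Wav2Vec2ForCTC

    # A new vocab_size gives a freshly initialised output layer, which is
    # how the previous output layer gets "thrown away".
    model = Wav2Vec2ForCTC.from_pretrained(
        checkpoint, vocab_size=vocab_size, ignore_mismatched_sizes=True
    )
    if adapters_only:
        model.init_adapter_layers()   # re-initialise the adapter weights
        model.freeze_base_model()     # freeze everything except the output layer
        for param in model._get_adapters().values():
            param.requires_grad = True  # train adapters (plus the output layer)
    return model
```

The vocabulary part runs on its own; `load_model_for_stage` is only meant to show where the two steps diverge, and you would still plug the returned model into a normal CTC training loop (e.g. `Trainer`) with a matching tokenizer.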

StephennFernandes

Jul 25, 2023

@sanchit-gandhi thanks a ton sanchit for such a clear and detailed explanation. I really appreciate your help 🙏