how to reproduce the fine-tuning

#2
by StephennFernandes - opened

Hi there,
I have a reasonably large amount of data for multiple low-resource languages, which I believe would further lower the WER on those languages after fine-tuning.

But I can't work out how to go about fine-tuning on multiple languages.

Could you please share links/resources on how to reproduce such multi-language fine-tuning?

gently pinging @sanchit-gandhi @patrickvonplaten

Hi guys,

Is there a way anyone could reproduce such multi-language fine-tuning in HF?

MMS fine-tuning only updates the weights for language-specific adapter layers (see MMS ASR blog post), so there's not really a notion of having a single adapter for multiple languages. Since you have large amounts of data, what you can do first is traditional CTC fine-tuning on the multiple languages, where you fine-tune the entire model and make predictions with a joint vocabulary output layer. This will improve the model on all the languages of interest. As a second step, the joint vocabulary output layer can be thrown away and language-specific adapter layers fine-tuned as per the MMS ASR blog post. In doing so, you should be able to leverage the linguistic knowledge of the base model across languages.

So in short:

  1. Traditional CTC fine-tuning of the entire model on the multiple languages (joint vocabulary output)
  2. MMS ASR fine-tuning of the adapter weights on your specific language (single language output)
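
To make the difference between the two output layers concrete, here is a rough sketch of how the vocabularies for each step might be built. It is plain Python with no model code. The function names (`char_vocab`, `build_joint_vocab`, `build_adapter_vocab`), the toy transcripts, and the language codes are all made up for illustration. The special tokens and the nested `{target_lang: vocab}` layout for step 2 follow my reading of the MMS ASR blog post, so please double-check them against the post before relying on this.

```python
import json


def char_vocab(texts):
    """Build a character-level CTC vocabulary from a list of transcripts."""
    chars = set()
    for text in texts:
        chars.update(text.lower())
    # The space is represented by the word delimiter "|", so drop both here
    # and add the delimiter back once as a special token below.
    chars -= {" ", "|"}
    vocab = {ch: idx for idx, ch in enumerate(sorted(chars))}
    # Special tokens (assumed convention: "[PAD]" doubles as the CTC blank).
    for token in ("|", "[UNK]", "[PAD]"):
        vocab[token] = len(vocab)
    return vocab


def build_joint_vocab(transcripts_by_lang):
    """Step 1: one shared output vocabulary covering every language."""
    all_texts = [t for texts in transcripts_by_lang.values() for t in texts]
    return char_vocab(all_texts)


def build_adapter_vocab(target_lang, texts):
    """Step 2: a single-language vocabulary, nested under its language code."""
    return {target_lang: char_vocab(texts)}


if __name__ == "__main__":
    # Toy placeholder transcripts, not real training data.
    data = {
        "tur": ["merhaba dünya"],
        "swh": ["habari ya dunia"],
    }
    joint = build_joint_vocab(data)
    per_lang = build_adapter_vocab("tur", data["tur"])
    print(json.dumps(joint, ensure_ascii=False))
    print(json.dumps(per_lang, ensure_ascii=False))
```

The joint vocabulary is only needed for the full-model CTC run in step 1. For step 2 you would build one single-language vocabulary per adapter and discard the joint one.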

@sanchit-gandhi thanks a ton sanchit for such a clear and detailed explanation. I really appreciate your help 🙏
