I have reasonably large amounts of data for multiple low-resource languages, which I believe would further lower the WER on those languages after fine-tuning.
But I am unable to understand how to go about fine-tuning on multiple languages.
Could you please share links/resources on how to reproduce such multi-language fine-tuning?
MMS fine-tuning only updates the weights for language-specific adapter layers (see MMS ASR blog post), so there's not really a notion of having a single adapter for multiple languages. Since you have large amounts of data, what you can do first is traditional CTC fine-tuning on the multiple languages, where you fine-tune the entire model and make predictions with a joint vocabulary output layer. This will improve the model on all the languages of interest. As a second step, the joint vocabulary output layer can be thrown away and language-specific adapter layers fine-tuned as per the MMS ASR blog post. In doing so, you should be able to leverage the linguistic knowledge of the base model across languages.
So in short:
- Traditional CTC fine-tuning of the entire model on the multiple languages (joint vocabulary output)
- MMS ASR fine-tuning of the adapter weights on your specific language (single language output)
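For the first step, the "joint vocabulary" is just the union of the characters across all your languages, plus the usual CTC special tokens. Here is a rough sketch of how that could be built. The language keys, the toy transcripts, and the `build_joint_vocab` helper are made up for illustration, and the special-token names (`|`, `[UNK]`, `[PAD]`) follow the convention commonly used in Wav2Vec2 CTC fine-tuning examples, so adjust them to your setup.

```python
import json


def build_joint_vocab(transcripts_by_lang):
    """Merge the characters of every language into one CTC vocabulary.

    `transcripts_by_lang` is a hypothetical mapping of
    language code -> list of transcript strings.
    """
    chars = set()
    for transcripts in transcripts_by_lang.values():
        for text in transcripts:
            chars.update(text.lower())

    # Spaces are represented by the word delimiter "|", so drop both here
    # and add "|" back explicitly below.
    chars -= {" ", "|"}

    vocab = {ch: idx for idx, ch in enumerate(sorted(chars))}
    vocab["|"] = len(vocab)      # word delimiter
    vocab["[UNK]"] = len(vocab)  # unknown character
    vocab["[PAD]"] = len(vocab)  # padding, also used as the CTC blank
    return vocab


# Toy data standing in for your real training transcripts (assumption).
toy_data = {
    "lang_a": ["habari yako"],
    "lang_b": ["sannu da zuwa"],
}

joint_vocab = build_joint_vocab(toy_data)

# Serialised form that could be written to a vocab.json file.
vocab_json = json.dumps(joint_vocab, ensure_ascii=False, indent=2)
print(len(joint_vocab), "tokens in the joint vocabulary")
```

The resulting JSON can be saved as `vocab.json` and loaded with `Wav2Vec2CTCTokenizer`, which sets the size of the output layer for the full-model CTC fine-tuning. For the second step you would build a separate, single-language vocabulary in the same way and follow the MMS ASR blog post for the adapter training.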