Could you please suggest how to fine-tune the model on top of Swahili

Opened by nathanhunt

I am trying to understand how to fine-tune the Whisper model for other languages. However, the WhisperTokenizer doesn't support some languages (like Kinyarwanda). I see that you fine-tuned it on top of Swahili. Could you please suggest how to train it like this?

Mbaza NLP org

Hi, I can suggest two options. The first is to pick a host language (e.g. Swahili in this case) and train the target language on top of it. The second option is similar to the first, but you also train a BPE tokenizer on the target language, add the resulting tokens to Whisper's tokenizer, and then train the model, as in the sketch below.
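For the second option, a minimal sketch might look like the following. The corpus file name, vocabulary size, and the `openai/whisper-small` checkpoint are placeholders I'm assuming for illustration, not part of the original answer:

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers
from transformers import WhisperForConditionalGeneration, WhisperTokenizer

# Train a byte-level BPE tokenizer on the target-language text corpus.
# "kinyarwanda_corpus.txt" is a placeholder: one sentence per line.
bpe = Tokenizer(models.BPE())
bpe.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=False)
trainer = trainers.BpeTrainer(vocab_size=8000)  # vocab size is an assumption
bpe.train(files=["kinyarwanda_corpus.txt"], trainer=trainer)

# Load Whisper's tokenizer configured for the host language (Swahili here).
tokenizer = WhisperTokenizer.from_pretrained(
    "openai/whisper-small", language="swahili", task="transcribe"
)

# Add only the tokens Whisper doesn't already know.
new_tokens = [t for t in bpe.get_vocab() if t not in tokenizer.get_vocab()]
tokenizer.add_tokens(new_tokens)

# Resize the model's embedding matrix so the new tokens have trainable rows,
# then fine-tune as usual (e.g. with Seq2SeqTrainer) on the target data.
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
model.resize_token_embeddings(len(tokenizer))
```

For the first option you would skip the tokenizer training entirely: keep the Swahili tokenizer settings and simply fine-tune on the target-language audio/text pairs.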

Thank you very much. I quickly tried training the model with the first option. It works fine on the target language, but it can no longer transcribe other languages. For example, when I try to transcribe Thai audio, the output is always in the target language (not Thai). Is this expected for a fine-tuned model?

Sorry for the late reply. After fine-tuning, the model should still be able to transcribe other languages (see the inference sketch below).
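One thing worth checking (my assumption, not stated in the thread): whether the language token is being forced at generation time, since otherwise Whisper auto-detects the language and a fine-tuned model may default to the target language. A sketch, where the checkpoint path and audio file are placeholders:

```python
import librosa
from transformers import WhisperForConditionalGeneration, WhisperProcessor

# "path/to/finetuned-checkpoint" is a placeholder for your saved model.
processor = WhisperProcessor.from_pretrained("path/to/finetuned-checkpoint")
model = WhisperForConditionalGeneration.from_pretrained("path/to/finetuned-checkpoint")

# Load 16 kHz audio; "thai_sample.wav" is a placeholder file.
speech, _ = librosa.load("thai_sample.wav", sr=16000)
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")

# Force Thai instead of letting the model auto-detect the language.
forced_ids = processor.get_decoder_prompt_ids(language="thai", task="transcribe")
predicted_ids = model.generate(inputs.input_features, forced_decoder_ids=forced_ids)
print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```

Even with the language forced, some quality loss on non-target languages is possible if the fine-tuning data was monolingual.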
