Allow large whisper model?

#1
by saattrupdan - opened

The large Whisper model performs substantially better than the other sizes. It's a lot slower, sure, but it would be nice to have the option 😊

I will add it.
I will also gradually try to add better fine-tuned models for each language from this list: https://huggingface.co/spaces/whisper-event/winners?dataset=mozilla-foundation%2Fcommon_voice_11_0
I will need to convert those models to @ggerganov's implementation. I'm not sure yet whether my conversion from HF models works, but this seems promising: https://github.com/ggerganov/whisper.cpp/issues/325
For example, there is a Danish medium model with a WER of 13.71 on CV11, compared to 14.4 for Whisper-large-v2 on CV9 as reported in the paper.
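
For anyone who wants to sanity-check numbers like these on their own audio, here is a minimal sketch of how WER is usually defined: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the example sentences are just illustrations, not something taken from the leaderboard or the paper. Real evaluations typically also normalize casing and punctuation first, which this sketch does not do, and scores measured on different test sets (CV11 vs CV9 here) are only roughly comparable.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; no text normalization is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One of five reference words is missing from the transcript.
    print(word_error_rate("jeg kan godt lide kaffe", "jeg kan lide kaffe"))
```

Multiply the result by 100 to get a percentage like the figures quoted above.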

Large model added

Great, thanks!

As a potential alternative to the Whisper models, there are more lightweight ones that perform at least as well. In Danish, for instance, we have this Wav2Vec2 model, which achieves a WER of 10.8% without a language model:
https://huggingface.co/chcaa/xls-r-300m-danish-nst-cv9

saattrupdan changed discussion status to closed
