---
license: mit
datasets:
- mozilla-foundation/common_voice_13_0
language:
- ca
- bg
- cs
- fi
- gl
- hi
- hu
- pl
- ro
- sk
- ta
- th
tags:
- automatic-speech-recognition
inference: false
pipeline_tag: automatic-speech-recognition
---

## About

Multilingual DistilWhisper improves automatic speech recognition (ASR) performance in target languages by adding lightweight conditional language-specific routing (CLSR) modules on top of whisper-small. These modules are trained with a mix of cross-entropy (ASR) and knowledge distillation losses, with whisper-large-v2 as the teacher.

## Inference

Code for training and inference is available at: https://github.com/naver/multilingual-distilwhisper

## Citation

```
@inproceedings{ferraz2024distilwhisper,
  title={Multilingual DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts},
  author={Ferraz, Thomas Palmeira and Boito, Marcely Zanon and Brun, Caroline and Nikoulina, Vassilina},
  booktitle={ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2024},
  organization={IEEE}
}
```
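
## Training objective sketch

The About section describes training on a mix of cross-entropy and knowledge distillation losses. The snippet below is a minimal, dependency-free sketch of that idea for a single output token, not the code from the linked repository. The function names, the use of KL divergence as the distillation term, and the `kd_weight` parameter are all assumptions made for illustration; the divergence and weighting actually used are defined in the repository.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(student_logits, target_index):
    """ASR loss: negative log-probability of the reference token."""
    probs = softmax(student_logits)
    return -math.log(probs[target_index])


def kl_divergence(p, q):
    """KL(p || q); used here as an illustrative distillation term (assumption)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def combined_loss(student_logits, teacher_logits, target_index, kd_weight=0.5):
    """Cross-entropy plus a weighted distillation term.

    kd_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    ce = cross_entropy(student_logits, target_index)
    kd = kl_divergence(softmax(teacher_logits), softmax(student_logits))
    return ce + kd_weight * kd


# Example: student (e.g. whisper-small + CLSR) vs. teacher (e.g. whisper-large-v2)
student = [1.0, 2.0, 0.5]
teacher = [0.5, 3.0, 0.2]
loss = combined_loss(student, teacher, target_index=1)
```

When the student and teacher produce identical logits, the distillation term is zero and the combined loss reduces to the plain cross-entropy.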