Best method for fine-tuning Whisper

by Carlosperez - opened

Hey, we have a huge dataset of transcribed news content from TV and radio. Would it be possible to fine-tune Whisper to improve the results? Is the speaker identification problem solved?

Hey @Carlosperez! You can follow this blog post for fine-tuning Whisper: https://huggingface.co/blog/fine-tune-whisper

To get your data into the format expected by HF datasets, follow this guide: https://huggingface.co/docs/datasets/audio_dataset#local-files

Once you've done this, you can simply swap the Common Voice dataset for your own dataset and run the script from start to finish.

On speaker identification: Whisper will transcribe speech no matter who is speaking. It's trained purely to transcribe speech, not to identify who is speaking, so it pays no regard to speaker identity.
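As a rough illustration of the data-formatting step from the guide linked above, here is a minimal sketch that writes the `metadata.csv` file used by the datasets AudioFolder layout. The helper name `write_audiofolder_metadata`, the example file names, and the `sentence` column name are assumptions made for this example (the column name is chosen so it lines up with the Common Voice text column the blog post reads); only the `file_name` column is required by the layout itself.

```python
import csv
from pathlib import Path


def write_audiofolder_metadata(data_dir, transcripts, text_column="sentence"):
    """Write an AudioFolder-style metadata.csv into `data_dir`.

    `transcripts` maps each audio path (relative to `data_dir`) to its
    transcription. The helper and the `text_column` default are
    hypothetical; adjust them to match your preprocessing code.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    metadata_path = data_dir / "metadata.csv"

    with metadata_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        # `file_name` links each row to an audio file in the same folder.
        writer.writerow(["file_name", text_column])
        # Sort for a stable, reproducible row order.
        for file_name, text in sorted(transcripts.items()):
            writer.writerow([file_name, text.strip()])

    return metadata_path
```

With the audio files and this CSV in the same folder, something like `load_dataset("audiofolder", data_dir="my_news_data")` should pick up both the audio and the transcriptions (the folder name is a placeholder). Whisper expects 16 kHz input, so the audio column would typically be resampled with `cast_column("audio", Audio(sampling_rate=16000))`, as the blog post does for Common Voice.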
