Whisper fine-tuning questions

#96
by kangblue - opened

I want to fine-tune the Whisper model on .wav files. Is that possible?
If so, is there any example code or a blog post that covers fine-tuning on audio files in .wav format?

You can load your .wav files as a Hugging Face audio dataset using this guide: https://huggingface.co/docs/datasets/audio_load
Once loaded, you can push the dataset to the Hub using .push_to_hub("my_dataset_name"). This will save a copy under your namespace kangblue/my_dataset_name.
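
As a rough sketch of those two steps: the snippet below assumes your .wav files sit in one folder and that you have a transcription for each file. It writes the metadata.csv that the "audiofolder" loader reads, then loads the folder and pushes it to the Hub. The helper names (write_metadata, load_and_push) and the "transcription" column name are my own choices for illustration, not something the guide prescribes.

```python
import csv
from pathlib import Path


def write_metadata(data_dir, transcripts):
    """Write metadata.csv mapping each .wav file name to its transcription.

    `transcripts` is a dict like {"clip1.wav": "hello world"} (hypothetical data).
    """
    metadata_path = Path(data_dir) / "metadata.csv"
    with metadata_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        # The audiofolder loader looks for a "file_name" column;
        # "transcription" is an assumed name for the text column.
        writer.writerow(["file_name", "transcription"])
        for file_name, text in sorted(transcripts.items()):
            writer.writerow([file_name, text])
    return metadata_path


def load_and_push(data_dir, repo_name):
    """Load the folder as an audio dataset and upload it to the Hub."""
    # Imported here so write_metadata can be used without `datasets` installed.
    from datasets import Audio, load_dataset

    dataset = load_dataset("audiofolder", data_dir=str(data_dir))
    # Whisper's feature extractor expects 16 kHz audio, so resample on load.
    dataset = dataset.cast_column("audio", Audio(sampling_rate=16000))
    # Needs a logged-in Hub account (e.g. via `huggingface-cli login`).
    dataset.push_to_hub(repo_name)
    return dataset
```

You would call write_metadata("my_wavs", {...}) once, then load_and_push("my_wavs", "my_dataset_name").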

Once you've done so, you can follow the fine-tuning blog, replacing mozilla-foundation/common_voice_11_0 with kangblue/my_dataset_name. The steps are otherwise unchanged.
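
To illustrate the swap, here is a minimal sketch of the loading and preprocessing step with the dataset name replaced. One thing to watch: the blog reads the text from Common Voice's "sentence" column, so if your text column has a different name you need to point the preprocessing at it. The "transcription" column name and the load_custom_dataset helper are assumptions for this example.

```python
DATASET_ID = "kangblue/my_dataset_name"  # used in place of mozilla-foundation/common_voice_11_0
TEXT_COLUMN = "transcription"  # assumed column name; Common Voice uses "sentence"


def load_custom_dataset(dataset_id=DATASET_ID, split="train"):
    """Load the pushed dataset from the Hub, resampled to 16 kHz."""
    # Imported here so the rest of the sketch runs without `datasets` installed.
    from datasets import Audio, load_dataset

    dataset = load_dataset(dataset_id, split=split)
    return dataset.cast_column("audio", Audio(sampling_rate=16000))


def prepare_dataset(batch, feature_extractor, tokenizer, text_column=TEXT_COLUMN):
    """Turn one example into Whisper inputs: log-Mel features and label ids."""
    audio = batch["audio"]
    # Compute input features from the raw waveform.
    batch["input_features"] = feature_extractor(
        audio["array"], sampling_rate=audio["sampling_rate"]
    ).input_features[0]
    # Encode the target text to label ids.
    batch["labels"] = tokenizer(batch[text_column]).input_ids
    return batch
```

With the feature extractor and tokenizer created as in the blog, this can be applied with dataset.map(prepare_dataset, fn_kwargs={"feature_extractor": feature_extractor, "tokenizer": tokenizer}).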
