How do you fine-tune Whisper for a classification task rather than transcription?

#1
by nkburns - opened

Hello, @sanchit-gandhi I see you trained a language id model, do you have a blog or a notebook you can share that describes how you fine-tuned for this task?

Hey @nkburns - thanks for reaching out! Currently I don't, but want to have one ready by next week! Will probably share on Twitter (@sanchitgandhi99) when complete

Hello, @sanchit-gandhi - I'm also interested in fine-tuning your PyTorch model for my audio classification task. If there is any update, could you post it here instead of Twitter? Thanks a lot.


Hello, @sanchit-gandhi - thank you very much for the excellent work you have done. I am also very curious about fine-tuning Whisper for an audio classification task; any update on this issue would be greatly appreciated 🤗

Hi @sanchit-gandhi, could you please say whether you have shared a notebook explaining the training process for this model?

Hey all! In the end I didn't get round to making a notebook, but you can reproduce the training steps using the audio-classification example script in Transformers together with this launch command.
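For anyone who wants a sense of what the example script does under the hood, here is a minimal sketch of Whisper-based audio classification using `WhisperForAudioClassification` from `transformers`. The tiny randomly-initialised config, the shapes, and the 3-class setup below are illustrative assumptions, not the exact settings from the launch command; in practice you would load a pretrained checkpoint instead.

```python
# Hedged sketch: Whisper's encoder with a classification head via
# transformers.WhisperForAudioClassification. The tiny config is an
# assumption to keep the example self-contained; for real training you
# would use something like:
#   WhisperForAudioClassification.from_pretrained(
#       "openai/whisper-small", num_labels=num_languages)
import torch
from transformers import WhisperConfig, WhisperForAudioClassification

config = WhisperConfig(
    num_mel_bins=8,            # real Whisper uses 80 mel bins
    d_model=16,
    encoder_layers=2,
    encoder_attention_heads=2,
    decoder_layers=2,          # unused by the classifier, kept small
    decoder_attention_heads=2,
    max_source_positions=100,  # encoder output length
    classifier_proj_size=16,
    num_labels=3,              # e.g. 3 language-ID classes (illustrative)
)
model = WhisperForAudioClassification(config)

# Log-mel input features: (batch, num_mel_bins, 2 * max_source_positions);
# the encoder's second conv has stride 2, halving the time dimension.
features = torch.randn(1, 8, 200)
labels = torch.tensor([1])

out = model(input_features=features, labels=labels)
print(out.logits.shape)   # one logit per class: torch.Size([1, 3])
out.loss.backward()       # cross-entropy loss, ready for an optimizer step
```

The model discards Whisper's decoder entirely: encoder hidden states are pooled, projected, and mapped to `num_labels` logits, which is why the same pretrained encoder transfers from transcription to classification.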

Is there still interest in a Colab? I could put one together if there's sufficient community interest in one (feel free to upvote this comment if so).

Hi @sanchit-gandhi ,

I've reviewed the scripts you shared and the other audio-classification examples, and have put together a notebook based on them. I would really appreciate it if you could take some time to run through the notebook and provide feedback, especially pointing out any major mistakes.

Thanks in advance for your help!
Link for repo with notebook: https://github.com/fitlemon/whisper-small-uz-en-ru-lang-id
