About model performance

by hieunv3 - opened Jul 11, 2023

Jul 11, 2023

Hi Chien,
Thanks for your excellent work on the fine-tuned Whisper model. I am interested in the experimental results shown on the model card.

I downloaded your fine-tuned version and re-ran the evaluation, but the result did not look as expected. It is even worse than the original model when run on the test set of the common-voice-11 dataset.

Here are the results for the top 100 samples in the test set (average WER and CER at the end of the file) for 3 different models: the original version, your version, and the Whisper API. Your version is named "Finetuned model" in the log.

I think there may be a mismatch in how I am using your model. Could you please help me clear up my confusion?
Thank you.

vumichien

Owner Jul 11, 2023

Hi Hieu,

I appreciate your interest in my work.

To fine-tune the Whisper model for Japanese, I used a different preprocessing approach from the OpenAI team.
This was due to the unique nature of the Japanese language.

If you want to evaluate the WER and CER, splitting each sentence into meaningful words is essential, just as in English. Japanese is written without spaces between words, so this step has to be done explicitly.
Below is the code snippet I used to evaluate the fine-tuned model.

https://gist.github.com/vumichien/6bd92a3138c4aeced0f58aa105a60d8b
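To illustrate why the segmentation step matters, here is a minimal sketch in plain Python. It is not the code from the gist above: the `toy_segment` function and its hand-written token lists are hypothetical stand-ins for a real Japanese word segmenter, and the example sentences are made up for illustration.

```python
from typing import Callable, List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_tokens: Sequence[str], hyp_tokens: Sequence[str]) -> float:
    """Edit distance normalized by the reference length."""
    if not ref_tokens:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def wer(ref: str, hyp: str, segment: Callable[[str], List[str]]) -> float:
    """Word error rate, using the given function to split text into words."""
    return error_rate(segment(ref), segment(hyp))


def cer(ref: str, hyp: str) -> float:
    """Character error rate, ignoring spaces."""
    return error_rate(list(ref.replace(" ", "")), list(hyp.replace(" ", "")))


# Hypothetical segmenter: a lookup table of hand-segmented sentences.
# A real evaluation would call a Japanese morphological analyzer here.
_TOY_SEGMENTS = {
    "今日は天気がいいです": ["今日", "は", "天気", "が", "いい", "です"],
    "今日は天気が悪いです": ["今日", "は", "天気", "が", "悪い", "です"],
}


def toy_segment(text: str) -> List[str]:
    return _TOY_SEGMENTS[text]


ref = "今日は天気がいいです"
hyp = "今日は天気が悪いです"

print("WER, whitespace split:", wer(ref, hyp, str.split))
print("WER, word segmentation:", wer(ref, hyp, toy_segment))
print("CER:", cer(ref, hyp))
```

With whitespace splitting, each sentence counts as a single word, so one wrong word makes the whole sentence an error. With segmentation, the same mistake counts as one word out of six. That difference alone can make WER numbers from two evaluation scripts impossible to compare.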

Best regards

hieunv3

Jul 11, 2023

I used your script to run a quick test on a small set first, and the result seems similar to yours. However, I need time to understand your preprocessing steps, partly because I don't know Japanese.

One more comment: if you are planning to open-source your work, could you please also share the fine-tuning script? I am already following the fine-tuning docs provided by Hugging Face, but your approach is significantly better for Japanese.
It would help me and other people a lot in learning to fine-tune a Whisper model for Japanese.

Again, thanks for your work and your help.

hieunv3

Jul 14, 2023

After studying the processing pipeline in your evaluation script and applying it to fine-tuning, I created a fine-tuned version and got a similar result.
Thanks for your help.
I will close this discussion now.

hieunv3 changed discussion status to closed Jul 14, 2023

vumichien

Owner Jul 14, 2023

Hi Hieu,

I am glad to hear that.
If you encounter any issues, please do not hesitate to reopen this discussion.
