
by arabcoders - opened


Thank you for your improved models. Is it possible to apply the same upgrades to the newly released whisper large-v3 model? If not, could you release the steps to reproduce your results so that I can try to do it myself?

Thank you.

P.S.: Sorry, this was meant to be posted in the /clu-ling/whisper-large-v2-japanesee-5k-steps repo.

clu-ling org


Thank you for contacting us. I tried to attach the files I used to fine-tune large-v2 here, but I could not due to restrictions on specific files. So, I used the seq-to-seq script here:

You can use it along with your dataset and any whisper version.




Thank you. If I may bother you a bit more, could you go into a little more detail about your data? Is it regular audio files with subtitles, or something else? It would also be really helpful if you could expand on the commands you used. I found your model to be a big improvement over the regular whisper-v2, at least for Japanese.

Thank you again.

P.S.: If the reason you aren't able to upload the data is its size or something else, you can contact me and I can provide private hosting for it.

clu-ling org

Thank you. The data is freely available. It is the Common Voice dataset:

You can access and download it in any language. If you want to share your email, I can send you the scripts I used.



Good to hear. I attempted to do it with large-v3, but it was failing for some reason. I would really appreciate it if you could send the scripts you used. You can contact me at

Thank you again, and sorry for taking up your time.
