No punctuation

#3
by Sogl-coder - opened

Compared to the original large-v2 (or just large), the output has no punctuation, proper names are lower-cased, and some words contain artifacts.

Example: see the attached screenshot (telegram-cloud-photo-size-2-5201882072304701535-y.jpg).

Yes, this is expected. This model was trained on a Russian dataset that I had access to, which had been preprocessed with a particular focus in mind. Thus, if I recall correctly, all punctuation was removed and all words were lower-cased. I'm not sure about the artifacts in words, however.
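For readers wondering what that kind of preprocessing looks like, here is a minimal sketch. It is an assumption about the general approach, not the actual pipeline used for this dataset; the function name `normalize_transcript` is hypothetical.

```python
import re


def normalize_transcript(text: str) -> str:
    """Lower-case a transcript and strip punctuation (hypothetical helper).

    This is an assumed reconstruction of "remove punctuation, lower-case
    everything", not the dataset's real preprocessing code.
    """
    text = text.lower()
    # In Python 3, \w matches Unicode letters and digits (Cyrillic included),
    # so this replaces every punctuation mark or symbol with a space.
    text = re.sub(r"[^\w\s]", " ", text)
    # \w also matches the underscore, so drop it separately.
    text = text.replace("_", " ")
    # Collapse runs of whitespace left behind by the substitutions.
    return " ".join(text.split())


print(normalize_transcript("Привет, Москва! Как дела?"))
# prints: привет москва как дела
```

A model trained only on targets like the printed line has never seen a capital letter or a comma, so it has no reason to produce them. Note that this sketch also splits hyphenated words into two tokens, which may or may not match what the original preprocessing did.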

mitchelldehaven changed discussion status to closed

effort - 🏆
result - 💩

So the original Whisper is just better lol.


If you need case and punctuation, then yes, you should use the original v2 model or the new v3 model.

In uncased, unpunctuated contexts, this model will likely have a lower word error rate (WER) than the original v2 model, particularly in noisy environments. I'm unsure about the v3 model, as I haven't tested it for Russian, but I assume v3 would be better, as it improved substantially on non-English languages.
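To make the "uncased, unpunctuated" comparison concrete, here is a rough sketch of a word-level WER computed with and without normalizing the reference. The helpers `wer` and `strip_case_and_punct` are hypothetical names written for illustration, and the sentences are made-up examples, not output from either model.

```python
import string


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


def strip_case_and_punct(text: str) -> str:
    """Lower-case and remove ASCII punctuation (an assumed normalization)."""
    return text.lower().translate(str.maketrans("", "", string.punctuation))


reference = "Привет, Москва! Как дела?"   # cased, punctuated ground truth
hypothesis = "привет москва как дела"     # output style of an uncased model

print(wer(reference, hypothesis))                        # prints 1.0
print(wer(strip_case_and_punct(reference), hypothesis))  # prints 0.0
```

Scored against the raw reference, every word counts as an error purely because of case and punctuation. After normalization the same hypothesis is a perfect match, which is why the two models should only be compared on normalized text.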

Can you fine-tune version 3 for Russian?

Unfortunately I cannot; I no longer have access to the compute resources I used for this.
