Would there be confusion between CER and WER on test metrics?

#1
by nikokks - opened

Hello!

The results are very impressive, and I am a little surprised: the best wav2vec2 reference model, trained six months ago on CV 7.0, reports a CER comparable to your WER. Is the score on your test set the WER or the CER? I don't have time to check it myself.

Have a nice day!

See for yourself with these two links:
https://paperswithcode.com/sota/speech-recognition-on-common-voice-7-0-german-1
https://paperswithcode.com/sota/speech-recognition-on-common-voice-7-0-german?metric=Test%20WER

The links you provided are for German, but this is the French model card. I assume you are asking about French: https://paperswithcode.com/sota/automatic-speech-recognition-on-mcv-7-0

Yes, the results calculated here are WER, not CER. We normally do not publish CER scores for languages where WER can be computed.
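To make the distinction concrete, here is a minimal sketch using the jiwer library (the example strings are made up) showing how far apart WER and CER land on the same hypothesis: a single wrong word costs a full word in WER but only a character or two in CER, which is why a strong WER can look like another model's CER.

```python
# Minimal illustration of why CER is numerically much lower than WER
# for the same transcript. Example strings are invented for illustration.
import jiwer

reference = "le chat dort sur le canapé"
hypothesis = "le chats dort sur le canapé"  # one extra character, one wrong word

print("WER:", jiwer.wer(reference, hypothesis))  # 1 substitution / 6 words  ≈ 0.167
print("CER:", jiwer.cer(reference, hypothesis))  # 1 insertion / 26 characters ≈ 0.038
```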

There are a few reasons why the WER here is so competitive:

  1. This is a Conformer Transducer. Transducer models are generally much more accurate than CTC models, and Conformer CTC is also more accurate than wav2vec2 CTC in nearly all cases.
  2. These models are jointly trained: they are trained on both MCV and MLS French, so it is expected that their score on MCV alone is better than that of a model trained only on MCV.
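If anyone wants to double-check the metric themselves, a rough sketch along these lines should work. The checkpoint name, audio paths, and reference transcripts below are placeholders, and the exact `transcribe()` signature and return type vary across NeMo versions, so treat this as a sketch rather than the official evaluation script.

```python
# Rough verification sketch (not the official evaluation recipe).
# Assumes NeMo and jiwer are installed; the checkpoint name and file paths are placeholders.
import jiwer
import nemo.collections.asr as nemo_asr

# Load the Conformer Transducer checkpoint (name assumed, adjust to this card's model id).
asr_model = nemo_asr.models.ASRModel.from_pretrained("stt_fr_conformer_transducer_large")

# Hypothetical files and reference transcripts from the MCV 7.0 French test split.
audio_files = ["mcv_fr_test_0001.wav", "mcv_fr_test_0002.wav"]
references = ["bonjour tout le monde", "merci beaucoup"]

hyps = asr_model.transcribe(audio_files)
if isinstance(hyps, tuple):  # some NeMo versions return (best_hyps, all_hyps) for transducers
    hyps = hyps[0]
hyps = [h.text if hasattr(h, "text") else h for h in hyps]  # normalize to plain strings

print("WER:", jiwer.wer(references, hyps))
print("CER:", jiwer.cer(references, hyps))
```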
