wasertech committed on
Commit f76437d
1 Parent(s): f46cbfb

update custom message

Files changed (1)
  1. constants.py +1 -1
constants.py CHANGED
@@ -112,7 +112,7 @@ Models are sorted by consistancy in their results across testsets. (by increasin
 ### Results
 The CommonVoice Test provides a Word Error Rate (WER) within a 20-point margin of the average WER. While not perfect, this indicates that CommonVoice can be a useful tool for quickly identifying a suitable ASR model for a wide range of languages in a programmatic manner. However, it's important to note that it is not sufficient as the sole criterion for choosing the most appropriate architecture. Further considerations may be needed depending on the specific requirements of your ASR application.
 
-Furthermore, it's important to highlight that opting for the model with the lowest WER on CommonVoice typically aligns closely with selecting a model based on the lowest average WER. This approach has consistently proven effective in pinpointing the best-performing models, even if there is a minor 2.01 point difference (with this data) in the average WER between the model chosen from the lowest average WER and the one chosen from the lowest CommonVoice WER. This observation underscores the robustness of our technique in identifying models with low WER, despite this slight variation.
+Furthermore, it's important to highlight that opting for the model with the lowest WER on CommonVoice typically aligns closely with selecting a model based on the lowest average WER. This approach has consistently proven effective in pinpointing the best-performing models, even if there is a minor 0.01 point difference (with this data) in the average WER between the model chosen from the lowest average WER and the one chosen from the lowest CommonVoice WER. This observation underscores the robustness of our technique in identifying models with low WER, despite this slight variation, which may be due to statistical noise.
 
 Additionally, it has come to our attention that Nvidia's models, trained using NeMo with custom splits from common datasets, including Common Voice, may have had an advantage due to their familiarity with parts of the Common Voice test set. It's important to note that this highlights the need for greater transparency in data usage, as OpenAI itself does not publish the data they used for training. This could explain their strong performance in the results. Transparency in model training and dataset usage is crucial for fair comparisons in the ASR field and ensuring that results align with real-world scenarios.
 
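The selection heuristic the changed paragraph describes, picking a model by its CommonVoice WER as a proxy for picking by average WER across all test sets, can be sketched in a few lines of Python. All model names and WER values below are illustrative placeholders, not numbers from this leaderboard.

```python
# Hypothetical sketch: comparing model selection by CommonVoice WER
# versus selection by average WER across all test sets.
# Model names and WER values are made up for illustration.
results = {
    "model_a": {"commonvoice": 12.3, "librispeech": 8.1, "tedlium": 10.4},
    "model_b": {"commonvoice": 11.9, "librispeech": 9.5, "tedlium": 11.2},
    "model_c": {"commonvoice": 14.0, "librispeech": 7.8, "tedlium": 9.9},
}

def average_wer(wers):
    """Mean WER across all test sets for one model."""
    return sum(wers.values()) / len(wers)

# Pick by lowest CommonVoice WER (the fast, programmatic heuristic).
by_commonvoice = min(results, key=lambda m: results[m]["commonvoice"])

# Pick by lowest average WER (the more thorough criterion).
by_average = min(results, key=lambda m: average_wer(results[m]))

# The gap in average WER between the two picks quantifies how much
# the CommonVoice shortcut costs relative to the full comparison.
gap = average_wer(results[by_commonvoice]) - average_wer(results[by_average])
print(by_commonvoice, by_average, round(gap, 2))
```

With real leaderboard data, a small `gap` (such as the 0.01 points the commit cites) is what supports treating the CommonVoice WER as a reasonable single-testset proxy.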