airesearch/wav2vec2-large-xlsr-53-th

Automatic Speech Recognition

hf-asr-leaderboard

robust-speech-event

xlsr-fine-tuning


cstorm125 committed on Sep 1, 2021

Commit 8b5f968 • 1 Parent(s): 5d0cede

Update README.md

use correct tokenization for wer

Files changed (1)

README.md +5 -4


@@ -117,10 +117,11 @@ training_args = TrainingArguments(
 We benchmark on the test set using WER with words tokenized by [PyThaiNLP](https://github.com/PyThaiNLP/pythainlp) 2.3.1 and CER. We also measure performance when spell correction using [TNC](http://www.arts.chula.ac.th/ling/tnc/) ngrams is applied. Evaluation codes can be found in `notebooks/wav2vec2_finetuning_tutorial.ipynb`
-|                          | WER        | CER        |
-|--------------------------|------------|------------|
-| without spell correction | 0.20755690 | 0.02813019 |
-| with spell correction    | 0.24592172 | 0.05225761 |
+|                               | WER        | CER        |
+|-------------------------------|------------|------------|
+| Ours without spell correction | 0.13634024 | 0.02813019 |
+| Ours with spell correction    | 0.17996397 | 0.05225761 |
+| Google Web Speech API         | 0.13711234 | 0.07357340 |
 ## Ackowledgements
 * model training and validation notebooks/scripts [@cstorm125](https://github.com/cstorm125/)
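
The WER figures change in this commit while the CER figures stay the same, which is what you would expect if only the word tokenization step changed. Thai is written without spaces between words, so the choice of tokenizer decides what counts as a word. The sketch below is not the evaluation code from `notebooks/wav2vec2_finetuning_tutorial.ipynb`. It is a minimal, dependency-free illustration of how WER and CER can be computed with a pluggable tokenizer. The names `edit_distance`, `error_rate`, `wer`, `cer`, and `two_char_chunks` are hypothetical, and the toy tokenizer only stands in for a real word segmenter such as PyThaiNLP's.

```python
from typing import Callable, Iterable, List, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs: Iterable[str], hyps: Iterable[str],
               tokenize: Callable[[str], List[str]]) -> float:
    """Total edits divided by total reference tokens, pooled over all pairs."""
    total_edits = 0
    total_tokens = 0
    for ref, hyp in zip(refs, hyps):
        ref_tokens = tokenize(ref)
        total_edits += edit_distance(ref_tokens, tokenize(hyp))
        total_tokens += len(ref_tokens)
    if total_tokens == 0:
        raise ValueError("references contain no tokens")
    return total_edits / total_tokens


def wer(refs, hyps, word_tokenize: Callable[[str], List[str]] = str.split) -> float:
    """Word error rate; pass a real word segmenter for unspaced scripts."""
    return error_rate(refs, hyps, word_tokenize)


def cer(refs, hyps) -> float:
    """Character error rate; independent of the word tokenizer."""
    return error_rate(refs, hyps, list)


def two_char_chunks(text: str) -> List[str]:
    """Toy stand-in for a word segmenter (assumption, for illustration only)."""
    return [text[i:i + 2] for i in range(0, len(text), 2)]


if __name__ == "__main__":
    refs, hyps = ["abcd"], ["abxd"]
    print(wer(refs, hyps))                    # whitespace split: one "word"
    print(wer(refs, hyps, two_char_chunks))   # segmented into two "words"
    print(cer(refs, hyps))                    # unchanged by the tokenizer
```

With whitespace splitting, an unspaced utterance is scored as a single word, so one wrong character makes the whole utterance wrong. A segmenter spreads the same error over more reference words, while CER is unaffected either way.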