FremyCompany committed on
Commit
f6ca04e
1 Parent(s): 7d3ab33

Improve description of the system

Files changed (1)
  1. README.md +5 -4
README.md CHANGED
@@ -44,16 +44,17 @@ model-index:
  value: 11.26
  ---

- # output
+ # XLS-R-based CTC model with 5-gram language model from Common Voice

- This model is a version of [facebook/wav2vec2-xls-r-2b-22-to-16](https://huggingface.co/facebook/wav2vec2-xls-r-2b-22-to-16) fine-tuned mainly on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - NL dataset (see details below).
- It achieves the following results on the evaluation set (of Common Voice 8.0):
+ This model is a version of [facebook/wav2vec2-xls-r-2b-22-to-16](https://huggingface.co/facebook/wav2vec2-xls-r-2b-22-to-16) fine-tuned mainly on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - NL dataset (see details below), on which a small 5-gram language model is added based on the Common Voice training corpus. This model achieves the following results on the evaluation set (of Common Voice 8.0):
  - Wer: 0.0669
  - Cer: 0.0197

  ## Model description

- The model takes 16kHz sound input, and uses a Wav2Vec2ForCTC decoder with 48 letters to output the final result.
+ The model takes 16kHz sound input, and uses a Wav2Vec2ForCTC decoder with 48 letters to output the final result.
+
+ To improve accuracy, a beam decoder is used; the beams are scored based on 5-gram language model trained on the Common Voice 8 corpus.

  ## Intended uses & limitations

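
The added description mentions CTC output followed by beam scoring with a language model, but the commit itself contains no decoding code. The following is a minimal sketch of that idea only, not the author's implementation. It assumes the common shallow-fusion scoring (acoustic score plus a weighted language-model score plus a word-count bonus); the names `ctc_collapse`, `rescore`, `TOY_LM`, `alpha` and `beta` are hypothetical, and a tiny lookup table stands in for the real 5-gram model.

```python
import math

def ctc_collapse(ids, blank=0):
    """Collapse a frame-level CTC path: merge repeated ids, then drop blanks."""
    out = []
    prev = None
    for i in ids:
        if i != prev and i != blank:
            out.append(i)
        prev = i
    return out

def rescore(hypotheses, lm_logprob, alpha=0.5, beta=1.0):
    """Return the (text, acoustic_logprob) beam with the best combined score.

    Combined score = acoustic + alpha * LM log-probability + beta * word count.
    alpha and beta are illustrative values, not taken from the commit.
    """
    def total(hyp):
        text, acoustic = hyp
        return acoustic + alpha * lm_logprob(text) + beta * len(text.split())
    return max(hypotheses, key=total)

# Hypothetical toy LM standing in for the 5-gram model: it knows two phrases.
TOY_LM = {"het is goed": math.log(0.9), "het is hoed": math.log(0.001)}

def toy_lm_logprob(text):
    # Unknown phrases get a small floor probability.
    return TOY_LM.get(text, math.log(1e-6))

# Two candidate beams; the acoustically better one is the less plausible phrase.
beams = [("het is hoed", -1.0), ("het is goed", -1.2)]
best = rescore(beams, toy_lm_logprob)
print(best[0])
```

In this toy case the language model overturns the acoustic ranking, which is the effect the description attributes to the 5-gram model.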