aapot committed
Commit 9dca1cc
Parent: 4fffd19

Update README.md

Files changed (1)
  1. README.md (+51, −15)
README.md CHANGED
@@ -29,6 +29,20 @@ model-index:
  - name: Test CER
    type: cer
    value: 1.40
+ - task:
+     name: Automatic Speech Recognition
+     type: automatic-speech-recognition
+   dataset:
+     name: FLEURS ASR
+     type: google/fleurs
+     args: fi_fi
+   metrics:
+   - name: Test WER
+     type: wer
+     value: 13.99
+   - name: Test CER
+     type: cer
+     value: 6.07
---

# Wav2Vec2-base-fi-voxpopuli-v2 for Finnish ASR
@@ -150,7 +164,9 @@ The pretrained `facebook/wav2vec2-base-fi-voxpopuli-v2` model was initialized wi

## Evaluation results

- Evaluation was done with the [Common Voice 7.0 Finnish test split](https://huggingface.co/datasets/mozilla-foundation/common_voice_7_0) and with the [Common Voice 9.0 Finnish test split](https://huggingface.co/datasets/mozilla-foundation/common_voice_9_0). This model's training data includes the training splits of Common Voice 9.0 but our previous models include the Common Voice 7.0 so we ran tests for both versions. Note: Common Voice doesn't seem to fully preserve the test split as fixed between the dataset versions so it is possible that some of the training examples of Common Voice 9.0 are in the test split of the Common Voice 7.0 and vice versa. Thus, test result comparisons are not fully accurate between the models trained with different Common Voice versions but the comparison should still be meaningful enough.
+ Evaluation was done with the [Common Voice 7.0 Finnish test split](https://huggingface.co/datasets/mozilla-foundation/common_voice_7_0), the [Common Voice 9.0 Finnish test split](https://huggingface.co/datasets/mozilla-foundation/common_voice_9_0) and the [FLEURS ASR Finnish test split](https://huggingface.co/datasets/google/fleurs).
+
+ This model's training data includes the training splits of Common Voice 9.0, but most of our previous models were trained with Common Voice 7.0, so we ran tests for both Common Voice versions. Note: Common Voice does not seem to keep the test split fixed between dataset versions, so some training examples of Common Voice 9.0 may appear in the test split of Common Voice 7.0 and vice versa. Test result comparisons between models trained with different Common Voice versions are therefore not fully accurate, but they should still be meaningful enough.

### Common Voice 7.0 testing

@@ -160,14 +176,15 @@ To evaluate this model, run the `eval.py` script in this repository:
python3 eval.py --model_id Finnish-NLP/wav2vec2-base-fi-voxpopuli-v2-finetuned --dataset mozilla-foundation/common_voice_7_0 --config fi --split test
```

- This model (the third row of the table) achieves the following WER (Word Error Rate) and CER (Character Error Rate) results compared to our other models and their parameter counts:
+ This model (the first row of the table) achieves the following WER (Word Error Rate) and CER (Character Error Rate) results compared to our other models and their parameter counts:

- | | Model parameters | WER (with LM) | WER (without LM) | CER (with LM) | CER (without LM) |
- |----------------------------------------------------|------------------|---------------|------------------|---------------|------------------|
- |Finnish-NLP/wav2vec2-xlsr-1b-finnish-lm-v2 | 1000 million |**4.09** |**9.73** |**0.88** |**1.65** |
- |Finnish-NLP/wav2vec2-xlsr-1b-finnish-lm | 1000 million |5.65 |13.11 |1.20 |2.23 |
- |Finnish-NLP/wav2vec2-base-fi-voxpopuli-v2-finetuned | 95 million |5.85 |13.52 |1.35 |2.44 |
- |Finnish-NLP/wav2vec2-xlsr-300m-finnish-lm | 300 million |8.16 |17.92 |1.97 |3.36 |
+ | | Model parameters | WER (with LM) | WER (without LM) | CER (with LM) | CER (without LM) |
+ |-------------------------------------------------------|------------------|---------------|------------------|---------------|------------------|
+ |Finnish-NLP/wav2vec2-base-fi-voxpopuli-v2-finetuned | 95 million |5.85 |13.52 |1.35 |2.44 |
+ |Finnish-NLP/wav2vec2-large-uralic-voxpopuli-v2-finnish | 300 million |4.13 |**9.66** |0.90 |1.66 |
+ |Finnish-NLP/wav2vec2-xlsr-300m-finnish-lm | 300 million |8.16 |17.92 |1.97 |3.36 |
+ |Finnish-NLP/wav2vec2-xlsr-1b-finnish-lm | 1000 million |5.65 |13.11 |1.20 |2.23 |
+ |Finnish-NLP/wav2vec2-xlsr-1b-finnish-lm-v2 | 1000 million |**4.09** |9.73 |**0.88** |**1.65** |

### Common Voice 9.0 testing

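The result tables above and below report WER (Word Error Rate) and CER (Character Error Rate) as percentages. As a rough illustration of how such scores are computed, here is a minimal sketch using the Hugging Face `evaluate` library with made-up Finnish reference/prediction pairs (an aside for context, not text from the README itself):

```python
# Illustrative only: compute WER/CER for a few reference/prediction pairs.
# Assumes the `evaluate` package and its jiwer backend are installed
# (pip install evaluate jiwer).
import evaluate

references = [
    "tervetuloa helsinkiin",
    "tänään sataa vettä",
]
predictions = [
    "tervetuloa helsinkiin",
    "tänään sata vettä",  # one substituted word, a couple of wrong characters
]

wer = evaluate.load("wer")  # word error rate
cer = evaluate.load("cer")  # character error rate

# Both metrics return fractions; multiply by 100 to match the tables.
print(f"WER: {100 * wer.compute(predictions=predictions, references=references):.2f}")
print(f"CER: {100 * cer.compute(predictions=predictions, references=references):.2f}")
```
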
@@ -177,14 +194,33 @@ To evaluate this model, run the `eval.py` script in this repository:
python3 eval.py --model_id Finnish-NLP/wav2vec2-base-fi-voxpopuli-v2-finetuned --dataset mozilla-foundation/common_voice_9_0 --config fi --split test
```

- This model (the third row of the table) achieves the following WER (Word Error Rate) and CER (Character Error Rate) results compared to our other models and their parameter counts:
+ This model (the first row of the table) achieves the following WER (Word Error Rate) and CER (Character Error Rate) results compared to our other models and their parameter counts:
+
- | | Model parameters | WER (with LM) | WER (without LM) | CER (with LM) | CER (without LM) |
- |----------------------------------------------------|------------------|---------------|------------------|---------------|------------------|
- |Finnish-NLP/wav2vec2-xlsr-1b-finnish-lm-v2 | 1000 million |**3.72** |**8.96** |**0.80** |**1.52** |
- |Finnish-NLP/wav2vec2-xlsr-1b-finnish-lm | 1000 million |5.35 |13.00 |1.14 |2.20 |
- |Finnish-NLP/wav2vec2-base-fi-voxpopuli-v2-finetuned | 95 million |5.93 |14.08 |1.40 |2.59 |
- |Finnish-NLP/wav2vec2-xlsr-300m-finnish-lm | 300 million |7.42 |16.45 |1.79 |3.07 |
+ | | Model parameters | WER (with LM) | WER (without LM) | CER (with LM) | CER (without LM) |
+ |-------------------------------------------------------|------------------|---------------|------------------|---------------|------------------|
+ |Finnish-NLP/wav2vec2-base-fi-voxpopuli-v2-finetuned | 95 million |5.93 |14.08 |1.40 |2.59 |
+ |Finnish-NLP/wav2vec2-large-uralic-voxpopuli-v2-finnish | 300 million |4.13 |9.83 |0.92 |1.71 |
+ |Finnish-NLP/wav2vec2-xlsr-300m-finnish-lm | 300 million |7.42 |16.45 |1.79 |3.07 |
+ |Finnish-NLP/wav2vec2-xlsr-1b-finnish-lm | 1000 million |5.35 |13.00 |1.14 |2.20 |
+ |Finnish-NLP/wav2vec2-xlsr-1b-finnish-lm-v2 | 1000 million |**3.72** |**8.96** |**0.80** |**1.52** |
+
+ ### FLEURS ASR testing
+
+ To evaluate this model, run the `eval.py` script in this repository:
+
+ ```bash
+ python3 eval.py --model_id Finnish-NLP/wav2vec2-base-fi-voxpopuli-v2-finetuned --dataset google/fleurs --config fi_fi --split test
+ ```
+
+ This model (the first row of the table) achieves the following WER (Word Error Rate) and CER (Character Error Rate) results compared to our other models and their parameter counts:

+ | | Model parameters | WER (with LM) | WER (without LM) | CER (with LM) | CER (without LM) |
+ |-------------------------------------------------------|------------------|---------------|------------------|---------------|------------------|
+ |Finnish-NLP/wav2vec2-base-fi-voxpopuli-v2-finetuned | 95 million |13.99 |17.16 |6.07 |6.61 |
+ |Finnish-NLP/wav2vec2-large-uralic-voxpopuli-v2-finnish | 300 million |12.44 |**14.63** |5.77 |6.22 |
+ |Finnish-NLP/wav2vec2-xlsr-300m-finnish-lm | 300 million |17.72 |23.30 |6.78 |7.67 |
+ |Finnish-NLP/wav2vec2-xlsr-1b-finnish-lm | 1000 million |20.34 |16.67 |6.97 |6.35 |
+ |Finnish-NLP/wav2vec2-xlsr-1b-finnish-lm-v2 | 1000 million |**12.11** |14.89 |**5.65** |**6.06** |


## Team Members
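To make the `eval.py` commands shown above more concrete, here is a minimal transcription sketch. It is not the repository's actual `eval.py`: it only loads the fine-tuned model from the Hugging Face Hub, streams one FLEURS Finnish test sample, and prints the greedy (no language model) prediction next to the reference. The field names and the streaming setup are assumptions based on the public `google/fleurs` dataset.

```python
# A minimal, illustrative sketch of a single transcription; the repository's
# eval.py additionally handles text normalization, batching and the optional
# KenLM language model decoding behind the "with LM" columns.
import torch
from datasets import load_dataset
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="Finnish-NLP/wav2vec2-base-fi-voxpopuli-v2-finetuned",
    device=0 if torch.cuda.is_available() else -1,
)

# FLEURS Finnish test split, streamed so nothing is downloaded up front.
# FLEURS audio is already 16 kHz, matching the model's expected input rate.
fleurs_test = load_dataset("google/fleurs", "fi_fi", split="test", streaming=True)

sample = next(iter(fleurs_test))
prediction = asr(sample["audio"]["array"])

print("reference: ", sample["transcription"])
print("prediction:", prediction["text"])
```

A full evaluation would loop over the whole split, collect the predictions and references, and feed them to a WER/CER computation like the one sketched earlier.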