lucio committed
Commit 53ad625
1 Parent(s): b390dfa

Update README.md

Files changed (1): README.md (+6 -5)
README.md CHANGED
@@ -23,12 +23,12 @@ model-index:
     metrics:
     - name: Test WER
       type: wer
-      value: 22.82
+      value: 29.52
 ---
 
 # Wav2Vec2-Large-XLSR-53-lg
 
-Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on Luganda using the [Common Voice](https://huggingface.co/datasets/common_voice) dataset, using train, validation and other (if the example had more upvotes than downvotes), and taking the test data for validation as well as test.
+Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on Luganda using the [Common Voice](https://huggingface.co/datasets/common_voice) dataset, using train, validation and other (excluding voices that are in the test set), and taking the test data for validation as well as test.
 When using this model, make sure that your speech input is sampled at 16kHz.
 
 ## Usage
@@ -126,10 +126,11 @@ result = test_dataset.map(evaluate, batched=True, batch_size=8)
 print("WER: {:2f}".format(100 * wer.compute(predictions=result["pred_strings"], references=result["norm_text"])))
 ```
 
-**Test Result**: 22.82 %
+**Test Result**: 29.52 %
 
 ## Training
 
-The Common Voice `train`, `validation` and `other` datasets were used for training, augmented to twice the original size with added noise and manipulated pitch, phase and intensity.
+The Common Voice `train`, `validation` and `other` datasets were used for training, excluding voices that are in both the `other` and `test` datasets. The data was augmented to twice the original size with added noise and manipulated pitch, phase and intensity.
+Training proceeded for 60 epochs, on 1 V100 GPU provided by OVHcloud. The `test` data was used for validation.
 
-The script used for training was just the `run_finetuning.py` script provided in OVHcloud's databuzzword/hf-wav2vec image.
+The [script used for training](https://github.com/serapio/transformers/blob/feature/xlsr-finetune/examples/research_projects/wav2vec2/run_common_voice.py) is adapted from the [example script provided in the transformers repo](https://github.com/huggingface/transformers/blob/master/examples/research_projects/wav2vec2/run_common_voice.py).
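The updated card keeps the note that input speech must be sampled at 16 kHz. A minimal sketch of meeting that requirement before feeding audio to the model, assuming `torchaudio` is available (the file path is a placeholder, not a file from this repo):

```python
import torchaudio

# Load an audio clip; "speech.wav" is a placeholder path.
waveform, sample_rate = torchaudio.load("speech.wav")

# The model expects 16 kHz input, so resample anything recorded at a different rate.
if sample_rate != 16_000:
    resampler = torchaudio.transforms.Resample(orig_freq=sample_rate, new_freq=16_000)
    waveform = resampler(waveform)
```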
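The test result quoted in the diff is a word error rate, computed in the card's evaluation snippet (partly visible in the second hunk) with the `wer` metric. A self-contained illustration of the same call, using `datasets.load_metric` and dummy strings in place of real model output:

```python
from datasets import load_metric

# WER compares predicted transcripts to reference transcripts;
# wer.compute returns a fraction, so multiply by 100 to report a percentage.
wer = load_metric("wer")

predictions = ["this is a test transcript"]   # dummy model output
references = ["this is the test transcript"]  # dummy ground truth

print("WER: {:.2f}".format(100 * wer.compute(predictions=predictions, references=references)))
```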
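The new Training section describes doubling the training data with added noise and manipulated pitch, phase and intensity. The augmentation code itself is not part of this commit, so the sketch below is only an illustration of that idea, assuming `numpy` and `librosa`; the parameter values are made up and the phase manipulation is omitted:

```python
import numpy as np
import librosa

def augment(waveform: np.ndarray, sr: int = 16_000) -> np.ndarray:
    """Illustrative augmentation: additive noise, random pitch shift, random gain."""
    # Additive Gaussian noise scaled to a small fraction of the signal's RMS.
    rms = np.sqrt(np.mean(waveform ** 2))
    augmented = waveform + 0.005 * rms * np.random.randn(len(waveform))

    # Random pitch shift of up to two semitones in either direction.
    augmented = librosa.effects.pitch_shift(augmented, sr=sr, n_steps=float(np.random.uniform(-2, 2)))

    # Random intensity (gain) change between 0.8x and 1.2x.
    return augmented * np.random.uniform(0.8, 1.2)
```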