elgeish committed on
Commit
9df034d
1 Parent(s): e904c32

update README with training details

Files changed (2)
  1. README.md +28 -4
  2. validation_wer.png +0 -0
README.md CHANGED
@@ -25,13 +25,16 @@ model-index:
  - name: Test WER
  type: wer
  value: 26.55
+ - name: Validation WER
+ type: wer
+ value: 23.39
  ---
 
  # Wav2Vec2-Large-XLSR-53-Arabic
 
  Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53)
- on Arabic using the [Common Voice](https://huggingface.co/datasets/common_voice)
- and the [Arabic Speech Corpus](https://huggingface.co/datasets/arabic_speech_corpus) datasets.
+ on Arabic using the `train` splits of [Common Voice](https://huggingface.co/datasets/common_voice)
+ and [Arabic Speech Corpus](https://huggingface.co/datasets/arabic_speech_corpus).
  When using this model, make sure that your speech input is sampled at 16kHz.
 
  ## Usage
@@ -174,5 +177,26 @@ print(f"WER: {metrics['wer']:.2%}")
 
  ## Training
 
- You can find the script used to produce this model
- [here](https://github.com/elgeish/transformers/blob/cfc0bd01f2ac2ea3a5acc578ef2e204bf4304de7/examples/research_projects/wav2vec2/finetune_base_arabic_speech_corpus.sh).
+ For more details, see [Fine-Tuning with Arabic Speech Corpus](https://github.com/huggingface/transformers/tree/1c06240e1b3477728129bb58e7b6c7734bb5074e/examples/research_projects/wav2vec2#fine-tuning-with-arabic-speech-corpus).
+
+ This model represents Arabic in a format called [Buckwalter transliteration](https://en.wikipedia.org/wiki/Buckwalter_transliteration).
+ The Buckwalter format only includes ASCII characters, some of which are non-alpha (e.g., `">"` maps to `"أ"`).
+ The [lang-trans](https://github.com/kariminf/lang-trans) package is used to convert (transliterate) Arabic abjad.
+
+ [This script](https://github.com/huggingface/transformers/blob/1c06240e1b3477728129bb58e7b6c7734bb5074e/examples/research_projects/wav2vec2/finetune_large_xlsr_53_arabic_speech_corpus.sh)
+ was used to first fine-tune [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53)
+ on the `train` split of the [Arabic Speech Corpus](https://huggingface.co/datasets/arabic_speech_corpus) dataset;
+ the `validation` split was used for model selection; the resulting model at this point is saved as [elgeish/wav2vec2-large-xlsr-53-levantine-arabic](https://huggingface.co/elgeish/wav2vec2-large-xlsr-53-levantine-arabic).
+
+ Training was then resumed using the `train` split of the [Common Voice](https://huggingface.co/datasets/common_voice) dataset;
+ similarly, the `validation` split was used for model selection;
+ training was stopped to meet the deadline of [Fine-Tune-XLSR Week](https://github.com/huggingface/transformers/blob/700229f8a4003c4f71f29275e0874b5ba58cd39d/examples/research_projects/wav2vec2/FINE_TUNE_XLSR_WAV2VEC2.md):
+ this model is the checkpoint at 100k steps and a validation WER of **23.39%**.
+
+ <img src="validation_wer.png" alt="Validation WER" width="50%" />
+
+ It's worth noting that validation WER is trending down, indicating the potential of further training (resuming the decaying learning rate at 7e-6).
+
+ ## Future Work
+ One area to explore is using `attention_mask` in model input, which is recommended [here](https://huggingface.co/blog/fine-tune-xlsr-wav2vec2).
+ Also, exploring data augmentation using datasets used to train models listed [here](https://paperswithcode.com/sota/speech-recognition-on-common-voice-arabic).
validation_wer.png ADDED
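
Note on the Buckwalter format mentioned in the Training section: the added README text says the model represents Arabic as ASCII-only Buckwalter transliteration and that the lang-trans package does the conversion. As a rough illustration of what that character mapping involves, here is a minimal Python sketch built on a hand-written table of the Buckwalter letters (diacritics omitted). The names `BUCKWALTER_TO_ARABIC`, `transliterate`, and `untransliterate` are made up for this sketch; it is not the lang-trans implementation and should not be taken as the code used to train the model.

```python
# Hypothetical, self-contained sketch of Buckwalter <-> Arabic conversion.
# NOT the lang-trans package; letters only, diacritics are left out.
BUCKWALTER_TO_ARABIC = {
    "'": "ء", "|": "آ", ">": "أ", "&": "ؤ", "<": "إ", "}": "ئ",
    "A": "ا", "b": "ب", "p": "ة", "t": "ت", "v": "ث", "j": "ج",
    "H": "ح", "x": "خ", "d": "د", "*": "ذ", "r": "ر", "z": "ز",
    "s": "س", "$": "ش", "S": "ص", "D": "ض", "T": "ط", "Z": "ظ",
    "E": "ع", "g": "غ", "f": "ف", "q": "ق", "k": "ك", "l": "ل",
    "m": "م", "n": "ن", "h": "ه", "w": "و", "Y": "ى", "y": "ي",
}

# The table is one-to-one, so the reverse direction is a simple inversion.
ARABIC_TO_BUCKWALTER = {ar: bw for bw, ar in BUCKWALTER_TO_ARABIC.items()}


def untransliterate(text):
    """Map Buckwalter ASCII to Arabic script; unknown characters pass through."""
    return "".join(BUCKWALTER_TO_ARABIC.get(ch, ch) for ch in text)


def transliterate(text):
    """Map Arabic script to Buckwalter ASCII; unknown characters pass through."""
    return "".join(ARABIC_TO_BUCKWALTER.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(untransliterate("slAm"))  # سلام
    print(transliterate("أ"))       # >
```

Because several Buckwalter symbols are punctuation (`>`, `<`, `*`, `$`, `'`), any text cleanup that strips punctuation has to run before transliteration, or it would delete real letters from the model's vocabulary.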