---
language:
- sr
license: apache-2.0
tags:
- automatic-speech-recognition
- mozilla-foundation/common_voice_8_0
- generated_from_trainer
- robust-speech-event
- xlsr-fine-tuning-week
- hf-asr-leaderboard
datasets:
- mozilla-foundation/common_voice_8_0
- name: Serbian comodoro Wav2Vec2 XLSR 300M CV8
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Common Voice 8
      type: mozilla-foundation/common_voice_8_0
      args: sr
    metrics:
    - name: Test WER
      type: wer
      value: 48.5
    - name: Test CER
      type: cer
      value: 18.4
model-index:
- name: wav2vec2-xls-r-300m-sr-cv8
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Common Voice 8.0
      type: mozilla-foundation/common_voice_8_0
      args: sr
    metrics:
    - name: Test WER
      type: wer
      value: 48.53
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Robust Speech Event - Dev Data
      type: speech-recognition-community-v2/dev_data
      args: sr
    metrics:
    - name: Test WER
      type: wer
      value: 97.43
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Robust Speech Event - Test Data
      type: speech-recognition-community-v2/eval_data
      args: sr
    metrics:
    - name: Test WER
      type: wer
      value: 96.69
---

# Serbian wav2vec2-xls-r-300m-sr-cv8

This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the Serbian (`sr`) subset of the mozilla-foundation/common_voice_8_0 dataset.
It achieves the following results on the evaluation set (final checkpoint, step 63600):
- Loss: 1.7302
- WER: 0.4825
- CER: 0.1847

Evaluation on mozilla-foundation/common_voice_8_0 gave the following results:

- WER: 0.48530097993467103
- CER: 0.18413288165227845

Evaluation on speech-recognition-community-v2/dev_data gave the following results:

- WER: 0.9718373107518604
- CER: 0.8302740620263108
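
For readers unfamiliar with the metrics above, here is a minimal sketch of how word error rate (WER) and character error rate (CER) are commonly defined: the Levenshtein edit distance between reference and hypothesis, divided by the reference length. The function names and the sample sentence are illustrative assumptions, not part of this repository. The reported scores come from the evaluation script, which aggregates over the whole dataset and may normalize text first.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character-level edit distance divided by the reference length."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical utterance: one wrong character in one of three words.
reference = "dobar dan svima"
hypothesis = "dobar dam svima"
print(f"WER = {wer(reference, hypothesis):.4f}")
print(f"CER = {cer(reference, hypothesis):.4f}")
```

A single wrong character makes the whole word count as an error, which is one reason WER is typically much higher than CER on the same output.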

The model can be evaluated using the attached `eval.py` script:
```
python eval.py --model_id comodoro/wav2vec2-xls-r-300m-sr-cv8 --dataset mozilla-foundation/common_voice_8_0 --split test --config sr
```

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 300
- num_epochs: 800
- mixed_precision_training: Native AMP
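
To illustrate what the `linear` scheduler with 300 warmup steps implies for the learning rate, here is a minimal pure-Python sketch of a linear warmup followed by linear decay to zero. The function name `linear_lr` is hypothetical. The total of 64,000 steps is an assumption derived from the training results table (1200 steps at epoch 15, so 80 steps per epoch over 800 epochs); it is not stated in the hyperparameters.

```python
def linear_lr(step, base_lr=1e-4, warmup_steps=300, total_steps=64_000):
    """Learning rate at a given optimizer step.

    Rises linearly from 0 to base_lr over warmup_steps, then decays
    linearly to 0 at total_steps. total_steps is an assumed value
    (800 epochs x 80 steps per epoch), not taken from the training config.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 150, 300, 32_000, 63_600, 64_000):
    print(f"step {step:>6}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the peak rate of 1e-4 is reached at step 300, and by the last logged step (63600) the rate has decayed to well under 1% of its peak.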

### Training results

| Training Loss | Epoch | Step | Validation Loss | WER | CER |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|
| 5.6536 | 15.0 | 1200 | 2.9744 | 1.0 | 1.0 |
| 2.7935 | 30.0 | 2400 | 1.6613 | 0.8998 | 0.4670 |
| 1.6538 | 45.0 | 3600 | 0.9248 | 0.6918 | 0.2699 |
| 1.2446 | 60.0 | 4800 | 0.9151 | 0.6452 | 0.2398 |
| 1.0766 | 75.0 | 6000 | 0.9110 | 0.5995 | 0.2207 |
| 0.9548 | 90.0 | 7200 | 1.0273 | 0.5921 | 0.2149 |
| 0.8919 | 105.0 | 8400 | 0.9929 | 0.5646 | 0.2117 |
| 0.8185 | 120.0 | 9600 | 1.0850 | 0.5483 | 0.2069 |
| 0.7692 | 135.0 | 10800 | 1.1001 | 0.5394 | 0.2055 |
| 0.7249 | 150.0 | 12000 | 1.1018 | 0.5380 | 0.1958 |
| 0.6786 | 165.0 | 13200 | 1.1344 | 0.5114 | 0.1941 |
| 0.6432 | 180.0 | 14400 | 1.1516 | 0.5054 | 0.1905 |
| 0.6009 | 195.0 | 15600 | 1.3149 | 0.5324 | 0.1991 |
| 0.5773 | 210.0 | 16800 | 1.2468 | 0.5124 | 0.1903 |
| 0.559 | 225.0 | 18000 | 1.2186 | 0.4956 | 0.1922 |
| 0.5298 | 240.0 | 19200 | 1.4483 | 0.5333 | 0.2085 |
| 0.5136 | 255.0 | 20400 | 1.2871 | 0.4802 | 0.1846 |
| 0.4824 | 270.0 | 21600 | 1.2891 | 0.4974 | 0.1885 |
| 0.4669 | 285.0 | 22800 | 1.3283 | 0.4942 | 0.1878 |
| 0.4511 | 300.0 | 24000 | 1.4502 | 0.5002 | 0.1994 |
| 0.4337 | 315.0 | 25200 | 1.4714 | 0.5035 | 0.1911 |
| 0.4221 | 330.0 | 26400 | 1.4971 | 0.5124 | 0.1962 |
| 0.3994 | 345.0 | 27600 | 1.4473 | 0.5007 | 0.1920 |
| 0.3892 | 360.0 | 28800 | 1.3904 | 0.4937 | 0.1887 |
| 0.373 | 375.0 | 30000 | 1.4971 | 0.4946 | 0.1902 |
| 0.3657 | 390.0 | 31200 | 1.4208 | 0.4900 | 0.1821 |
| 0.3559 | 405.0 | 32400 | 1.4648 | 0.4895 | 0.1835 |
| 0.3476 | 420.0 | 33600 | 1.4848 | 0.4946 | 0.1829 |
| 0.3276 | 435.0 | 34800 | 1.5597 | 0.4979 | 0.1873 |
| 0.3193 | 450.0 | 36000 | 1.7329 | 0.5040 | 0.1980 |
| 0.3078 | 465.0 | 37200 | 1.6379 | 0.4937 | 0.1882 |
| 0.3058 | 480.0 | 38400 | 1.5878 | 0.4942 | 0.1921 |
| 0.2987 | 495.0 | 39600 | 1.5590 | 0.4811 | 0.1846 |
| 0.2931 | 510.0 | 40800 | 1.6001 | 0.4825 | 0.1849 |
| 0.276 | 525.0 | 42000 | 1.7388 | 0.4942 | 0.1918 |
| 0.2702 | 540.0 | 43200 | 1.7037 | 0.4839 | 0.1866 |
| 0.2619 | 555.0 | 44400 | 1.6704 | 0.4755 | 0.1840 |
| 0.262 | 570.0 | 45600 | 1.6042 | 0.4751 | 0.1865 |
| 0.2528 | 585.0 | 46800 | 1.6402 | 0.4821 | 0.1865 |
| 0.2442 | 600.0 | 48000 | 1.6693 | 0.4886 | 0.1862 |
| 0.244 | 615.0 | 49200 | 1.6203 | 0.4765 | 0.1792 |
| 0.2388 | 630.0 | 50400 | 1.6829 | 0.4830 | 0.1828 |
| 0.2362 | 645.0 | 51600 | 1.8100 | 0.4928 | 0.1888 |
| 0.2224 | 660.0 | 52800 | 1.7746 | 0.4932 | 0.1899 |
| 0.2218 | 675.0 | 54000 | 1.7752 | 0.4946 | 0.1901 |
| 0.2201 | 690.0 | 55200 | 1.6775 | 0.4788 | 0.1844 |
| 0.2147 | 705.0 | 56400 | 1.7085 | 0.4844 | 0.1851 |
| 0.2103 | 720.0 | 57600 | 1.7624 | 0.4848 | 0.1864 |
| 0.2101 | 735.0 | 58800 | 1.7213 | 0.4783 | 0.1835 |
| 0.1983 | 750.0 | 60000 | 1.7452 | 0.4848 | 0.1856 |
| 0.2015 | 765.0 | 61200 | 1.7525 | 0.4872 | 0.1869 |
| 0.1969 | 780.0 | 62400 | 1.7443 | 0.4844 | 0.1852 |
| 0.2043 | 795.0 | 63600 | 1.7302 | 0.4825 | 0.1847 |
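
As a reading aid for the table above, the sketch below compares which logged checkpoint would be chosen under different selection criteria. It uses a handful of rows copied from the table, including the rows holding the minimum validation loss, WER, and CER. The tuple layout and variable names are illustrative assumptions.

```python
# (step, epoch, validation_loss, wer, cer), copied from the table above
rows = [
    (6000, 75.0, 0.9110, 0.5995, 0.2207),
    (20400, 255.0, 1.2871, 0.4802, 0.1846),
    (45600, 570.0, 1.6042, 0.4751, 0.1865),
    (49200, 615.0, 1.6203, 0.4765, 0.1792),
    (63600, 795.0, 1.7302, 0.4825, 0.1847),
]

best_by_loss = min(rows, key=lambda r: r[2])
best_by_wer = min(rows, key=lambda r: r[3])
best_by_cer = min(rows, key=lambda r: r[4])

print("lowest validation loss: step", best_by_loss[0])
print("lowest WER:             step", best_by_wer[0])
print("lowest CER:             step", best_by_cer[0])
```

The three criteria pick three different checkpoints: validation loss bottoms out early (step 6000) while WER and CER keep improving for hundreds of epochs afterwards, so validation loss alone would be a poor selection criterion for this run.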

### Framework versions

- Transformers 4.16.2
- PyTorch 1.10.1+cu102
- Datasets 1.18.3
- Tokenizers 0.11.0