---
language:
- sr
license: apache-2.0
tags:
- automatic-speech-recognition
- mozilla-foundation/common_voice_8_0
- generated_from_trainer
- robust-speech-event
- xlsr-fine-tuning-week
- hf-asr-leaderboard
datasets:
- mozilla-foundation/common_voice_8_0
model-index:
- name: Serbian comodoro Wav2Vec2 XLSR 300M CV8
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Common Voice 8
      type: mozilla-foundation/common_voice_8_0
      args: sr
    metrics:
    - name: Test WER
      type: wer
      value: 48.5
    - name: Test CER
      type: cer
      value: 18.4
- name: wav2vec2-xls-r-300m-sr-cv8
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Common Voice 8.0
      type: mozilla-foundation/common_voice_8_0
      args: sr
    metrics:
    - name: Test WER
      type: wer
      value: 48.53
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Robust Speech Event - Dev Data
      type: speech-recognition-community-v2/dev_data
      args: sr
    metrics:
    - name: Test WER
      type: wer
      value: 97.43
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Robust Speech Event - Test Data
      type: speech-recognition-community-v2/eval_data
      args: sr
    metrics:
    - name: Test WER
      type: wer
      value: 96.69
---

# Serbian wav2vec2-xls-r-300m-sr-cv8

This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the Serbian subset of the Common Voice 8.0 dataset. It achieves the following results on the evaluation set:

- Loss: 1.7302
- WER: 0.4825
- CER: 0.1847

Evaluation on the mozilla-foundation/common_voice_8_0 test split gave the following results:

- WER: 0.4853
- CER: 0.1841

Evaluation on speech-recognition-community-v2/dev_data gave the following results:

- WER: 0.9718
- CER: 0.8303

The model can be evaluated using the attached `eval.py` script:

```
python eval.py --model_id comodoro/wav2vec2-xls-r-300m-sr-cv8 --dataset mozilla-foundation/common_voice_8_0 --split test --config sr
```
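The model can also be used directly for transcription through the standard wav2vec2 CTC interface in Transformers. A minimal sketch, assuming a local recording (`sample.wav` is a hypothetical path) and the `librosa` package for loading audio at the 16 kHz rate the model expects:

```python
import librosa
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("comodoro/wav2vec2-xls-r-300m-sr-cv8")
model = Wav2Vec2ForCTC.from_pretrained("comodoro/wav2vec2-xls-r-300m-sr-cv8")

# `sample.wav` is a placeholder; any 16 kHz mono recording works
speech, _ = librosa.load("sample.wav", sr=16_000)

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: take the most likely token at each frame,
# then let the tokenizer collapse repeats and blanks
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```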
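The reported test figures can be approximated with a script of the following shape. This is only a sketch of the evaluation loop, not the attached `eval.py`: it reuses the `model` and `processor` loaded above, assumes access to the gated Common Voice 8.0 dataset and the `jiwer` package, and omits any text normalization the evaluation script may apply, so the exact numbers can differ:

```python
import torch
from datasets import Audio, load_dataset
from jiwer import cer, wer

# Common Voice 8.0 is gated; an authenticated Hugging Face login is required
test = load_dataset("mozilla-foundation/common_voice_8_0", "sr",
                    split="test", use_auth_token=True)
test = test.cast_column("audio", Audio(sampling_rate=16_000))

def transcribe(waveform):
    inputs = processor(waveform, sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return processor.batch_decode(torch.argmax(logits, dim=-1))[0]

references, hypotheses = [], []
for sample in test:
    references.append(sample["sentence"])
    hypotheses.append(transcribe(sample["audio"]["array"]))

print("WER:", wer(references, hypotheses))
print("CER:", cer(references, hypotheses))
```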
### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 300
- num_epochs: 800
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step  | Validation Loss | WER    | CER    |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|
| 5.6536        | 15.0  | 1200  | 2.9744          | 1.0    | 1.0    |
| 2.7935        | 30.0  | 2400  | 1.6613          | 0.8998 | 0.4670 |
| 1.6538        | 45.0  | 3600  | 0.9248          | 0.6918 | 0.2699 |
| 1.2446        | 60.0  | 4800  | 0.9151          | 0.6452 | 0.2398 |
| 1.0766        | 75.0  | 6000  | 0.9110          | 0.5995 | 0.2207 |
| 0.9548        | 90.0  | 7200  | 1.0273          | 0.5921 | 0.2149 |
| 0.8919        | 105.0 | 8400  | 0.9929          | 0.5646 | 0.2117 |
| 0.8185        | 120.0 | 9600  | 1.0850          | 0.5483 | 0.2069 |
| 0.7692        | 135.0 | 10800 | 1.1001          | 0.5394 | 0.2055 |
| 0.7249        | 150.0 | 12000 | 1.1018          | 0.5380 | 0.1958 |
| 0.6786        | 165.0 | 13200 | 1.1344          | 0.5114 | 0.1941 |
| 0.6432        | 180.0 | 14400 | 1.1516          | 0.5054 | 0.1905 |
| 0.6009        | 195.0 | 15600 | 1.3149          | 0.5324 | 0.1991 |
| 0.5773        | 210.0 | 16800 | 1.2468          | 0.5124 | 0.1903 |
| 0.559         | 225.0 | 18000 | 1.2186          | 0.4956 | 0.1922 |
| 0.5298        | 240.0 | 19200 | 1.4483          | 0.5333 | 0.2085 |
| 0.5136        | 255.0 | 20400 | 1.2871          | 0.4802 | 0.1846 |
| 0.4824        | 270.0 | 21600 | 1.2891          | 0.4974 | 0.1885 |
| 0.4669        | 285.0 | 22800 | 1.3283          | 0.4942 | 0.1878 |
| 0.4511        | 300.0 | 24000 | 1.4502          | 0.5002 | 0.1994 |
| 0.4337        | 315.0 | 25200 | 1.4714          | 0.5035 | 0.1911 |
| 0.4221        | 330.0 | 26400 | 1.4971          | 0.5124 | 0.1962 |
| 0.3994        | 345.0 | 27600 | 1.4473          | 0.5007 | 0.1920 |
| 0.3892        | 360.0 | 28800 | 1.3904          | 0.4937 | 0.1887 |
| 0.373         | 375.0 | 30000 | 1.4971          | 0.4946 | 0.1902 |
| 0.3657        | 390.0 | 31200 | 1.4208          | 0.4900 | 0.1821 |
| 0.3559        | 405.0 | 32400 | 1.4648          | 0.4895 | 0.1835 |
| 0.3476        | 420.0 | 33600 | 1.4848          | 0.4946 | 0.1829 |
| 0.3276        | 435.0 | 34800 | 1.5597          | 0.4979 | 0.1873 |
| 0.3193        | 450.0 | 36000 | 1.7329          | 0.5040 | 0.1980 |
| 0.3078        | 465.0 | 37200 | 1.6379          | 0.4937 | 0.1882 |
| 0.3058        | 480.0 | 38400 | 1.5878          | 0.4942 | 0.1921 |
| 0.2987        | 495.0 | 39600 | 1.5590          | 0.4811 | 0.1846 |
| 0.2931        | 510.0 | 40800 | 1.6001          | 0.4825 | 0.1849 |
| 0.276         | 525.0 | 42000 | 1.7388          | 0.4942 | 0.1918 |
| 0.2702        | 540.0 | 43200 | 1.7037          | 0.4839 | 0.1866 |
| 0.2619        | 555.0 | 44400 | 1.6704          | 0.4755 | 0.1840 |
| 0.262         | 570.0 | 45600 | 1.6042          | 0.4751 | 0.1865 |
| 0.2528        | 585.0 | 46800 | 1.6402          | 0.4821 | 0.1865 |
| 0.2442        | 600.0 | 48000 | 1.6693          | 0.4886 | 0.1862 |
| 0.244         | 615.0 | 49200 | 1.6203          | 0.4765 | 0.1792 |
| 0.2388        | 630.0 | 50400 | 1.6829          | 0.4830 | 0.1828 |
| 0.2362        | 645.0 | 51600 | 1.8100          | 0.4928 | 0.1888 |
| 0.2224        | 660.0 | 52800 | 1.7746          | 0.4932 | 0.1899 |
| 0.2218        | 675.0 | 54000 | 1.7752          | 0.4946 | 0.1901 |
| 0.2201        | 690.0 | 55200 | 1.6775          | 0.4788 | 0.1844 |
| 0.2147        | 705.0 | 56400 | 1.7085          | 0.4844 | 0.1851 |
| 0.2103        | 720.0 | 57600 | 1.7624          | 0.4848 | 0.1864 |
| 0.2101        | 735.0 | 58800 | 1.7213          | 0.4783 | 0.1835 |
| 0.1983        | 750.0 | 60000 | 1.7452          | 0.4848 | 0.1856 |
| 0.2015        | 765.0 | 61200 | 1.7525          | 0.4872 | 0.1869 |
| 0.1969        | 780.0 | 62400 | 1.7443          | 0.4844 | 0.1852 |
| 0.2043        | 795.0 | 63600 | 1.7302          | 0.4825 | 0.1847 |

### Framework versions

- Transformers 4.16.2
- Pytorch 1.10.1+cu102
- Datasets 1.18.3
- Tokenizers 0.11.0
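For orientation, the training hyperparameters listed above map roughly onto Transformers `TrainingArguments` as follows. This is a hypothetical sketch of the configuration, not the original training script:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the listed hyperparameters;
# the actual training script is not part of this card.
training_args = TrainingArguments(
    output_dir="wav2vec2-xls-r-300m-sr-cv8",
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=300,
    num_train_epochs=800,
    fp16=True,  # corresponds to "Native AMP" mixed-precision training
)
```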