---
language:
- sr
license: apache-2.0
tags:
- automatic-speech-recognition
- mozilla-foundation/common_voice_8_0
- generated_from_trainer
- robust-speech-event
- xlsr-fine-tuning-week
- hf-asr-leaderboard
datasets:
- mozilla-foundation/common_voice_8_0
model-index:
- name: Serbian comodoro Wav2Vec2 XLSR 300M CV8
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Common Voice 8
      type: mozilla-foundation/common_voice_8_0
      args: sr
    metrics:
    - name: Test WER
      type: wer
      value: 48.5
    - name: Test CER
      type: cer
      value: 18.4
- name: wav2vec2-xls-r-300m-sr-cv8
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Common Voice 8.0
      type: mozilla-foundation/common_voice_8_0
      args: sr
    metrics:
    - name: Test WER
      type: wer
      value: 48.53
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Robust Speech Event - Dev Data
      type: speech-recognition-community-v2/dev_data
      args: sr
    metrics:
    - name: Test WER
      type: wer
      value: 97.43
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Robust Speech Event - Test Data
      type: speech-recognition-community-v2/eval_data
      args: sr
    metrics:
    - name: Test WER
      type: wer
      value: 96.69
---

# Serbian wav2vec2-xls-r-300m-sr-cv8

This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the Serbian subset of the Common Voice 8.0 dataset. It achieves the following results on the evaluation set:

- Loss: 1.7302
- WER: 0.4825
- CER: 0.1847

Evaluation on the mozilla-foundation/common_voice_8_0 test split gave the following results:

- WER: 0.4853
- CER: 0.1841

Evaluation on speech-recognition-community-v2/dev_data gave the following results:

- WER: 0.9718
- CER: 0.8303

The model can be evaluated using the attached `eval.py` script:

```
python eval.py --model_id comodoro/wav2vec2-xls-r-300m-sr-cv8 --dataset mozilla-foundation/common_voice_8_0 --split test --config sr
```
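The model can also be used directly for transcription through the standard wav2vec2 CTC interface in Transformers. A minimal sketch, assuming a local recording (`sample.wav` is a hypothetical path) and the `librosa` package for loading audio at the 16 kHz rate the model expects:

```python
import librosa
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("comodoro/wav2vec2-xls-r-300m-sr-cv8")
model = Wav2Vec2ForCTC.from_pretrained("comodoro/wav2vec2-xls-r-300m-sr-cv8")

# `sample.wav` is a placeholder; any 16 kHz mono recording works
speech, _ = librosa.load("sample.wav", sr=16_000)

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: take the most likely token at each frame,
# then let the tokenizer collapse repeats and blanks
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```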
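The reported test figures can be approximated with a script of the following shape. This is only a sketch of the evaluation loop, not the attached `eval.py`: it reuses the `model` and `processor` loaded above, assumes access to the gated Common Voice 8.0 dataset and the `jiwer` package, and omits any text normalization the evaluation script may apply, so the exact numbers can differ:

```python
import torch
from datasets import Audio, load_dataset
from jiwer import cer, wer

# Common Voice 8.0 is gated; an authenticated Hugging Face login is required
test = load_dataset("mozilla-foundation/common_voice_8_0", "sr",
                    split="test", use_auth_token=True)
test = test.cast_column("audio", Audio(sampling_rate=16_000))

def transcribe(waveform):
    inputs = processor(waveform, sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return processor.batch_decode(torch.argmax(logits, dim=-1))[0]

references, hypotheses = [], []
for sample in test:
    references.append(sample["sentence"])
    hypotheses.append(transcribe(sample["audio"]["array"]))

print("WER:", wer(references, hypotheses))
print("CER:", cer(references, hypotheses))
```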
### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 300
- num_epochs: 800
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step  | Validation Loss | WER    | CER    |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|
| 5.6536        | 15.0  | 1200  | 2.9744          | 1.0    | 1.0    |
| 2.7935        | 30.0  | 2400  | 1.6613          | 0.8998 | 0.4670 |
| 1.6538        | 45.0  | 3600  | 0.9248          | 0.6918 | 0.2699 |
| 1.2446        | 60.0  | 4800  | 0.9151          | 0.6452 | 0.2398 |
| 1.0766        | 75.0  | 6000  | 0.9110          | 0.5995 | 0.2207 |
| 0.9548        | 90.0  | 7200  | 1.0273          | 0.5921 | 0.2149 |
| 0.8919        | 105.0 | 8400  | 0.9929          | 0.5646 | 0.2117 |
| 0.8185        | 120.0 | 9600  | 1.0850          | 0.5483 | 0.2069 |
| 0.7692        | 135.0 | 10800 | 1.1001          | 0.5394 | 0.2055 |
| 0.7249        | 150.0 | 12000 | 1.1018          | 0.5380 | 0.1958 |
| 0.6786        | 165.0 | 13200 | 1.1344          | 0.5114 | 0.1941 |
| 0.6432        | 180.0 | 14400 | 1.1516          | 0.5054 | 0.1905 |
| 0.6009        | 195.0 | 15600 | 1.3149          | 0.5324 | 0.1991 |
| 0.5773        | 210.0 | 16800 | 1.2468          | 0.5124 | 0.1903 |
| 0.559         | 225.0 | 18000 | 1.2186          | 0.4956 | 0.1922 |
| 0.5298        | 240.0 | 19200 | 1.4483          | 0.5333 | 0.2085 |
| 0.5136        | 255.0 | 20400 | 1.2871          | 0.4802 | 0.1846 |
| 0.4824        | 270.0 | 21600 | 1.2891          | 0.4974 | 0.1885 |
| 0.4669        | 285.0 | 22800 | 1.3283          | 0.4942 | 0.1878 |
| 0.4511        | 300.0 | 24000 | 1.4502          | 0.5002 | 0.1994 |
| 0.4337        | 315.0 | 25200 | 1.4714          | 0.5035 | 0.1911 |
| 0.4221        | 330.0 | 26400 | 1.4971          | 0.5124 | 0.1962 |
| 0.3994        | 345.0 | 27600 | 1.4473          | 0.5007 | 0.1920 |
| 0.3892        | 360.0 | 28800 | 1.3904          | 0.4937 | 0.1887 |
| 0.373         | 375.0 | 30000 | 1.4971          | 0.4946 | 0.1902 |
| 0.3657        | 390.0 | 31200 | 1.4208          | 0.4900 | 0.1821 |
| 0.3559        | 405.0 | 32400 | 1.4648          | 0.4895 | 0.1835 |
| 0.3476        | 420.0 | 33600 | 1.4848          | 0.4946 | 0.1829 |
| 0.3276        | 435.0 | 34800 | 1.5597          | 0.4979 | 0.1873 |
| 0.3193        | 450.0 | 36000 | 1.7329          | 0.5040 | 0.1980 |
| 0.3078        | 465.0 | 37200 | 1.6379          | 0.4937 | 0.1882 |
| 0.3058        | 480.0 | 38400 | 1.5878          | 0.4942 | 0.1921 |
| 0.2987        | 495.0 | 39600 | 1.5590          | 0.4811 | 0.1846 |
| 0.2931        | 510.0 | 40800 | 1.6001          | 0.4825 | 0.1849 |
| 0.276         | 525.0 | 42000 | 1.7388          | 0.4942 | 0.1918 |
| 0.2702        | 540.0 | 43200 | 1.7037          | 0.4839 | 0.1866 |
| 0.2619        | 555.0 | 44400 | 1.6704          | 0.4755 | 0.1840 |
| 0.262         | 570.0 | 45600 | 1.6042          | 0.4751 | 0.1865 |
| 0.2528        | 585.0 | 46800 | 1.6402          | 0.4821 | 0.1865 |
| 0.2442        | 600.0 | 48000 | 1.6693          | 0.4886 | 0.1862 |
| 0.244         | 615.0 | 49200 | 1.6203          | 0.4765 | 0.1792 |
| 0.2388        | 630.0 | 50400 | 1.6829          | 0.4830 | 0.1828 |
| 0.2362        | 645.0 | 51600 | 1.8100          | 0.4928 | 0.1888 |
| 0.2224        | 660.0 | 52800 | 1.7746          | 0.4932 | 0.1899 |
| 0.2218        | 675.0 | 54000 | 1.7752          | 0.4946 | 0.1901 |
| 0.2201        | 690.0 | 55200 | 1.6775          | 0.4788 | 0.1844 |
| 0.2147        | 705.0 | 56400 | 1.7085          | 0.4844 | 0.1851 |
| 0.2103        | 720.0 | 57600 | 1.7624          | 0.4848 | 0.1864 |
| 0.2101        | 735.0 | 58800 | 1.7213          | 0.4783 | 0.1835 |
| 0.1983        | 750.0 | 60000 | 1.7452          | 0.4848 | 0.1856 |
| 0.2015        | 765.0 | 61200 | 1.7525          | 0.4872 | 0.1869 |
| 0.1969        | 780.0 | 62400 | 1.7443          | 0.4844 | 0.1852 |
| 0.2043        | 795.0 | 63600 | 1.7302          | 0.4825 | 0.1847 |

### Framework versions

- Transformers 4.16.2
- Pytorch 1.10.1+cu102
- Datasets 1.18.3
- Tokenizers 0.11.0
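For orientation, the training hyperparameters listed above map roughly onto Transformers `TrainingArguments` as follows. This is a hypothetical sketch of the configuration, not the original training script:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the listed hyperparameters;
# the actual training script is not part of this card.
training_args = TrainingArguments(
    output_dir="wav2vec2-xls-r-300m-sr-cv8",
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=300,
    num_train_epochs=800,
    fp16=True,  # corresponds to "Native AMP" mixed-precision training
)
```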