---
language:
- fr
license: apache-2.0
tags:
- automatic-speech-recognition
- mozilla-foundation/common_voice_9_0
- generated_from_trainer
- hf-asr-leaderboard
- robust-speech-event
datasets:
- mozilla-foundation/common_voice_9_0
model-index:
- name: Fine-tuned Wav2Vec2 XLS-R 1B model for ASR in French
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Common Voice 9
      type: mozilla-foundation/common_voice_9_0
      args: fr
    metrics:
    - name: Test WER
      type: wer
      value: 12.72
    - name: Test CER
      type: cer
      value: 3.78
    - name: Test WER (+LM)
      type: wer
      value: 10.60
    - name: Test CER (+LM)
      type: cer
      value: 3.41
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Robust Speech Event - Dev Data
      type: speech-recognition-community-v2/dev_data
      args: fr
    metrics:
    - name: Test WER
      type: wer
      value: 24.28
    - name: Test CER
      type: cer
      value: 11.46
    - name: Test WER (+LM)
      type: wer
      value: 20.85
    - name: Test CER (+LM)
      type: cer
      value: 11.09
---


# Fine-tuned Wav2Vec2 XLS-R 1B model for ASR in French

This model is a fine-tuned version of [facebook/wav2vec2-xls-r-1b](https://huggingface.co/facebook/wav2vec2-xls-r-1b) on the French (`fr`) configuration of the `mozilla-foundation/common_voice_9_0` dataset.
At the end of training (step 36000), it achieves the following results on the evaluation set:
- Loss: 0.1430
- WER: 0.1245 (word error rate as a fraction, i.e. 12.45%)
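
For readers unfamiliar with the metric, word error rate is the word-level edit distance (substitutions, deletions, insertions) between the reference and the predicted transcript, divided by the number of reference words. The function below is a minimal pure-Python sketch of that definition. The name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; it is not the scoring code used to produce the numbers in this card, which may apply additional text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokenizes on whitespace and applies no normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("chat" -> "chien") and one deletion ("le") over 6 words.
    print(word_error_rate("le chat dort sur le lit", "le chien dort sur lit"))
```

Because the denominator is the reference length, the value can exceed 1.0 when the hypothesis contains many insertions.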

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10.0
- mixed_precision_training: Native AMP
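
The relationship between some of these values can be checked with a few lines of arithmetic. The sketch below assumes a single training device (consistent with 16 × 8 = 128) and about 36,000 optimizer steps in total, taken from the last step logged in the results table; the true total may be slightly higher. The function `linear_schedule_lr` is a hypothetical helper that mimics a linear warmup followed by linear decay, and the trainer's exact rounding of the warmup length may differ by a step.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 16
gradient_accumulation_steps = 8
learning_rate = 1e-4
warmup_ratio = 0.1

# Assumptions (not stated in the card): one device, ~36,000 total steps.
num_devices = 1
total_steps = 36_000

effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
warmup_steps = int(total_steps * warmup_ratio)


def linear_schedule_lr(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size)
    print("warmup steps:", warmup_steps)
    for step in (0, 1_800, 3_600, 18_000, 36_000):
        print(step, linear_schedule_lr(step))
```

Under these assumptions the learning rate peaks at 1e-4 around step 3,600, which may help explain why the validation WER plateaus near 0.20 during the first epoch before improving steadily.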

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Wer    |
|:-------------:|:-----:|:-----:|:---------------:|:------:|
| 0.9229        | 0.14  | 500   | 0.5049          | 0.4008 |
| 0.3823        | 0.28  | 1000  | 0.2831          | 0.2297 |
| 0.3079        | 0.42  | 1500  | 0.2385          | 0.1951 |
| 0.2899        | 0.55  | 2000  | 0.2273          | 0.1978 |
| 0.2795        | 0.69  | 2500  | 0.2329          | 0.1983 |
| 0.2863        | 0.83  | 3000  | 0.2289          | 0.1991 |
| 0.3063        | 0.97  | 3500  | 0.2370          | 0.2046 |
| 0.2766        | 1.11  | 4000  | 0.2322          | 0.2021 |
| 0.2749        | 1.25  | 4500  | 0.2332          | 0.2055 |
| 0.2769        | 1.39  | 5000  | 0.2322          | 0.2035 |
| 0.2628        | 1.53  | 5500  | 0.2242          | 0.1948 |
| 0.2614        | 1.66  | 6000  | 0.2303          | 0.1962 |
| 0.2547        | 1.8   | 6500  | 0.2238          | 0.1920 |
| 0.2458        | 1.94  | 7000  | 0.2186          | 0.1894 |
| 0.231         | 2.08  | 7500  | 0.2169          | 0.1895 |
| 0.2309        | 2.22  | 8000  | 0.2131          | 0.1870 |
| 0.2258        | 2.36  | 8500  | 0.2133          | 0.1818 |
| 0.2278        | 2.5   | 9000  | 0.2176          | 0.1878 |
| 0.2263        | 2.63  | 9500  | 0.2030          | 0.1813 |
| 0.2262        | 2.77  | 10000 | 0.2077          | 0.1824 |
| 0.2228        | 2.91  | 10500 | 0.2115          | 0.1840 |
| 0.2118        | 3.05  | 11000 | 0.2093          | 0.1782 |
| 0.2073        | 3.19  | 11500 | 0.2004          | 0.1756 |
| 0.2015        | 3.33  | 12000 | 0.1988          | 0.1748 |
| 0.214         | 3.47  | 12500 | 0.2088          | 0.1816 |
| 0.2075        | 3.61  | 13000 | 0.1976          | 0.1746 |
| 0.2039        | 3.74  | 13500 | 0.1958          | 0.1744 |
| 0.2003        | 3.88  | 14000 | 0.1931          | 0.1693 |
| 0.1886        | 4.02  | 14500 | 0.1964          | 0.1686 |
| 0.1943        | 4.16  | 15000 | 0.1986          | 0.1746 |
| 0.1919        | 4.3   | 15500 | 0.1957          | 0.1700 |
| 0.1857        | 4.44  | 16000 | 0.1907          | 0.1671 |
| 0.1834        | 4.58  | 16500 | 0.1877          | 0.1641 |
| 0.18          | 4.71  | 17000 | 0.1828          | 0.1600 |
| 0.1774        | 4.85  | 17500 | 0.1863          | 0.1605 |
| 0.1755        | 4.99  | 18000 | 0.1833          | 0.1595 |
| 0.1692        | 5.13  | 18500 | 0.1814          | 0.1569 |
| 0.1674        | 5.27  | 19000 | 0.1819          | 0.1566 |
| 0.1664        | 5.41  | 19500 | 0.1805          | 0.1572 |
| 0.1677        | 5.55  | 20000 | 0.1803          | 0.1560 |
| 0.1637        | 5.68  | 20500 | 0.1750          | 0.1525 |
| 0.1628        | 5.82  | 21000 | 0.1774          | 0.1532 |
| 0.1645        | 5.96  | 21500 | 0.1744          | 0.1527 |
| 0.1551        | 6.1   | 22000 | 0.1778          | 0.1543 |
| 0.1505        | 6.24  | 22500 | 0.1754          | 0.1528 |
| 0.1499        | 6.38  | 23000 | 0.1743          | 0.1500 |
| 0.1491        | 6.52  | 23500 | 0.1684          | 0.1473 |
| 0.1477        | 6.66  | 24000 | 0.1661          | 0.1472 |
| 0.1456        | 6.79  | 24500 | 0.1654          | 0.1440 |
| 0.1415        | 6.93  | 25000 | 0.1654          | 0.1448 |
| 0.136         | 7.07  | 25500 | 0.1616          | 0.1407 |
| 0.132         | 7.21  | 26000 | 0.1625          | 0.1410 |
| 0.1323        | 7.35  | 26500 | 0.1604          | 0.1404 |
| 0.1338        | 7.49  | 27000 | 0.1574          | 0.1386 |
| 0.13          | 7.63  | 27500 | 0.1576          | 0.1384 |
| 0.1291        | 7.76  | 28000 | 0.1551          | 0.1366 |
| 0.1277        | 7.9   | 28500 | 0.1542          | 0.1356 |
| 0.1241        | 8.04  | 29000 | 0.1545          | 0.1350 |
| 0.1198        | 8.18  | 29500 | 0.1536          | 0.1322 |
| 0.1204        | 8.32  | 30000 | 0.1547          | 0.1337 |
| 0.1195        | 8.46  | 30500 | 0.1494          | 0.1309 |
| 0.1169        | 8.6   | 31000 | 0.1490          | 0.1300 |
| 0.1159        | 8.74  | 31500 | 0.1485          | 0.1305 |
| 0.1142        | 8.87  | 32000 | 0.1479          | 0.1292 |
| 0.1087        | 9.01  | 32500 | 0.1471          | 0.1284 |
| 0.1076        | 9.15  | 33000 | 0.1467          | 0.1270 |
| 0.1078        | 9.29  | 33500 | 0.1467          | 0.1270 |
| 0.1073        | 9.43  | 34000 | 0.1447          | 0.1256 |
| 0.108         | 9.57  | 34500 | 0.1447          | 0.1257 |
| 0.106         | 9.71  | 35000 | 0.1438          | 0.1255 |
| 0.1052        | 9.84  | 35500 | 0.1428          | 0.1247 |
| 0.1044        | 9.98  | 36000 | 0.1430          | 0.1245 |

## Evaluation

1. To evaluate on the `test` split of `mozilla-foundation/common_voice_9_0`:

```bash
python eval.py \
  --model_id "bhuang/wav2vec2-xls-r-1b-french" \
  --dataset "mozilla-foundation/common_voice_9_0" \
  --config "fr" \
  --split "test" \
  --log_outputs
```

2. To evaluate on the `validation` split of `speech-recognition-community-v2/dev_data`, using chunked inference for long recordings:

```bash
python eval.py \
  --model_id "bhuang/wav2vec2-xls-r-1b-french" \
  --dataset "speech-recognition-community-v2/dev_data" \
  --config "fr" \
  --split "validation" \
  --chunk_length_s 5.0 \
  --stride_length_s 1.0 \
  --log_outputs
```
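
The `--chunk_length_s 5.0` and `--stride_length_s 1.0` options suggest that long recordings are split into overlapping 5-second windows, with 1 second of context on each side that is used for inference but discarded when the pieces are stitched together. The sketch below illustrates that layout in plain Python. It assumes the same stride on both sides of every interior window and is only an approximation of the idea; `chunk_windows` is a hypothetical helper, not the code in `eval.py` or in any library.

```python
from typing import List, Tuple

# (window_start, window_end, keep_start, keep_end), all in seconds
Window = Tuple[float, float, float, float]


def chunk_windows(
    audio_length_s: float,
    chunk_length_s: float = 5.0,
    stride_length_s: float = 1.0,
) -> List[Window]:
    """Lay out overlapping windows and the region of each one that is kept."""
    if audio_length_s <= 0:
        raise ValueError("audio_length_s must be positive")
    # Each new window advances by the chunk length minus both overlap margins.
    step = chunk_length_s - 2 * stride_length_s
    if step <= 0:
        raise ValueError("chunk_length_s must exceed twice stride_length_s")

    windows: List[Window] = []
    start = 0.0
    while True:
        end = min(start + chunk_length_s, audio_length_s)
        is_first = start == 0.0
        is_last = end >= audio_length_s
        # The first window has no left margin; the last has no right margin.
        keep_start = start if is_first else start + stride_length_s
        keep_end = end if is_last else end - stride_length_s
        windows.append((start, end, keep_start, keep_end))
        if is_last:
            break
        start += step
    return windows


if __name__ == "__main__":
    for window in chunk_windows(12.0):
        print(window)
```

With these settings each interior window contributes 3 seconds of output, and the kept regions tile the recording without gaps or overlap.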

## Framework versions

- Transformers 4.22.0.dev0
- PyTorch 1.12.0+cu113
- Datasets 2.4.0
- Tokenizers 0.12.1