Wav2Vec2-XLS-TR

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the Common Voice 17 dataset. At the final evaluation (step 43500), it achieves the following result on the evaluation set:

  • Loss: 0.5348

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 30
  • mixed_precision_training: Native AMP
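
To make the scheduler settings concrete, here is a minimal sketch of a linear schedule with warmup, written in plain Python rather than taken from the training script. It assumes the usual shape of such a schedule: a linear ramp from 0 to the peak rate over the warmup steps, then a linear decay to 0. The total of 43,530 steps is an assumption inferred from the step/epoch ratio in the results table (about 1,451 steps per epoch over 30 epochs), and `linear_lr` is an illustrative name.

```python
import math

PEAK_LR = 1e-05        # learning_rate from the list above
WARMUP_RATIO = 0.1     # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 43_530   # ASSUMPTION: ~1451 steps/epoch * 30 epochs, inferred from the results table
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Warmup: ramp linearly from 0 up to the peak rate.
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    # Decay: fall linearly from the peak rate to 0 at the final step.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


for step in (0, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(f"step {step:>6}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate reaches its peak of 1e-05 at roughly epoch 3 and then declines for the remaining 27 epochs.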

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 21.9538 | 0.3446 | 500 | 12.1320 |
| 7.2139 | 0.6892 | 1000 | 5.5372 |
| 4.9229 | 1.0338 | 1500 | 4.4389 |
| 4.0539 | 1.3784 | 2000 | 3.7210 |
| 3.5019 | 1.7229 | 2500 | 3.3480 |
| 3.2272 | 2.0675 | 3000 | 3.1302 |
| 2.9163 | 2.4121 | 3500 | 2.5875 |
| 1.902 | 2.7567 | 4000 | 1.5471 |
| 1.2034 | 3.1013 | 4500 | 1.1319 |
| 0.9389 | 3.4459 | 5000 | 1.0412 |
| 0.7599 | 3.7905 | 5500 | 0.8486 |
| 0.6418 | 4.1351 | 6000 | 0.7179 |
| 0.569 | 4.4797 | 6500 | 0.7109 |
| 0.5248 | 4.8243 | 7000 | 0.6470 |
| 0.4752 | 5.1688 | 7500 | 0.6298 |
| 0.4461 | 5.5134 | 8000 | 0.6198 |
| 0.4187 | 5.8580 | 8500 | 0.6224 |
| 0.3935 | 6.2026 | 9000 | 0.6116 |
| 0.3756 | 6.5472 | 9500 | 0.5536 |
| 0.3597 | 6.8918 | 10000 | 0.5263 |
| 0.3483 | 7.2364 | 10500 | 0.5179 |
| 0.3283 | 7.5810 | 11000 | 0.5054 |
| 0.3204 | 7.9256 | 11500 | 0.5200 |
| 0.3031 | 8.2702 | 12000 | 0.4984 |
| 0.2986 | 8.6147 | 12500 | 0.4846 |
| 0.2936 | 8.9593 | 13000 | 0.4984 |
| 0.2789 | 9.3039 | 13500 | 0.4888 |
| 0.2724 | 9.6485 | 14000 | 0.4654 |
| 0.2718 | 9.9931 | 14500 | 0.4553 |
| 0.2533 | 10.3377 | 15000 | 0.4506 |
| 0.2498 | 10.6823 | 15500 | 0.4983 |
| 0.2501 | 11.0269 | 16000 | 0.4835 |
| 0.2384 | 11.3715 | 16500 | 0.4749 |
| 0.2371 | 11.7161 | 17000 | 0.4877 |
| 0.2319 | 12.0606 | 17500 | 0.4806 |
| 0.2235 | 12.4052 | 18000 | 0.4874 |
| 0.2218 | 12.7498 | 18500 | 0.4565 |
| 0.2193 | 13.0944 | 19000 | 0.4733 |
| 0.2154 | 13.4390 | 19500 | 0.4747 |
| 0.2157 | 13.7836 | 20000 | 0.4694 |
| 0.2067 | 14.1282 | 20500 | 0.4848 |
| 0.2044 | 14.4728 | 21000 | 0.5092 |
| 0.2023 | 14.8174 | 21500 | 0.4752 |
| 0.1975 | 15.1620 | 22000 | 0.4852 |
| 0.1903 | 15.5065 | 22500 | 0.4891 |
| 0.1922 | 15.8511 | 23000 | 0.4825 |
| 0.1901 | 16.1957 | 23500 | 0.4836 |
| 0.1862 | 16.5403 | 24000 | 0.4838 |
| 0.1842 | 16.8849 | 24500 | 0.4897 |
| 0.1837 | 17.2295 | 25000 | 0.4920 |
| 0.1782 | 17.5741 | 25500 | 0.4937 |
| 0.1775 | 17.9187 | 26000 | 0.4868 |
| 0.1738 | 18.2633 | 26500 | 0.5107 |
| 0.1739 | 18.6079 | 27000 | 0.4943 |
| 0.1722 | 18.9524 | 27500 | 0.4740 |
| 0.1691 | 19.2970 | 28000 | 0.4965 |
| 0.1684 | 19.6416 | 28500 | 0.4834 |
| 0.1646 | 19.9862 | 29000 | 0.5187 |
| 0.1637 | 20.3308 | 29500 | 0.4979 |
| 0.1625 | 20.6754 | 30000 | 0.4945 |
| 0.1608 | 21.0200 | 30500 | 0.5149 |
| 0.1567 | 21.3646 | 31000 | 0.5030 |
| 0.1571 | 21.7092 | 31500 | 0.5013 |
| 0.1589 | 22.0538 | 32000 | 0.5269 |
| 0.1524 | 22.3983 | 32500 | 0.5191 |
| 0.1491 | 22.7429 | 33000 | 0.5200 |
| 0.1521 | 23.0875 | 33500 | 0.5206 |
| 0.149 | 23.4321 | 34000 | 0.5214 |
| 0.1505 | 23.7767 | 34500 | 0.5255 |
| 0.1516 | 24.1213 | 35000 | 0.5168 |
| 0.1479 | 24.4659 | 35500 | 0.5431 |
| 0.1476 | 24.8105 | 36000 | 0.5352 |
| 0.1433 | 25.1551 | 36500 | 0.5281 |
| 0.1464 | 25.4997 | 37000 | 0.5207 |
| 0.1444 | 25.8442 | 37500 | 0.5254 |
| 0.1466 | 26.1888 | 38000 | 0.5163 |
| 0.1424 | 26.5334 | 38500 | 0.5200 |
| 0.1375 | 26.8780 | 39000 | 0.5188 |
| 0.1428 | 27.2226 | 39500 | 0.5315 |
| 0.1376 | 27.5672 | 40000 | 0.5323 |
| 0.139 | 27.9118 | 40500 | 0.5370 |
| 0.1437 | 28.2564 | 41000 | 0.5426 |
| 0.138 | 28.6010 | 41500 | 0.5263 |
| 0.1382 | 28.9456 | 42000 | 0.5286 |
| 0.139 | 29.2901 | 42500 | 0.5302 |
| 0.139 | 29.6347 | 43000 | 0.5334 |
| 0.1356 | 29.9793 | 43500 | 0.5348 |
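
In the table above, training loss keeps falling through epoch 30 while validation loss reaches its minimum (0.4506 at step 15000) and then drifts upward. The sketch below shows one way to find that minimum programmatically. It uses only a handful of rows copied from the table to stay short; the tuple layout and variable names are illustrative choices, not part of any training script.

```python
# (training_loss, epoch, step, validation_loss) rows copied from the table above.
# Only a subset is included to keep the sketch short.
ROWS = [
    (0.9389, 3.4459, 5000, 1.0412),
    (0.3597, 6.8918, 10000, 0.5263),
    (0.2533, 10.3377, 15000, 0.4506),
    (0.2157, 13.7836, 20000, 0.4694),
    (0.1625, 20.6754, 30000, 0.4945),
    (0.1356, 29.9793, 43500, 0.5348),
]

# Pick the row with the smallest validation loss (index 3 of each tuple).
best = min(ROWS, key=lambda row: row[3])
final = ROWS[-1]

print(f"lowest validation loss: {best[3]:.4f} at step {best[2]} (epoch {best[1]:.2f})")
print(f"final validation loss:  {final[3]:.4f} at step {final[2]}")
print(f"gap between final and best: {final[3] - best[3]:.4f}")
```

If intermediate checkpoints were saved (the card does not say), one near step 15000 may generalize better than the final weights. The card reports loss only, so whether this difference carries over to word error rate is unknown.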

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.1+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
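
When trying to reproduce the run, it can help to compare the local environment against the versions listed above. The following is a minimal sketch using the standard-library `importlib.metadata`. The helper `find_mismatches` is hypothetical, and the exact-match comparison is a deliberate simplification.

```python
from importlib import metadata

# Versions listed above; "torch" is the PyPI distribution name for PyTorch.
EXPECTED = {
    "transformers": "4.41.2",
    "torch": "2.3.1+cu121",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def find_mismatches(expected, lookup=metadata.version):
    """Return {package: (expected, installed_or_None)} for each package that differs."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = lookup(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


for package, (wanted, installed) in find_mismatches(EXPECTED).items():
    print(f"{package}: card lists {wanted}, environment has {installed}")
```

Exact matching will flag a PyTorch build with a different local suffix (for example a CPU-only build), which may or may not matter for a given use.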

Model size

  • Parameters: 315M
  • Tensor type: F32
  • Format: Safetensors