metadata
license: apache-2.0
tags:
  - automatic-speech-recognition
  - google/xtreme_s
  - generated_from_trainer
datasets:
  - google/xtreme_s
model-index:
  - name: xtreme_s_xlsr_mls
    results: []

xtreme_s_xlsr_300m_mls

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the MLS subset of the google/xtreme_s dataset. It achieves the following results on the evaluation set (these values match the step-25000 row of the training results):

  • Loss: 0.6215
  • Wer: 0.3033
  • Cer: 0.0951
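For readers unfamiliar with the two metrics, the sketch below shows how word error rate (WER) and character error rate (CER) are commonly defined: the edit distance between reference and hypothesis, divided by the reference length in words or characters. It is an illustration only, not the evaluation code used for this model; the function names and the example sentences are made up for this sketch, and corpus-level scores are usually aggregated over all utterances rather than computed per sentence.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from the reference
                            curr[j - 1] + 1,      # insertion into the reference
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: char-level edit distance / number of reference chars."""
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example pair, not taken from the evaluation set.
reference = "the cat sat on the mat"
hypothesis = "the cat sit on mat"
print(f"WER: {wer(reference, hypothesis):.4f}")
print(f"CER: {cer(reference, hypothesis):.4f}")
```

Under these definitions, a WER of 0.3033 means roughly 30 word-level edits per 100 reference words, while the much lower CER indicates that many of the word errors differ from the reference by only a few characters.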

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 3000
  • num_epochs: 100.0
  • mixed_precision_training: Native AMP
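The totals in the list above follow from the per-device values, and the learning rate follows a linear schedule with warmup. The sketch below reproduces the arithmetic and a linear warmup/decay curve under stated assumptions: `total_steps=26200` is not given in this card and is extrapolated from the training results (step 26000 at epoch 99.24, about 262 steps per epoch over 100 epochs), and `linear_schedule_lr` is a hypothetical helper written for this illustration rather than the trainer's own code.

```python
# Values copied from the hyperparameter list.
train_batch_size = 4             # per device
eval_batch_size = 1              # per device
num_devices = 8
gradient_accumulation_steps = 2
peak_lr = 3e-4
warmup_steps = 3000

# Effective batch sizes: per-device size x devices (x accumulation for training).
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices


def linear_schedule_lr(step, peak=peak_lr, warmup=warmup_steps, total_steps=26200):
    """Linear warmup from 0 to `peak`, then linear decay to 0 at `total_steps`.

    `total_steps` is an assumption extrapolated from the results table.
    """
    if step < warmup:
        return peak * step / warmup
    return peak * max(0, total_steps - step) / (total_steps - warmup)


print(total_train_batch_size, total_eval_batch_size)
for step in (0, 1500, 3000, 14600, 26200):
    print(step, linear_schedule_lr(step))
```

With these settings the learning rate peaks at step 3000, which is around epoch 11 of the run, so the first several evaluation points in the training results fall inside the warmup phase.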

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer | Cer |
|---|---|---|---|---|---|
| 3.0446 | 1.91 | 500 | 2.9866 | 1.0 | 1.0 |
| 0.8789 | 3.82 | 1000 | 0.8574 | 0.7225 | 0.2355 |
| 0.4766 | 5.72 | 1500 | 0.4813 | 0.4624 | 0.1394 |
| 0.3779 | 7.63 | 2000 | 0.4465 | 0.4154 | 0.1309 |
| 0.3244 | 9.54 | 2500 | 0.4213 | 0.3683 | 0.1163 |
| 0.346 | 11.45 | 3000 | 0.4606 | 0.4033 | 0.1299 |
| 0.3092 | 13.36 | 3500 | 0.4160 | 0.3585 | 0.1115 |
| 0.3287 | 15.27 | 4000 | 0.4364 | 0.3631 | 0.1165 |
| 0.3165 | 17.18 | 4500 | 0.4218 | 0.3451 | 0.1056 |
| 0.2874 | 19.08 | 5000 | 0.4583 | 0.3650 | 0.1151 |
| 0.3089 | 20.99 | 5500 | 0.4424 | 0.3485 | 0.1137 |
| 0.2689 | 22.9 | 6000 | 0.4427 | 0.3542 | 0.1128 |
| 0.234 | 24.81 | 6500 | 0.4204 | 0.3431 | 0.1069 |
| 0.2363 | 26.72 | 7000 | 0.4792 | 0.3689 | 0.1191 |
| 0.2796 | 28.62 | 7500 | 0.4867 | 0.3662 | 0.1154 |
| 0.2447 | 30.53 | 8000 | 0.4908 | 0.3584 | 0.1160 |
| 0.22 | 32.44 | 8500 | 0.5315 | 0.3626 | 0.1240 |
| 0.1961 | 34.35 | 9000 | 0.5121 | 0.3610 | 0.1168 |
| 0.1959 | 36.26 | 9500 | 0.5140 | 0.3648 | 0.1179 |
| 0.1748 | 38.17 | 10000 | 0.5464 | 0.3763 | 0.1206 |
| 0.197 | 40.08 | 10500 | 0.5199 | 0.3515 | 0.1128 |
| 0.2166 | 41.98 | 11000 | 0.5336 | 0.3607 | 0.1191 |
| 0.2078 | 43.89 | 11500 | 0.5389 | 0.3518 | 0.1136 |
| 0.1827 | 45.8 | 12000 | 0.5014 | 0.3287 | 0.1053 |
| 0.1783 | 47.71 | 12500 | 0.5408 | 0.3545 | 0.1121 |
| 0.1489 | 49.62 | 13000 | 0.5292 | 0.3472 | 0.1098 |
| 0.1665 | 51.53 | 13500 | 0.5052 | 0.3300 | 0.1033 |
| 0.1631 | 53.43 | 14000 | 0.5241 | 0.3362 | 0.1081 |
| 0.1943 | 55.34 | 14500 | 0.5453 | 0.3373 | 0.1076 |
| 0.1504 | 57.25 | 15000 | 0.5958 | 0.3594 | 0.1149 |
| 0.136 | 59.16 | 15500 | 0.5645 | 0.3367 | 0.1082 |
| 0.1224 | 61.07 | 16000 | 0.5322 | 0.3302 | 0.1039 |
| 0.1156 | 62.98 | 16500 | 0.5728 | 0.3332 | 0.1061 |
| 0.114 | 64.88 | 17000 | 0.5994 | 0.3410 | 0.1125 |
| 0.1445 | 66.79 | 17500 | 0.6048 | 0.3471 | 0.1098 |
| 0.1281 | 68.7 | 18000 | 0.5747 | 0.3278 | 0.1042 |
| 0.1233 | 70.61 | 18500 | 0.6021 | 0.3375 | 0.1082 |
| 0.1109 | 72.52 | 19000 | 0.5851 | 0.3188 | 0.1021 |
| 0.0943 | 74.43 | 19500 | 0.5944 | 0.3238 | 0.1033 |
| 0.1418 | 76.34 | 20000 | 0.5904 | 0.3143 | 0.0997 |
| 0.1317 | 78.24 | 20500 | 0.6291 | 0.3283 | 0.1047 |
| 0.1177 | 80.15 | 21000 | 0.6114 | 0.3190 | 0.1000 |
| 0.1138 | 82.06 | 21500 | 0.6155 | 0.3245 | 0.1023 |
| 0.1074 | 83.97 | 22000 | 0.6094 | 0.3153 | 0.1004 |
| 0.11 | 85.88 | 22500 | 0.6041 | 0.3141 | 0.0988 |
| 0.1096 | 87.78 | 23000 | 0.6243 | 0.3110 | 0.0986 |
| 0.1017 | 89.69 | 23500 | 0.6110 | 0.3121 | 0.0984 |
| 0.1015 | 91.6 | 24000 | 0.6385 | 0.3093 | 0.0978 |
| 0.0952 | 93.51 | 24500 | 0.6155 | 0.3036 | 0.0953 |
| 0.0896 | 95.42 | 25000 | 0.6215 | 0.3033 | 0.0951 |
| 0.0953 | 97.33 | 25500 | 0.6293 | 0.3037 | 0.0953 |
| 0.0834 | 99.24 | 26000 | 0.6302 | 0.3036 | 0.0952 |
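The results above show validation loss and error rates moving in different directions: in the full table, validation loss is lowest at step 3500 (0.4160), while WER and CER are lowest at step 25000. The sketch below illustrates how picking a checkpoint depends on the chosen metric, using a handful of rows copied from the table. The tuple layout and variable names are choices made for this illustration, and whether the training run itself selected a checkpoint this way is not stated in this card.

```python
# A subset of rows copied from the training results:
# (step, epoch, validation_loss, wer, cer)
rows = [
    (500, 1.91, 2.9866, 1.0, 1.0),
    (3500, 13.36, 0.4160, 0.3585, 0.1115),
    (12000, 45.8, 0.5014, 0.3287, 0.1053),
    (24500, 93.51, 0.6155, 0.3036, 0.0953),
    (25000, 95.42, 0.6215, 0.3033, 0.0951),
    (26000, 99.24, 0.6302, 0.3036, 0.0952),
]

# The "best" checkpoint differs depending on the metric being minimized.
best_by_loss = min(rows, key=lambda r: r[2])
best_by_wer = min(rows, key=lambda r: r[3])

# Rough steps-per-epoch estimate from the last logged row.
steps_per_epoch = rows[-1][0] / rows[-1][1]

print("lowest validation loss at step", best_by_loss[0])
print("lowest WER at step", best_by_wer[0])
print("approx. steps per epoch:", round(steps_per_epoch))
```

A rising validation loss alongside a falling WER is a pattern sometimes seen with CTC fine-tuning, so the task metric is often a more useful selection criterion than the loss alone.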

Framework versions

  • Transformers 4.18.0.dev0
  • PyTorch 1.11.0+cu113
  • Datasets 1.18.4.dev0
  • Tokenizers 0.11.6