wav2vec2-Irish-common-voice-Fleurs-living-audio-300m

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on three Irish speech datasets: GOOGLE/FLEURS (GA-IE), Common Voice Irish (the validated split minus the test split), and the Living Audio Irish speech dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3361
  • Wer: 0.1963
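
Wer is the word error rate: the word-level edit distance (substitutions, deletions and insertions) between the model's transcript and the reference, divided by the number of reference words. The card does not show the evaluation script, so the following is only a minimal pure-Python sketch of the metric; the function name `word_error_rate` and the example sentences are illustrative and do not come from the evaluation data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words; one row is kept at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Illustrative strings only: one substituted word out of four.
example = word_error_rate("dia duit a chara", "dia dhuit a chara")
```

Read this way, a Wer of 0.1963 corresponds to roughly one word error for every five reference words.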

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 3
  • total_train_batch_size: 24
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 18.0
  • mixed_precision_training: Native AMP
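
Two of these values follow from the others. Assuming a single device, total_train_batch_size is train_batch_size × gradient_accumulation_steps = 8 × 3 = 24. With lr_scheduler_type linear and 500 warmup steps, the learning rate ramps from 0 to 0.0003 and then decays linearly to 0 at the last step. The sketch below reproduces that shape in plain Python. The total step count is not stated in the card; `total_steps=6460` is an assumption extrapolated from the training log (step 6400 at epoch 17.83 of 18), and the function name is illustrative.

```python
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 3

# Samples consumed per optimizer update (single-device assumption).
effective_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS


def linear_schedule_lr(step, base_lr=3e-4, warmup_steps=500, total_steps=6460):
    """Learning rate at a given optimizer step under linear warmup and decay.

    total_steps is an estimate, not a value reported in the card.
    """
    if step < warmup_steps:
        # Warmup: rise linearly from 0 to base_lr.
        return base_lr * step / warmup_steps
    # Decay: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)
```

For example, `linear_schedule_lr(250)` is halfway through warmup and returns half of the peak rate.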

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer |
|---|---|---|---|---|
| No log | 0.56 | 200 | 2.8832 | 1.0 |
| No log | 1.11 | 400 | 1.1705 | 0.7788 |
| 3.3987 | 1.67 | 600 | 0.7739 | 0.5895 |
| 3.3987 | 2.23 | 800 | 0.6045 | 0.4902 |
| 0.8313 | 2.78 | 1000 | 0.5235 | 0.4394 |
| 0.8313 | 3.34 | 1200 | 0.4824 | 0.4002 |
| 0.8313 | 3.9 | 1400 | 0.4378 | 0.3754 |
| 0.5342 | 4.46 | 1600 | 0.4433 | 0.3634 |
| 0.5342 | 5.01 | 1800 | 0.4103 | 0.3485 |
| 0.3792 | 5.57 | 2000 | 0.3816 | 0.3310 |
| 0.3792 | 6.13 | 2200 | 0.3953 | 0.3225 |
| 0.3792 | 6.68 | 2400 | 0.3995 | 0.3132 |
| 0.2924 | 7.24 | 2600 | 0.3907 | 0.2930 |
| 0.2924 | 7.8 | 2800 | 0.3517 | 0.2740 |
| 0.2217 | 8.36 | 3000 | 0.3361 | 0.2591 |
| 0.2217 | 8.91 | 3200 | 0.3340 | 0.2451 |
| 0.2217 | 9.47 | 3400 | 0.3126 | 0.2448 |
| 0.1714 | 10.03 | 3600 | 0.3441 | 0.2556 |
| 0.1714 | 10.58 | 3800 | 0.3404 | 0.2521 |
| 0.1395 | 11.14 | 4000 | 0.3728 | 0.2518 |
| 0.1395 | 11.7 | 4200 | 0.3829 | 0.2396 |
| 0.1395 | 12.26 | 4400 | 0.3466 | 0.2361 |
| 0.1069 | 12.81 | 4600 | 0.3188 | 0.2241 |
| 0.1069 | 13.37 | 4800 | 0.3396 | 0.2197 |
| 0.0845 | 13.93 | 5000 | 0.3365 | 0.2206 |
| 0.0845 | 14.48 | 5200 | 0.3459 | 0.2209 |
| 0.0845 | 15.04 | 5400 | 0.3429 | 0.2194 |
| 0.0675 | 15.6 | 5600 | 0.3434 | 0.2182 |
| 0.0675 | 16.16 | 5800 | 0.3434 | 0.2083 |
| 0.0561 | 16.71 | 6000 | 0.3375 | 0.2036 |
| 0.0561 | 17.27 | 6200 | 0.3446 | 0.1987 |
| 0.0561 | 17.83 | 6400 | 0.3362 | 0.1978 |
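
In this log the validation loss and the Wer do not reach their minima at the same checkpoint. The sketch below copies the Step, Validation Loss and Wer columns from the table above and locates both minima; the variable names are illustrative.

```python
# (step, validation_loss, wer) copied from the training results table.
rows = [
    (200, 2.8832, 1.0), (400, 1.1705, 0.7788), (600, 0.7739, 0.5895), (800, 0.6045, 0.4902),
    (1000, 0.5235, 0.4394), (1200, 0.4824, 0.4002), (1400, 0.4378, 0.3754), (1600, 0.4433, 0.3634),
    (1800, 0.4103, 0.3485), (2000, 0.3816, 0.3310), (2200, 0.3953, 0.3225), (2400, 0.3995, 0.3132),
    (2600, 0.3907, 0.2930), (2800, 0.3517, 0.2740), (3000, 0.3361, 0.2591), (3200, 0.3340, 0.2451),
    (3400, 0.3126, 0.2448), (3600, 0.3441, 0.2556), (3800, 0.3404, 0.2521), (4000, 0.3728, 0.2518),
    (4200, 0.3829, 0.2396), (4400, 0.3466, 0.2361), (4600, 0.3188, 0.2241), (4800, 0.3396, 0.2197),
    (5000, 0.3365, 0.2206), (5200, 0.3459, 0.2209), (5400, 0.3429, 0.2194), (5600, 0.3434, 0.2182),
    (5800, 0.3434, 0.2083), (6000, 0.3375, 0.2036), (6200, 0.3446, 0.1987), (6400, 0.3362, 0.1978),
]

# Checkpoint with the lowest validation loss.
best_loss = min(rows, key=lambda row: row[1])
# Checkpoint with the lowest word error rate.
best_wer = min(rows, key=lambda row: row[2])

print("lowest validation loss:", best_loss)
print("lowest Wer:", best_wer)
```

Validation loss is lowest at step 3400 (0.3126), while Wer keeps falling until the last logged step, 6400 (0.1978).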

Framework versions

  • Transformers 4.24.0
  • Pytorch 1.13.0+cu117
  • Datasets 2.7.1
  • Tokenizers 0.13.2
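
The card does not include a usage snippet. The following is a minimal sketch, assuming the checkpoint is available on the Hugging Face Hub under the repository ID shown on this page and loads through the standard Transformers `automatic-speech-recognition` pipeline. It has not been run against the checkpoint; `transcribe` and `audio_path` are illustrative names. Whether an n-gram language model is applied during decoding depends on what the repository ships and on optional packages being installed, which is not verified here.

```python
# Repository ID as it appears on the model page (assumed to be loadable).
MODEL_ID = "Aditya3107/wav2vec2-Irish-common-voice-Fleurs-living-audio-300m"


def transcribe(audio_path: str) -> str:
    """Return the transcript of one audio file using the fine-tuned checkpoint."""
    # Imported lazily so that defining this function does not require
    # transformers or torch to be installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    # Given a file path, the pipeline decodes the audio with ffmpeg and
    # resamples it to the rate expected by the model's feature extractor.
    return asr(audio_path)["text"]
```

Calling `transcribe("clip.wav")` downloads the model on first use, so it needs network access and an ffmpeg installation; the file name is a placeholder.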

Datasets used to train Aditya3107/wav2vec2-Irish-common-voice-Fleurs-living-audio-300m

Evaluation results

  • Test WER (Without LM) on Common Voice 10.0: 19.980 (self-reported)
  • Test WER (With LM) on Common Voice 10.0: 13.870 (self-reported)
  • Common Voice Irish Invalidated 281 utterances (Without LM) on Common Voice 10.0: 39.190 (self-reported)
  • Common Voice Irish Invalidated 281 utterances (With LM) on Common Voice 10.0: 30.850 (self-reported)
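
To compare the effect of the language model across the two evaluation sets, the relative WER reduction can be computed from the figures above. This is a small sketch that assumes the reported values are percentages (19.980 meaning 19.98% WER); the function name is illustrative.

```python
def relative_wer_reduction(wer_without_lm: float, wer_with_lm: float) -> float:
    """Fraction of the no-LM word error rate removed by adding the LM."""
    return (wer_without_lm - wer_with_lm) / wer_without_lm


# Self-reported figures from the evaluation results.
test_gain = relative_wer_reduction(19.98, 13.87)
invalidated_gain = relative_wer_reduction(39.19, 30.85)

print(f"test set: {test_gain:.1%}")
print(f"invalidated utterances: {invalidated_gain:.1%}")
```

On these numbers the language model removes about 31% of the errors on the test set and about 21% on the 281 invalidated utterances.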