
wav2vec2-base-finetuned-spgispeech-dev

This model is a fine-tuned version of facebook/wav2vec2-base on the dev subset of the kensho/spgispeech dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2897
  • WER: 0.1508
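
A minimal usage sketch (not part of the original card): transcribing a local audio file with the transformers automatic-speech-recognition pipeline. The model ID comes from this card; the file name "example.wav" is a placeholder.

```python
from transformers import pipeline

# Load this checkpoint through the high-level ASR pipeline.
asr = pipeline(
    "automatic-speech-recognition",
    model="nickmuchi/wav2vec2-base-finetuned-spgispeech-dev",
)

# wav2vec2-base operates on 16 kHz mono audio; for file input the pipeline
# decodes and resamples it (ffmpeg must be available on the system).
result = asr("example.wav")  # placeholder file, returns {"text": "..."}
print(result["text"])
```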

Model description

This checkpoint adds a CTC head to facebook/wav2vec2-base, a self-supervised speech representation model, and fine-tunes it for English automatic speech recognition on SPGISpeech, a corpus of professionally transcribed financial earnings-call audio.

Intended uses & limitations

More information needed

Training and evaluation data

The model was fine-tuned and evaluated on the dev subset of the kensho/spgispeech dataset; the loss and WER reported above are measured on its evaluation split.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 50
  • mixed_precision_training: Native AMP
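
For reference, a hedged sketch (not from the original card) of how these settings map onto transformers TrainingArguments; the output_dir name is an assumption, and Adam's betas/epsilon match the library defaults, so they need no explicit arguments:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="wav2vec2-base-finetuned-spgispeech-dev",  # assumed directory name
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=1000,
    num_train_epochs=50,
    fp16=True,  # corresponds to "Native AMP" mixed-precision training
)
```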

Training results

Training Loss   Epoch   Step    Validation Loss   WER
1.8285           2.22    1500   0.3361            0.2754
0.2582           4.44    3000   0.2643            0.2205
0.1697           6.66    4500   0.2467            0.2006
0.1314           8.88    6000   0.2711            0.1927
0.1084          11.09    7500   0.2521            0.1872
0.0922          13.31    9000   0.2588            0.1827
0.0818          15.53   10500   0.2572            0.1783
0.0712          17.75   12000   0.2720            0.1766
0.0670          19.97   13500   0.2873            0.1751
0.0594          22.19   15000   0.2753            0.1704
0.0546          24.41   16500   0.2794            0.1694
0.0505          26.63   18000   0.2811            0.1665
0.0467          28.85   19500   0.2906            0.1657
0.0417          31.07   21000   0.3043            0.1661
0.0395          33.28   22500   0.3068            0.1627
0.0368          35.50   24000   0.3096            0.1617
0.0334          37.72   25500   0.3036            0.1581
0.0322          39.94   27000   0.2819            0.1564
0.0286          42.16   28500   0.2936            0.1544
0.0279          44.38   30000   0.2914            0.1534
0.0264          46.60   31500   0.2957            0.1519
0.0241          48.82   33000   0.2897            0.1508
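
As a hedged sketch of how the final WER might be reproduced with the versions listed below: the kensho/spgispeech configuration name, split name, and "audio"/"transcript" column names are assumptions about the dataset layout, and access may require accepting the dataset's terms. datasets.load_metric is used because it still existed in Datasets 2.4.0.

```python
import torch
from datasets import load_dataset, load_metric
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

model_id = "nickmuchi/wav2vec2-base-finetuned-spgispeech-dev"
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id).eval()

# Config, split, and column names below are assumptions, not confirmed by the card.
ds = load_dataset("kensho/spgispeech", "dev", split="validation")
wer = load_metric("wer")

for sample in ds.select(range(100)):  # score a small subset to keep the sketch quick
    inputs = processor(
        sample["audio"]["array"], sampling_rate=16_000, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    prediction = processor.batch_decode(torch.argmax(logits, dim=-1))[0]
    wer.add(prediction=prediction, reference=sample["transcript"])

print(f"WER: {wer.compute():.4f}")
```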

Framework versions

  • Transformers 4.17.0
  • Pytorch 1.12.1+cu113
  • Datasets 2.4.0
  • Tokenizers 0.12.1