metadata
license: apache-2.0
tags:
  - automatic-speech-recognition
  - /workspace/data/hy/noizy_student_2/
  - generated_from_trainer
model-index:
  - name: ''
    results: []

This model is a fine-tuned version of facebook/wav2vec2-xls-r-1b on the dataset stored at the local path /workspace/data/hy/noizy_student_2/ (the trainer recorded no dataset configuration name). It achieves the following results on the evaluation set:

  • Loss: 0.2249
  • WER (word error rate): 0.2783
  • CER (character error rate): 0.0508
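
For readers unfamiliar with the two error rates, the sketch below shows how they are conventionally defined: the Levenshtein edit distance between reference and hypothesis, divided by the reference length, counted in words for WER and in characters for CER. This is an illustration only. The card does not say which metric implementation produced the numbers above, and the helper names and example sentences here are hypothetical. Evaluation scripts also usually aggregate edits over the whole evaluation set rather than averaging per-utterance scores.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate for one utterance: word-level edits / reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate for one utterance: char-level edits / reference chars."""
    return edit_distance(reference, hypothesis) / len(reference)


# Illustrative strings, not taken from the evaluation data.
reference = "the cat sat on the mat"
hypothesis = "the cat sit on mat"
print(f"WER = {wer(reference, hypothesis):.4f}")
print(f"CER = {cer(reference, hypothesis):.4f}")
```

A WER of 0.2783 therefore means roughly 28 word-level edits per 100 reference words, while the much lower CER of 0.0508 indicates that most of the erroneous words differ from the reference by only a few characters.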

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-05
  • train_batch_size: 16
  • eval_batch_size: 64
  • seed: 842
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • training_steps: 1600
  • mixed_precision_training: Native AMP
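
The relationship between several of these values can be checked with a short standalone sketch. It assumes the usual linear-warmup-then-cosine-decay shape (half a cosine cycle, decaying to zero at the final step) and a single training device. Neither assumption is stated in the card, although a single device is consistent with 16 × 8 = 128. The constant and function names are hypothetical.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 8e-05
TRAINING_STEPS = 1600
WARMUP_RATIO = 0.1
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 8

# Number of optimizer steps spent in linear warmup.
warmup_steps = math.ceil(TRAINING_STEPS * WARMUP_RATIO)

# Samples consumed per optimizer step, assuming one device.
samples_per_update = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS


def lr_at(step):
    """Learning rate at a given optimizer step (assumed schedule shape)."""
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / warmup_steps
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - warmup_steps) / (TRAINING_STEPS - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


print("warmup steps:", warmup_steps)
print("samples per optimizer step:", samples_per_update)
for step in (0, 80, 160, 880, 1600):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the learning rate peaks at 8e-05 after 160 steps and reaches half that value at step 880, the midpoint of the decay phase.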

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer    | Cer    |
|---------------|-------|------|-----------------|--------|--------|
| 4.9923        | 3.84  | 100  | 3.1562          | 1.0    | 1.0    |
| 2.1775        | 7.69  | 200  | 0.4334          | 0.5804 | 0.1122 |
| 1.3708        | 11.53 | 300  | 0.3106          | 0.4336 | 0.0797 |
| 1.2266        | 15.38 | 400  | 0.2675          | 0.3673 | 0.0673 |
| 1.093         | 19.23 | 500  | 0.2416          | 0.3501 | 0.0633 |
| 0.989         | 23.08 | 600  | 0.2320          | 0.3251 | 0.0611 |
| 0.9518        | 26.91 | 700  | 0.2413          | 0.3193 | 0.0584 |
| 0.9075        | 30.76 | 800  | 0.2354          | 0.3201 | 0.0593 |
| 0.878         | 34.61 | 900  | 0.2278          | 0.3126 | 0.0579 |
| 0.8563        | 38.46 | 1000 | 0.2327          | 0.2963 | 0.0548 |
| 0.8084        | 42.3  | 1100 | 0.2271          | 0.2923 | 0.0541 |
| 0.7845        | 46.15 | 1200 | 0.2333          | 0.2951 | 0.0537 |
| 0.7487        | 49.99 | 1300 | 0.2290          | 0.2888 | 0.0525 |
| 0.7182        | 53.84 | 1400 | 0.2341          | 0.2877 | 0.0535 |
| 0.7095        | 57.69 | 1500 | 0.2291          | 0.2818 | 0.0515 |
| 0.6953        | 61.53 | 1600 | 0.2249          | 0.2783 | 0.0508 |
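
As a quick sanity check on the log, the sketch below copies the evaluation columns into a list and looks up the best checkpoint for each metric. The variable names are hypothetical, and the training-loss and epoch columns are omitted for brevity.

```python
# (step, validation_loss, wer, cer), copied from the training log above.
rows = [
    (100, 3.1562, 1.0, 1.0),
    (200, 0.4334, 0.5804, 0.1122),
    (300, 0.3106, 0.4336, 0.0797),
    (400, 0.2675, 0.3673, 0.0673),
    (500, 0.2416, 0.3501, 0.0633),
    (600, 0.2320, 0.3251, 0.0611),
    (700, 0.2413, 0.3193, 0.0584),
    (800, 0.2354, 0.3201, 0.0593),
    (900, 0.2278, 0.3126, 0.0579),
    (1000, 0.2327, 0.2963, 0.0548),
    (1100, 0.2271, 0.2923, 0.0541),
    (1200, 0.2333, 0.2951, 0.0537),
    (1300, 0.2290, 0.2888, 0.0525),
    (1400, 0.2341, 0.2877, 0.0535),
    (1500, 0.2291, 0.2818, 0.0515),
    (1600, 0.2249, 0.2783, 0.0508),
]

# Step with the lowest value of each metric.
best_loss_step = min(rows, key=lambda r: r[1])[0]
best_wer_step = min(rows, key=lambda r: r[2])[0]
best_cer_step = min(rows, key=lambda r: r[3])[0]

# Relative WER reduction between the step-200 and step-1600 evaluations.
wer_at_200 = rows[1][2]
wer_at_1600 = rows[-1][2]
relative_wer_reduction = (wer_at_200 - wer_at_1600) / wer_at_200

print("best validation loss at step", best_loss_step)
print("best WER at step", best_wer_step)
print("best CER at step", best_cer_step)
print(f"relative WER reduction, step 200 to 1600: {relative_wer_reduction:.1%}")
```

All three metrics reach their minimum at the final evaluation (step 1600), which matches the headline results reported at the top of the card. Validation loss is nearly flat after step 600, while WER and CER keep improving slowly through the end of training.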

Framework versions

  • Transformers 4.17.0.dev0
  • Pytorch 1.10.2+cu102
  • Datasets 1.18.2.dev0
  • Tokenizers 0.11.0