
WAVLM_TITML_IDN_model

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the audiofolder dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7585
  • Accuracy: 0.8181
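
Since the card reports accuracy and the model is a fine-tuned classifier over an audiofolder dataset, the checkpoint is presumably used for audio classification. A minimal usage sketch with the transformers pipeline follows; the repo id and the audio file path are hypothetical placeholders, not taken from the card.

```python
from transformers import pipeline

# Minimal usage sketch, assuming the checkpoint is published on the Hugging Face
# Hub under a repo id like "<user>/WAVLM_TITML_IDN_model" (hypothetical) and that
# the task is audio classification (inferred from the accuracy metric).
classifier = pipeline(
    "audio-classification",
    model="<user>/WAVLM_TITML_IDN_model",  # hypothetical repo id
)

# "sample.wav" is a placeholder path to a local audio file.
predictions = classifier("sample.wav")
print(predictions)  # list of {"label": ..., "score": ...} dicts, highest score first
```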

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch reproducing them follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 30
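
These values map directly onto transformers.TrainingArguments. The sketch below is an assumption of how the run could be configured with the standard Trainer; only the values listed above come from the card, while the output directory and any model/dataset wiring are hypothetical.

```python
from transformers import TrainingArguments

# Sketch of TrainingArguments matching the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="wavlm-titml-idn",         # hypothetical output directory
    learning_rate=3e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=4,        # 32 * 4 = 128 total train batch size
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=30,
    adam_beta1=0.9,                       # Adam betas=(0.9, 0.999), epsilon=1e-08
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```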

Training results

Training Loss   Epoch   Step   Validation Loss   Accuracy
8.0217          0.98    31     7.7416            0.0472
5.1076          2.0     63     3.5170            0.0472
3.0131          2.98    94     2.9921            0.0876
3.0119          4.0     126    2.9580            0.0928
2.685           4.98    157    2.6591            0.0793
2.4513          6.0     189    2.3831            0.1257
2.4415          6.98    220    2.3518            0.1415
2.2998          8.0     252    2.2327            0.1864
2.1987          8.98    283    2.1297            0.1549
2.1206          10.0    315    2.0529            0.2118
2.0542          10.98   346    1.9592            0.2507
1.9693          12.0    378    1.8652            0.2792
1.8677          12.98   409    1.7811            0.3668
1.7369          14.0    441    1.7902            0.2493
1.6551          14.98   472    1.6558            0.3406
1.6176          16.0    504    1.5724            0.3585
1.5666          16.98   535    1.5822            0.4207
1.5103          18.0    567    1.5028            0.4379
1.4695          18.98   598    1.4276            0.4970
1.3016          20.0    630    1.3621            0.4798
1.2025          20.98   661    1.2016            0.5778
1.1211          22.0    693    1.2346            0.5644
1.0204          22.98   724    1.0743            0.6445
0.9365          24.0    756    1.0121            0.6759
0.8553          24.98   787    0.9246            0.7290
0.7698          26.0    819    0.8603            0.7612
0.7336          26.98   850    0.8072            0.7867
0.6965          28.0    882    0.7770            0.8009
0.6662          28.98   913    0.7640            0.8136
0.63            29.52   930    0.7585            0.8181

Framework versions

  • Transformers 4.34.1
  • Pytorch 2.1.0+cu118
  • Datasets 2.14.6
  • Tokenizers 0.14.1