xty_finetune

This model is a fine-tuned version of Akashpb13/Swahili_xlsr on the ml-superb-subset dataset. It achieves the following results on the evaluation set:

  • Loss: 2.0874
  • WER: 89.1263
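
A minimal inference sketch, assuming this checkpoint is a wav2vec2-style CTC model (its base, Akashpb13/Swahili_xlsr, is an XLSR/wav2vec2 checkpoint); the model path and audio file below are placeholders:

```python
# Minimal inference sketch, assuming a wav2vec2-style CTC checkpoint.
# "path/to/xty_finetune" and "sample.wav" are placeholders.
import torch
import librosa
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

model_id = "path/to/xty_finetune"  # placeholder: hub id or local directory
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)
model.eval()

# XLSR-family models expect 16 kHz mono audio.
speech, _ = librosa.load("sample.wav", sr=16_000, mono=True)
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding: argmax per frame, then collapse repeats and blanks.
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```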

Model description

This checkpoint is an automatic speech recognition (ASR) model: a wav2vec2 (XLSR) encoder with a CTC head, fine-tuned from Akashpb13/Swahili_xlsr on the ml-superb-subset dataset and evaluated by word error rate (WER).

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 9.6e-05
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 25
  • training_steps: 500
  • mixed_precision_training: Native AMP
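
As a rough sketch, these hyperparameters correspond to the following transformers TrainingArguments; dataset preparation, the CTC data collator, the model, and metric computation are omitted here:

```python
# Sketch only: the hyperparameters above expressed as TrainingArguments.
# Assumes a single device, so 32 * 2 gradient-accumulation steps gives the
# total train batch size of 64. The Adam betas/epsilon listed above are the
# optimizer defaults, so they need no explicit setting.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="xty_finetune",       # placeholder output directory
    learning_rate=9.6e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,   # effective train batch size: 64
    lr_scheduler_type="cosine",
    warmup_steps=25,
    max_steps=500,
    fp16=True,                       # "Native AMP" mixed precision
    eval_strategy="steps",
    eval_steps=10,                   # the results below log every 10 steps
)
```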

Training results

| Training Loss | Epoch   | Step | Validation Loss | WER (%)  |
|:-------------:|:-------:|:----:|:---------------:|:--------:|
| 8.5823        | 1.0526  | 10   | 7.7265          | 100.0629 |
| 5.0617        | 2.1053  | 20   | 4.2231          | 100.0    |
| 3.6512        | 3.1579  | 30   | 3.5631          | 100.0    |
| 3.3211        | 4.2105  | 40   | 3.3414          | 100.0    |
| 3.2144        | 5.2632  | 50   | 3.2086          | 100.0    |
| 3.128         | 6.3158  | 60   | 3.1724          | 100.0    |
| 3.0999        | 7.3684  | 70   | 3.1261          | 100.0    |
| 3.0585        | 8.4211  | 80   | 3.1200          | 100.0    |
| 3.0318        | 9.4737  | 90   | 3.1001          | 100.0    |
| 3.0166        | 10.5263 | 100  | 3.0985          | 100.0    |
| 3.0147        | 11.5789 | 110  | 3.0971          | 100.0    |
| 3.0028        | 12.6316 | 120  | 3.0775          | 100.0    |
| 2.991         | 13.6842 | 130  | 3.0619          | 100.0    |
| 2.9692        | 14.7368 | 140  | 3.0477          | 100.0    |
| 2.9355        | 15.7895 | 150  | 3.0081          | 100.0    |
| 2.8754        | 16.8421 | 160  | 2.9190          | 100.0    |
| 2.7087        | 17.8947 | 170  | 2.7367          | 100.0    |
| 2.4346        | 18.9474 | 180  | 2.5043          | 108.5481 |
| 2.3184        | 20.0    | 190  | 2.3709          | 103.8969 |
| 2.0887        | 21.0526 | 200  | 2.2196          | 103.3941 |
| 1.9198        | 22.1053 | 210  | 2.1078          | 104.7140 |
| 1.6995        | 23.1579 | 220  | 2.0556          | 98.4287  |
| 1.6576        | 24.2105 | 230  | 2.0081          | 100.6914 |
| 1.4855        | 25.2632 | 240  | 1.9958          | 98.0515  |
| 1.3788        | 26.3158 | 250  | 1.9729          | 94.8460  |
| 1.3202        | 27.3684 | 260  | 1.9618          | 98.6172  |
| 1.2237        | 28.4211 | 270  | 1.9662          | 93.6518  |
| 1.1389        | 29.4737 | 280  | 1.9882          | 92.7090  |
| 1.0597        | 30.5263 | 290  | 1.9655          | 92.3947  |
| 1.0203        | 31.5789 | 300  | 1.9616          | 90.1948  |
| 0.9778        | 32.6316 | 310  | 1.9585          | 90.8234  |
| 0.9553        | 33.6842 | 320  | 1.9875          | 90.5091  |
| 0.895         | 34.7368 | 330  | 1.9913          | 91.3891  |
| 0.9021        | 35.7895 | 340  | 1.9906          | 90.2577  |
| 0.8105        | 36.8421 | 350  | 2.0182          | 89.4406  |
| 0.8052        | 37.8947 | 360  | 2.0227          | 89.5663  |
| 0.7484        | 38.9474 | 370  | 2.0539          | 89.0006  |
| 0.7886        | 40.0    | 380  | 2.0616          | 90.6977  |
| 0.7348        | 41.0526 | 390  | 2.0590          | 89.1892  |
| 0.7079        | 42.1053 | 400  | 2.0790          | 89.8806  |
| 0.7215        | 43.1579 | 410  | 2.0701          | 89.3149  |
| 0.6997        | 44.2105 | 420  | 2.0832          | 89.3777  |
| 0.721         | 45.2632 | 430  | 2.0798          | 89.3149  |
| 0.6609        | 46.3158 | 440  | 2.0834          | 88.4349  |
| 0.6562        | 47.3684 | 450  | 2.0892          | 89.0006  |
| 0.6418        | 48.4211 | 460  | 2.0878          | 89.3777  |
| 0.677         | 49.4737 | 470  | 2.0874          | 89.2520  |
| 0.6821        | 50.5263 | 480  | 2.0874          | 89.1263  |
| 0.6798        | 51.5789 | 490  | 2.0875          | 89.0635  |
| 0.7188        | 52.6316 | 500  | 2.0874          | 89.1263  |
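
WER is logged as a percentage, so the values above 100 in the early rows are possible: insertions can push the total edit count past the reference word count. A sketch of the metric computation, assuming the evaluate library:

```python
# Sketch: computing WER on the same percentage scale the table reports,
# assuming the `evaluate` library. Predictions and references are placeholders.
import evaluate

wer_metric = evaluate.load("wer")

predictions = ["a sample transcription"]   # placeholder model outputs
references = ["the sample transcription"]  # placeholder ground truth

wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")  # matches the percentage scale of the table
```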

Framework versions

  • Transformers 4.41.1
  • PyTorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1