aradia-ctc-distilhubert-ft

This model is a fine-tuned version of ntu-spml/distilhubert on the ABDUSAHMBZUAI/ARABIC_SPEECH_MASSIVE_SM - NA dataset. At the end of training (epoch 30, step 6900) it achieves the following results on the evaluation set:

  • Loss: 2.7114
  • WER (word error rate): 0.8908
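
For readers unfamiliar with the metric, word error rate is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal pure-Python illustration of that definition. The function name `word_error_rate` is hypothetical, and this is not necessarily the implementation that produced the numbers above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of three reference words.
    print(word_error_rate("a b c", "a x c"))
```

Under this definition a WER of 0.8908 means that roughly 89 word-level edits are needed per 100 reference words. Because insertions count as errors, the value can exceed 1.0.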

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 30.0
  • mixed_precision_training: Native AMP
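
Two of the values above are derived rather than set directly, and a short sketch may make the relationship clearer. It assumes a single training device (consistent with 32 × 2 = 64) and assumes the schedule length equals the last step in the training results (6900). The helper names are hypothetical, and the formula is the usual linear warmup followed by linear decay to zero, not code taken from the training run.

```python
LEARNING_RATE = 3e-4
WARMUP_STEPS = 500
TOTAL_STEPS = 6900  # assumption: last step reported in the training results


def effective_batch_size(per_device_batch: int, grad_accum_steps: int,
                         num_devices: int = 1) -> int:
    """Number of samples contributing to one optimizer update."""
    return per_device_batch * grad_accum_steps * num_devices


def linear_lr(step: int, base_lr: float = LEARNING_RATE,
              warmup: int = WARMUP_STEPS, total: int = TOTAL_STEPS) -> float:
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print(effective_batch_size(32, 2))  # matches total_train_batch_size
    for step in (0, 250, 500, 3700, 6900):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 0.0003 at step 500 and has fallen to half of that by step 3700.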

Training results

| Training Loss | Epoch | Step | Validation Loss | WER    |
|---------------|-------|------|-----------------|--------|
| No log        | 0.43  | 100  | 4.4129          | 1.0    |
| No log        | 0.87  | 200  | 3.5927          | 1.0    |
| No log        | 1.3   | 300  | 3.3780          | 1.0    |
| No log        | 1.74  | 400  | 3.0830          | 1.0    |
| 5.3551        | 2.17  | 500  | 2.6278          | 0.9999 |
| 5.3551        | 2.61  | 600  | 1.8359          | 1.0000 |
| 5.3551        | 3.04  | 700  | 1.7878          | 0.9914 |
| 5.3551        | 3.48  | 800  | 1.5219          | 0.9875 |
| 5.3551        | 3.91  | 900  | 1.4348          | 0.9879 |
| 1.7199        | 4.35  | 1000 | 1.4354          | 0.9644 |
| 1.7199        | 4.78  | 1100 | 1.5210          | 0.9519 |
| 1.7199        | 5.22  | 1200 | 1.3607          | 0.9475 |
| 1.7199        | 5.65  | 1300 | 1.3839          | 0.9343 |
| 1.7199        | 6.09  | 1400 | 1.2806          | 0.8944 |
| 1.2342        | 6.52  | 1500 | 1.3036          | 0.9011 |
| 1.2342        | 6.95  | 1600 | 1.3704          | 0.9072 |
| 1.2342        | 7.39  | 1700 | 1.2981          | 0.8891 |
| 1.2342        | 7.82  | 1800 | 1.2786          | 0.8733 |
| 1.2342        | 8.26  | 1900 | 1.2897          | 0.8867 |
| 0.9831        | 8.69  | 2000 | 1.4436          | 0.8780 |
| 0.9831        | 9.13  | 2100 | 1.3680          | 0.8873 |
| 0.9831        | 9.56  | 2200 | 1.3471          | 0.8692 |
| 0.9831        | 10.0  | 2300 | 1.3725          | 0.8729 |
| 0.9831        | 10.43 | 2400 | 1.4439          | 0.8771 |
| 0.8071        | 10.87 | 2500 | 1.5114          | 0.8928 |
| 0.8071        | 11.3  | 2600 | 1.6156          | 0.8958 |
| 0.8071        | 11.74 | 2700 | 1.4381          | 0.8749 |
| 0.8071        | 12.17 | 2800 | 1.5088          | 0.8717 |
| 0.8071        | 12.61 | 2900 | 1.5486          | 0.8813 |
| 0.6321        | 13.04 | 3000 | 1.4536          | 0.8884 |
| 0.6321        | 13.48 | 3100 | 1.4679          | 0.8947 |
| 0.6321        | 13.91 | 3200 | 1.5628          | 0.9117 |
| 0.6321        | 14.35 | 3300 | 1.5831          | 0.8716 |
| 0.6321        | 14.78 | 3400 | 1.6733          | 0.8702 |
| 0.4998        | 15.22 | 3500 | 1.8225          | 0.8665 |
| 0.4998        | 15.65 | 3600 | 1.8558          | 0.8732 |
| 0.4998        | 16.09 | 3700 | 1.7513          | 0.8766 |
| 0.4998        | 16.52 | 3800 | 1.8562          | 0.8753 |
| 0.4998        | 16.95 | 3900 | 1.9018          | 0.8704 |
| 0.4421        | 17.39 | 4000 | 1.9341          | 0.8789 |
| 0.4421        | 17.82 | 4100 | 1.9582          | 0.8781 |
| 0.4421        | 18.26 | 4200 | 1.8863          | 0.8821 |
| 0.4421        | 18.69 | 4300 | 1.9366          | 0.8847 |
| 0.4421        | 19.13 | 4400 | 2.1902          | 0.8721 |
| 0.3712        | 19.56 | 4500 | 2.1641          | 0.8670 |
| 0.3712        | 20.0  | 4600 | 2.1639          | 0.8776 |
| 0.3712        | 20.43 | 4700 | 2.2695          | 0.9030 |
| 0.3712        | 20.87 | 4800 | 2.1909          | 0.8937 |
| 0.3712        | 21.3  | 4900 | 2.1606          | 0.8959 |
| 0.3067        | 21.74 | 5000 | 2.1756          | 0.8943 |
| 0.3067        | 22.17 | 5100 | 2.4092          | 0.8773 |
| 0.3067        | 22.61 | 5200 | 2.4991          | 0.8721 |
| 0.3067        | 23.04 | 5300 | 2.3340          | 0.8910 |
| 0.3067        | 23.48 | 5400 | 2.3567          | 0.8946 |
| 0.2764        | 23.91 | 5500 | 2.3215          | 0.8897 |
| 0.2764        | 24.35 | 5600 | 2.4824          | 0.9002 |
| 0.2764        | 24.78 | 5700 | 2.4585          | 0.8963 |
| 0.2764        | 25.22 | 5800 | 2.5804          | 0.8879 |
| 0.2764        | 25.65 | 5900 | 2.5814          | 0.8903 |
| 0.2593        | 26.09 | 6000 | 2.5374          | 0.8868 |
| 0.2593        | 26.52 | 6100 | 2.5346          | 0.8922 |
| 0.2593        | 26.95 | 6200 | 2.5465          | 0.8873 |
| 0.2593        | 27.39 | 6300 | 2.6002          | 0.8919 |
| 0.2593        | 27.82 | 6400 | 2.6102          | 0.8928 |
| 0.227         | 28.26 | 6500 | 2.6925          | 0.8914 |
| 0.227         | 28.69 | 6600 | 2.6981          | 0.8913 |
| 0.227         | 29.13 | 6700 | 2.6872          | 0.8891 |
| 0.227         | 29.56 | 6800 | 2.7015          | 0.8897 |
| 0.227         | 30.0  | 6900 | 2.7114          | 0.8908 |
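
The reported final checkpoint is not the best entry in this log. Validation loss is lowest at step 1800 (1.2786) and WER is lowest at step 3500 (0.8665), while training loss keeps falling to 0.227, a pattern consistent with overfitting. The sketch below shows one way to pick a checkpoint from such a log. It uses only an excerpt of the rows above, the `EvalRow` type is hypothetical, and whether the intermediate checkpoints were saved is not stated in this card.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    wer: float


# Excerpt of the training results (values copied from the log above).
ROWS = [
    EvalRow(1400, 1.2806, 0.8944),
    EvalRow(1800, 1.2786, 0.8733),
    EvalRow(3500, 1.8225, 0.8665),
    EvalRow(4500, 2.1641, 0.8670),
    EvalRow(6900, 2.7114, 0.8908),  # final checkpoint
]

# Select the row minimizing each metric independently.
best_by_loss = min(ROWS, key=lambda row: row.val_loss)
best_by_wer = min(ROWS, key=lambda row: row.wer)

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss)
    print("lowest WER:", best_by_wer)
```

The two criteria disagree here, so which checkpoint is preferable depends on whether loss or WER matters more for the intended use.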

Framework versions

  • Transformers 4.18.0.dev0
  • PyTorch 1.10.2+cu113
  • Datasets 1.18.4
  • Tokenizers 0.11.6