
libri-alpha-0.75-Temp-1-attention-3-layers-distil-with-6-layers

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 176.0337
  • Wer: 0.4211
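
The reported Wer is the word error rate on the evaluation set. As a point of reference, the sketch below shows how WER can be computed for any set of decoded predictions and reference transcripts using the jiwer package; the example strings are illustrative and are not taken from this model's evaluation data.

```python
# Illustrative sketch: computing word error rate (WER) with jiwer.
# The transcripts below are made-up examples, not outputs of this model.
import jiwer

references = ["the quick brown fox jumps over the lazy dog"]
predictions = ["the quick brown fox jumped over a lazy dog"]

# Word-level edit distance (substitutions + insertions + deletions)
# divided by the number of reference words.
wer = jiwer.wer(references, predictions)
print(f"WER: {wer:.4f}")
```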

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.2
  • num_epochs: 30
  • mixed_precision_training: Native AMP
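
As a rough sketch, the hyperparameters above map onto transformers.TrainingArguments as shown below; the output directory is a placeholder and the model/dataset wiring is omitted, since the base model and training data are not specified on this card.

```python
# Hedged sketch: expressing the hyperparameters listed above with
# transformers.TrainingArguments. This is not the exact training script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./libri-alpha-0.75-distil",  # placeholder path
    learning_rate=2e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,   # effective train batch size of 16
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    warmup_ratio=0.2,
    num_train_epochs=30,
    fp16=True,                       # native AMP mixed-precision training
)
```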

Training results

Training Loss | Epoch | Step | Validation Loss | Wer
818.3828 0.19 100 299.9131 0.6582
836.7686 0.37 200 297.5387 0.6554
787.5465 0.56 300 294.7996 0.6512
804.7234 0.75 400 292.2293 0.6463
812.2401 0.93 500 289.8000 0.6398
796.3071 1.12 600 287.0509 0.6351
755.1163 1.31 700 283.9792 0.6282
789.9813 1.49 800 280.8894 0.6241
746.3034 1.68 900 278.0494 0.6193
762.4461 1.87 1000 276.0318 0.6131
750.3801 2.06 1100 273.0303 0.6117
759.8897 2.24 1200 270.4828 0.6072
766.8752 2.43 1300 267.3370 0.6017
747.1767 2.62 1400 264.8568 0.5978
709.7859 2.8 1500 262.1042 0.5940
682.8446 2.99 1600 258.8314 0.5878
678.7934 3.18 1700 255.1844 0.5842
688.9065 3.36 1800 253.7106 0.5789
676.8735 3.55 1900 252.6307 0.5784
671.5541 3.74 2000 249.2319 0.5723
661.5926 3.92 2100 247.3242 0.5669
669.4624 4.11 2200 243.3662 0.5623
643.5416 4.3 2300 241.4045 0.5609
663.1716 4.49 2400 239.8308 0.5571
645.2939 4.67 2500 236.4551 0.5498
643.487 4.86 2600 234.4706 0.5463
661.5359 5.05 2700 233.3655 0.5420
620.0827 5.23 2800 230.9604 0.5410
623.4608 5.42 2900 229.2104 0.5346
614.5616 5.61 3000 227.1311 0.5315
610.5325 5.79 3100 224.7402 0.5308
609.3737 5.98 3200 224.0515 0.5268
577.5857 6.17 3300 222.1016 0.5211
600.8658 6.35 3400 220.7187 0.5190
579.8273 6.54 3500 219.0504 0.5167
577.1318 6.73 3600 216.5892 0.5138
598.4185 6.92 3700 214.8921 0.5092
574.2131 7.1 3800 214.3345 0.5069
558.845 7.29 3900 212.7756 0.5027
562.5259 7.48 4000 210.9550 0.5001
542.4756 7.66 4100 210.1631 0.4984
531.1471 7.85 4200 208.7503 0.4963
578.4951 8.04 4300 206.8714 0.4942
549.3659 8.22 4400 206.2005 0.4905
549.3715 8.41 4500 204.9378 0.4897
526.3116 8.6 4600 204.2942 0.4877
556.9892 8.78 4700 203.8197 0.4838
536.3161 8.97 4800 201.7986 0.4811
566.8901 9.16 4900 201.8660 0.4819
527.6058 9.35 5000 201.0037 0.4783
537.7548 9.53 5100 199.5784 0.4765
521.8008 9.72 5200 199.2429 0.4740
527.2388 9.91 5300 199.3108 0.4754
510.5078 10.09 5400 198.8395 0.4722
543.6402 10.28 5500 198.2866 0.4706
534.623 10.47 5600 197.0690 0.4686
504.127 10.65 5700 198.0624 0.4706
522.8051 10.84 5800 196.0252 0.4649
543.651 11.03 5900 193.4206 0.4655
515.7955 11.21 6000 193.9807 0.4640
504.2811 11.4 6100 193.1624 0.4613
506.7597 11.59 6200 193.1175 0.4607
496.2182 11.77 6300 191.9746 0.4607
524.1124 11.96 6400 192.0139 0.4571
502.1757 12.15 6500 191.7406 0.4561
517.2375 12.34 6600 190.3192 0.4552
507.2228 12.52 6700 189.7269 0.4559
495.0334 12.71 6800 189.9307 0.4541
487.9488 12.9 6900 187.9196 0.4506
481.8289 13.08 7000 188.2804 0.4495
497.7955 13.27 7100 188.5640 0.4495
495.9639 13.46 7200 188.2054 0.4482
477.9561 13.64 7300 187.2503 0.4490
489.6057 13.83 7400 186.3044 0.4460
486.4094 14.02 7500 185.7658 0.4444
490.18 14.21 7600 185.7803 0.4463
481.2339 14.39 7700 185.2366 0.4456
497.7487 14.58 7800 184.6401 0.4416
492.7409 14.77 7900 184.4930 0.4424
480.0516 14.95 8000 184.3564 0.4426
503.3515 15.14 8100 184.0443 0.4397
483.1878 15.33 8200 183.3429 0.4382
468.6728 15.51 8300 183.1123 0.4366
473.9079 15.7 8400 182.9552 0.4385
471.2554 15.89 8500 181.8883 0.4370
482.9162 16.07 8600 181.9493 0.4383
473.6775 16.26 8700 182.3769 0.4374
490.9736 16.45 8800 181.8944 0.4357
473.9841 16.63 8900 181.1866 0.4343
458.6111 16.82 9000 180.9161 0.4327
473.312 17.01 9100 180.8257 0.4328
474.3633 17.2 9200 180.2251 0.4310
466.0219 17.38 9300 180.4953 0.4328
456.4883 17.57 9400 180.3950 0.4321
473.9428 17.76 9500 180.0498 0.4297
467.2524 17.94 9600 179.7389 0.4317
453.7509 18.13 9700 179.1797 0.4283
480.6095 18.32 9800 178.6612 0.4292
476.2359 18.5 9900 178.4738 0.4277
468.8798 18.69 10000 178.2498 0.4277
469.7198 18.88 10100 178.3403 0.4268
456.8305 19.07 10200 177.9058 0.4258
459.8746 19.25 10300 177.8065 0.4256
461.4241 19.44 10400 177.2613 0.4244
463.9236 19.63 10500 177.2336 0.4267
432.5434 19.81 10600 177.2254 0.4262
440.5167 20.0 10700 177.1022 0.4242
455.6524 20.19 10800 177.0475 0.4243
463.6909 20.37 10900 176.5024 0.4229
460.1803 20.56 11000 176.3350 0.4198
459.226 20.75 11100 176.1349 0.4204
453.7939 20.93 11200 175.5934 0.4213
452.2502 21.12 11300 175.7082 0.4221
469.455 21.31 11400 175.9399 0.4222
460.2929 21.49 11500 176.0337 0.4211

Framework versions

  • Transformers 4.23.1
  • Pytorch 1.12.1
  • Datasets 2.6.1
  • Tokenizers 0.13.1
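
A small sketch for checking that a local environment matches the versions listed above before loading or re-training the model (it assumes the four packages are installed and importable):

```python
# Hedged sketch: compare installed library versions against the versions
# listed in this card. Purely illustrative; not part of the original card.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.23.1",
    "torch": "1.12.1",
    "datasets": "2.6.1",
    "tokenizers": "0.13.1",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, version in expected.items():
    status = "ok" if installed[name] == version else "mismatch"
    print(f"{name}: expected {version}, installed {installed[name]} [{status}]")
```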