model_v1_complete_training_wt_init_48_tiny

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 3.6497
  • Accuracy: 0.3896
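
Assuming the reported loss is the standard per-token cross-entropy in nats (the Transformers default for language modeling), the corresponding evaluation perplexity would be exp(3.6497) ≈ 38.5:

```python
import math

eval_loss = 3.6497  # evaluation loss reported above
# Assumption: per-token cross-entropy in nats, so perplexity = exp(loss).
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.1f}")  # ≈ 38.5
```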

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 10
  • distributed_type: multi-GPU
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 10000
  • num_epochs: 50
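
For orientation, here is a minimal sketch of how these settings would map onto `TrainingArguments` in Transformers 4.30. The `output_dir` is a placeholder, and the mapping assumes `train_batch_size` is per device; neither detail is confirmed by the card:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="model_v1_complete_training_wt_init_48_tiny",  # hypothetical path
    learning_rate=1e-5,
    per_device_train_batch_size=64,   # assumes the card's batch size is per device
    per_device_eval_batch_size=64,
    seed=10,
    adam_beta1=0.9,                   # the card says "Adam"; Trainer uses AdamW
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=10_000,
    num_train_epochs=50,
)
```

The `distributed_type: multi-GPU` entry is handled by the launch command (e.g. `torchrun --nproc_per_node=<n> train.py`) rather than by these arguments.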

Training results

| Training Loss | Epoch | Step    | Validation Loss | Accuracy |
|:-------------:|:-----:|:-------:|:---------------:|:--------:|
| 6.0224        | 0.33  | 30000   | 5.9447          | 0.1517   |
| 5.1853        | 0.66  | 60000   | 4.9635          | 0.2615   |
| 4.9483        | 0.98  | 90000   | 4.7016          | 0.2830   |
| 4.7679        | 1.31  | 120000  | 4.5154          | 0.2992   |
| 4.6448        | 1.64  | 150000  | 4.3884          | 0.3100   |
| 4.5688        | 1.97  | 180000  | 4.3095          | 0.3175   |
| 4.5102        | 2.29  | 210000  | 4.2511          | 0.3236   |
| 4.4662        | 2.62  | 240000  | 4.2038          | 0.3294   |
| 4.4269        | 2.95  | 270000  | 4.1677          | 0.3336   |
| 4.3982        | 3.28  | 300000  | 4.1367          | 0.3370   |
| 4.3714        | 3.60  | 330000  | 4.1103          | 0.3399   |
| 4.3493        | 3.93  | 360000  | 4.0869          | 0.3423   |
| 4.3303        | 4.26  | 390000  | 4.0680          | 0.3439   |
| 4.3131        | 4.59  | 420000  | 4.0467          | 0.3461   |
| 4.2875        | 4.92  | 450000  | 4.0292          | 0.3477   |
| 4.2629        | 5.24  | 480000  | 4.0109          | 0.3497   |
| 4.2413        | 5.57  | 510000  | 3.9931          | 0.3515   |
| 4.2282        | 5.90  | 540000  | 3.9759          | 0.3536   |
| 4.2003        | 6.23  | 570000  | 3.9608          | 0.3551   |
| 4.1867        | 6.55  | 600000  | 3.9445          | 0.3571   |
| 4.1607        | 6.88  | 630000  | 3.9273          | 0.3590   |
| 4.1511        | 7.21  | 660000  | 3.9130          | 0.3606   |
| 4.1335        | 7.54  | 690000  | 3.8971          | 0.3622   |
| 4.1158        | 7.87  | 720000  | 3.8798          | 0.3642   |
| 4.0970        | 8.19  | 750000  | 3.8635          | 0.3663   |
| 4.0831        | 8.52  | 780000  | 3.8494          | 0.3679   |
| 4.0756        | 8.85  | 810000  | 3.8334          | 0.3696   |
| 4.0533        | 9.18  | 840000  | 3.8201          | 0.3712   |
| 4.0517        | 9.50  | 870000  | 3.8080          | 0.3724   |
| 4.0325        | 9.83  | 900000  | 3.7975          | 0.3734   |
| 4.0142        | 10.16 | 930000  | 3.7872          | 0.3748   |
| 4.0124        | 10.49 | 960000  | 3.7788          | 0.3759   |
| 4.0076        | 10.81 | 990000  | 3.7679          | 0.3767   |
| 3.9919        | 11.14 | 1020000 | 3.7609          | 0.3775   |
| 3.9888        | 11.47 | 1050000 | 3.7550          | 0.3783   |
| 3.9796        | 11.80 | 1080000 | 3.7481          | 0.3789   |
| 3.9742        | 12.13 | 1110000 | 3.7414          | 0.3796   |
| 3.9667        | 12.45 | 1140000 | 3.7370          | 0.3802   |
| 3.9652        | 12.78 | 1170000 | 3.7289          | 0.3810   |
| 3.9548        | 13.11 | 1200000 | 3.7278          | 0.3812   |
| 3.9556        | 13.44 | 1230000 | 3.7213          | 0.3817   |
| 3.9444        | 13.76 | 1260000 | 3.7152          | 0.3825   |
| 3.9428        | 14.09 | 1290000 | 3.7120          | 0.3827   |
| 3.9424        | 14.42 | 1320000 | 3.7072          | 0.3834   |
| 3.9389        | 14.75 | 1350000 | 3.7047          | 0.3836   |
| 3.9360        | 15.07 | 1380000 | 3.6998          | 0.3844   |
| 3.9246        | 15.40 | 1410000 | 3.6968          | 0.3847   |
| 3.9281        | 15.73 | 1440000 | 3.6925          | 0.3851   |
| 3.9177        | 16.06 | 1470000 | 3.6916          | 0.3849   |
| 3.9216        | 16.39 | 1500000 | 3.6870          | 0.3855   |
| 3.9141        | 16.71 | 1530000 | 3.6822          | 0.3863   |
| 3.9154        | 17.04 | 1560000 | 3.6804          | 0.3864   |
| 3.9145        | 17.37 | 1590000 | 3.6795          | 0.3863   |
| 3.9103        | 17.70 | 1620000 | 3.6734          | 0.3869   |
| 3.9079        | 18.02 | 1650000 | 3.6724          | 0.3873   |
| 3.9010        | 18.35 | 1680000 | 3.6707          | 0.3872   |
| 3.9015        | 18.68 | 1710000 | 3.6695          | 0.3873   |
| 3.8987        | 19.01 | 1740000 | 3.6672          | 0.3877   |
| 3.8929        | 19.33 | 1770000 | 3.6647          | 0.3878   |
| 3.8920        | 19.66 | 1800000 | 3.6609          | 0.3884   |
| 3.8906        | 19.99 | 1830000 | 3.6595          | 0.3886   |
| 3.8923        | 20.32 | 1860000 | 3.6594          | 0.3885   |
| 3.8901        | 20.65 | 1890000 | 3.6541          | 0.3893   |
| 3.8853        | 20.97 | 1920000 | 3.6539          | 0.3891   |
| 3.8808        | 21.30 | 1950000 | 3.6527          | 0.3894   |
| 3.8835        | 21.63 | 1980000 | 3.6497          | 0.3896   |

Framework versions

  • Transformers 4.30.2
  • PyTorch 1.14.0a0+410ce96
  • Datasets 2.13.0
  • Tokenizers 0.13.3
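
The card gives no usage example; the following loading sketch is hypothetical, assuming a standard Transformers checkpoint layout. The repo id is a placeholder, and `AutoModelForCausalLM` is an assumption (the card does not state whether this is a causal or masked language model):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id -- substitute the actual Hub path of this model.
repo_id = "your-org/model_v1_complete_training_wt_init_48_tiny"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)  # architecture assumed
```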