Glue_distilbert

This model is a fine-tuned version of distilbert-base-uncased on the GLUE MRPC dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1042
  • Accuracy: 0.8505
  • F1: 0.8961
  • Combined Score: 0.8733
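
The combined score is the mean of accuracy and F1: (0.8505 + 0.8961) / 2 = 0.8733. A minimal inference sketch follows, using the checkpoint id gokuls/Glue_distilbert from this card; the label convention (0 = not a paraphrase, 1 = paraphrase, as in GLUE MRPC) is an assumption unless the checkpoint config names its labels.

```python
# Minimal sketch: paraphrase detection with this checkpoint.
# Label order assumes the GLUE MRPC convention (0 = not equivalent, 1 = equivalent).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "gokuls/Glue_distilbert"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

sent1 = "The company said the cuts will be completed by the end of the year."
sent2 = "The cuts should be finished by year's end, the company said."

inputs = tokenizer(sent1, sent2, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

pred = logits.argmax(dim=-1).item()
print("paraphrase" if pred == 1 else "not paraphrase")
```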

Model description

This checkpoint fine-tunes distilbert-base-uncased, a distilled, smaller version of BERT, for binary sentence-pair classification on the MRPC paraphrase-detection task from the GLUE benchmark.

Intended uses & limitations

The model is intended for detecting whether two English sentences are paraphrases, as in MRPC. Note that validation loss rises from 0.38 at epoch 1 to roughly 1.5 by the end of the run while training loss approaches zero (see the training results below), so the 50-epoch run overfits; the results reported above correspond to the epoch-21 checkpoint, which has the best combined score.

Training and evaluation data

The model was fine-tuned and evaluated on MRPC (Microsoft Research Paraphrase Corpus) from the GLUE benchmark: sentence pairs labeled for whether the two sentences are semantically equivalent. The metrics above are computed on the MRPC validation split, which can be loaded as shown below.
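
A short sketch of loading the task with the datasets library (the split names and fields are the standard GLUE MRPC ones):

```python
# Load the MRPC task of GLUE with the datasets library.
from datasets import load_dataset

raw = load_dataset("glue", "mrpc")
print(raw)              # DatasetDict with train / validation / test splits
print(raw["train"][0])  # fields: sentence1, sentence2, label, idx
```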

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 33
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
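
The sketch below reconstructs a Trainer setup from the hyperparameters above. It is an approximation, not the original training script; the tokenization settings and the per-epoch evaluation strategy are assumptions (per-epoch evaluation matches the table of training results that follows).

```python
# Hedged reconstruction of the training setup from the listed hyperparameters.
# Adam betas/epsilon match the Trainer defaults, so they are not set explicitly.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          DataCollatorWithPadding, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)

raw = load_dataset("glue", "mrpc")
encoded = raw.map(
    lambda ex: tokenizer(ex["sentence1"], ex["sentence2"], truncation=True),
    batched=True)

args = TrainingArguments(
    output_dir="Glue_distilbert",
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=33,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    fp16=True,                    # Native AMP mixed precision
    evaluation_strategy="epoch",  # assumption: matches the per-epoch table below
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
    tokenizer=tokenizer,
    data_collator=DataCollatorWithPadding(tokenizer),
)
trainer.train()
```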

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     | Combined Score |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:--------------:|
| 0.5066        | 1.0   | 115  | 0.3833          | 0.8358   | 0.8851 | 0.8604         |
| 0.3227        | 2.0   | 230  | 0.4336          | 0.8309   | 0.8844 | 0.8577         |
| 0.1764        | 3.0   | 345  | 0.4943          | 0.8309   | 0.8757 | 0.8533         |
| 0.0792        | 4.0   | 460  | 0.7271          | 0.8431   | 0.8861 | 0.8646         |
| 0.058         | 5.0   | 575  | 0.8374          | 0.8456   | 0.8945 | 0.8700         |
| 0.0594        | 6.0   | 690  | 0.7570          | 0.8309   | 0.8816 | 0.8563         |
| 0.0395        | 7.0   | 805  | 0.8640          | 0.8431   | 0.8897 | 0.8664         |
| 0.03          | 8.0   | 920  | 0.9007          | 0.8260   | 0.8799 | 0.8529         |
| 0.0283        | 9.0   | 1035 | 0.9479          | 0.8211   | 0.8685 | 0.8448         |
| 0.0127        | 10.0  | 1150 | 1.0686          | 0.8431   | 0.8915 | 0.8673         |
| 0.0097        | 11.0  | 1265 | 1.0752          | 0.8431   | 0.8919 | 0.8675         |
| 0.0164        | 12.0  | 1380 | 1.0627          | 0.8284   | 0.8801 | 0.8543         |
| 0.007         | 13.0  | 1495 | 1.1466          | 0.8333   | 0.8815 | 0.8574         |
| 0.0132        | 14.0  | 1610 | 1.1442          | 0.8456   | 0.8938 | 0.8697         |
| 0.0125        | 15.0  | 1725 | 1.1716          | 0.8235   | 0.8771 | 0.8503         |
| 0.0174        | 16.0  | 1840 | 1.1187          | 0.8333   | 0.8790 | 0.8562         |
| 0.0171        | 17.0  | 1955 | 1.1053          | 0.8456   | 0.8938 | 0.8697         |
| 0.0026        | 18.0  | 2070 | 1.2011          | 0.8309   | 0.8787 | 0.8548         |
| 0.0056        | 19.0  | 2185 | 1.3085          | 0.8260   | 0.8748 | 0.8504         |
| 0.0067        | 20.0  | 2300 | 1.3042          | 0.8333   | 0.8803 | 0.8568         |
| 0.0129        | 21.0  | 2415 | 1.1042          | 0.8505   | 0.8961 | 0.8733         |
| 0.0149        | 22.0  | 2530 | 1.1575          | 0.8235   | 0.8820 | 0.8527         |
| 0.0045        | 23.0  | 2645 | 1.2359          | 0.8407   | 0.8900 | 0.8654         |
| 0.0029        | 24.0  | 2760 | 1.3823          | 0.8211   | 0.8744 | 0.8477         |
| 0.0074        | 25.0  | 2875 | 1.2394          | 0.8431   | 0.8904 | 0.8668         |
| 0.002         | 26.0  | 2990 | 1.4450          | 0.8333   | 0.8859 | 0.8596         |
| 0.0039        | 27.0  | 3105 | 1.5102          | 0.8284   | 0.8805 | 0.8545         |
| 0.0015        | 28.0  | 3220 | 1.4767          | 0.8431   | 0.8915 | 0.8673         |
| 0.0062        | 29.0  | 3335 | 1.5101          | 0.8407   | 0.8926 | 0.8666         |
| 0.0054        | 30.0  | 3450 | 1.3861          | 0.8382   | 0.8893 | 0.8637         |
| 0.0001        | 31.0  | 3565 | 1.4101          | 0.8456   | 0.8948 | 0.8702         |
| 0.0           | 32.0  | 3680 | 1.4203          | 0.8480   | 0.8963 | 0.8722         |
| 0.002         | 33.0  | 3795 | 1.4526          | 0.8431   | 0.8923 | 0.8677         |
| 0.0019        | 34.0  | 3910 | 1.6265          | 0.8260   | 0.8842 | 0.8551         |
| 0.0029        | 35.0  | 4025 | 1.4788          | 0.8456   | 0.8945 | 0.8700         |
| 0.0           | 36.0  | 4140 | 1.4668          | 0.8480   | 0.8956 | 0.8718         |
| 0.0007        | 37.0  | 4255 | 1.5248          | 0.8456   | 0.8945 | 0.8700         |
| 0.0           | 38.0  | 4370 | 1.5202          | 0.8480   | 0.8960 | 0.8720         |
| 0.0033        | 39.0  | 4485 | 1.5541          | 0.8358   | 0.8878 | 0.8618         |
| 0.0017        | 40.0  | 4600 | 1.5097          | 0.8407   | 0.8904 | 0.8655         |
| 0.0           | 41.0  | 4715 | 1.5301          | 0.8407   | 0.8904 | 0.8655         |
| 0.0           | 42.0  | 4830 | 1.4974          | 0.8407   | 0.8862 | 0.8634         |
| 0.0018        | 43.0  | 4945 | 1.5483          | 0.8382   | 0.8896 | 0.8639         |
| 0.0           | 44.0  | 5060 | 1.5071          | 0.8480   | 0.8931 | 0.8706         |
| 0.0           | 45.0  | 5175 | 1.5104          | 0.8480   | 0.8935 | 0.8708         |
| 0.0011        | 46.0  | 5290 | 1.5445          | 0.8382   | 0.8896 | 0.8639         |
| 0.0012        | 47.0  | 5405 | 1.5378          | 0.8431   | 0.8900 | 0.8666         |
| 0.0           | 48.0  | 5520 | 1.5577          | 0.8407   | 0.8881 | 0.8644         |
| 0.0009        | 49.0  | 5635 | 1.5431          | 0.8407   | 0.8885 | 0.8646         |
| 0.0002        | 50.0  | 5750 | 1.5383          | 0.8431   | 0.8904 | 0.8668         |
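
The accuracy, F1, and combined score above are the standard GLUE MRPC metrics; a sketch of recomputing them with the evaluate library is below (the predictions and references are placeholder values, and the combined score is the mean of accuracy and F1):

```python
# Sketch: computing the MRPC metrics reported above.
import evaluate

metric = evaluate.load("glue", "mrpc")

preds = [1, 0, 1, 1]  # placeholder model predictions
refs = [1, 0, 0, 1]   # placeholder gold labels

scores = metric.compute(predictions=preds, references=refs)
scores["combined_score"] = (scores["accuracy"] + scores["f1"]) / 2
print(scores)
```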

Framework versions

  • Transformers 4.25.1
  • Pytorch 1.13.0+cu116
  • Datasets 2.8.0
  • Tokenizers 0.13.2