2_6e-3_1_0.1

This model is a fine-tuned version of [bert-large-uncased](https://huggingface.co/bert-large-uncased) on the [super_glue](https://huggingface.co/datasets/super_glue) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5859
  • Accuracy: 0.7254
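
The snippet below is a minimal inference sketch for loading this checkpoint. The repo id `Onutoa/2_6e-3_1_0.1` comes from this card; the sequence-classification head and the passage/question input pair are assumptions, since the card does not state which SuperGLUE task was used.

```python
# Minimal inference sketch. The classification head and the
# passage/question input format are assumptions; the card does not
# say which SuperGLUE task this checkpoint was fine-tuned on.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/2_6e-3_1_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer(
    "The castle was built in the 12th century.",  # placeholder passage
    "Was the castle built before 1200?",          # placeholder question
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label id
```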

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 0.006
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
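
A hedged sketch of how these hyperparameters map onto `transformers.TrainingArguments` (Transformers 4.30.0). The `output_dir` is a placeholder, and the dataset loading, tokenization, and `Trainer` call are omitted; the per-epoch evaluation strategy is an assumption inferred from the results table below.

```python
# Sketch only: maps the hyperparameters above onto TrainingArguments.
# output_dir is a placeholder; the data pipeline and Trainer invocation
# are omitted. Trainer's default AdamW matches the listed betas/epsilon.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="2_6e-3_1_0.1",    # placeholder
    learning_rate=6e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",  # assumption: eval once per epoch, as in the table below
)
```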

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 1.0992 | 1.0 | 590 | 1.0242 | 0.3783 |
| 0.8881 | 2.0 | 1180 | 0.9820 | 0.3817 |
| 0.8638 | 3.0 | 1770 | 0.9819 | 0.3783 |
| 0.8712 | 4.0 | 2360 | 0.8440 | 0.3789 |
| 0.8299 | 5.0 | 2950 | 0.7281 | 0.6217 |
| 0.8746 | 6.0 | 3540 | 0.6816 | 0.6049 |
| 0.9153 | 7.0 | 4130 | 0.6879 | 0.5281 |
| 0.8459 | 8.0 | 4720 | 0.6251 | 0.6333 |
| 0.7986 | 9.0 | 5310 | 1.0586 | 0.6217 |
| 0.8116 | 10.0 | 5900 | 0.6938 | 0.6434 |
| 0.789 | 11.0 | 6490 | 0.7268 | 0.6511 |
| 0.7792 | 12.0 | 7080 | 0.6182 | 0.6593 |
| 0.7814 | 13.0 | 7670 | 1.2212 | 0.4502 |
| 0.7899 | 14.0 | 8260 | 0.6923 | 0.6621 |
| 0.7264 | 15.0 | 8850 | 0.6417 | 0.6706 |
| 0.7226 | 16.0 | 9440 | 0.7098 | 0.5881 |
| 0.7009 | 17.0 | 10030 | 0.5964 | 0.6673 |
| 0.7149 | 18.0 | 10620 | 0.7206 | 0.6141 |
| 0.6615 | 19.0 | 11210 | 0.6004 | 0.6850 |
| 0.6847 | 20.0 | 11800 | 0.9306 | 0.6575 |
| 0.6563 | 21.0 | 12390 | 0.7185 | 0.6823 |
| 0.643 | 22.0 | 12980 | 0.6512 | 0.6502 |
| 0.6407 | 23.0 | 13570 | 0.6875 | 0.6832 |
| 0.6207 | 24.0 | 14160 | 0.6471 | 0.6593 |
| 0.5944 | 25.0 | 14750 | 0.6547 | 0.7080 |
| 0.6082 | 26.0 | 15340 | 0.6463 | 0.6532 |
| 0.6005 | 27.0 | 15930 | 0.5753 | 0.7018 |
| 0.5711 | 28.0 | 16520 | 0.5725 | 0.7119 |
| 0.5729 | 29.0 | 17110 | 0.5858 | 0.7223 |
| 0.556 | 30.0 | 17700 | 0.5890 | 0.7245 |
| 0.5549 | 31.0 | 18290 | 0.5599 | 0.7138 |
| 0.5355 | 32.0 | 18880 | 0.7710 | 0.6945 |
| 0.5358 | 33.0 | 19470 | 0.5839 | 0.7144 |
| 0.503 | 34.0 | 20060 | 0.6080 | 0.7324 |
| 0.5149 | 35.0 | 20650 | 0.6178 | 0.7107 |
| 0.5099 | 36.0 | 21240 | 0.5268 | 0.7275 |
| 0.5114 | 37.0 | 21830 | 0.5852 | 0.7269 |
| 0.4823 | 38.0 | 22420 | 0.5647 | 0.7229 |
| 0.4736 | 39.0 | 23010 | 0.6011 | 0.7339 |
| 0.4757 | 40.0 | 23600 | 0.7783 | 0.7208 |
| 0.4761 | 41.0 | 24190 | 0.5780 | 0.7294 |
| 0.464 | 42.0 | 24780 | 0.6204 | 0.7312 |
| 0.4545 | 43.0 | 25370 | 0.5590 | 0.7214 |
| 0.45 | 44.0 | 25960 | 0.6851 | 0.7156 |
| 0.4424 | 45.0 | 26550 | 0.6311 | 0.7095 |
| 0.4276 | 46.0 | 27140 | 0.5536 | 0.7211 |
| 0.4401 | 47.0 | 27730 | 0.5773 | 0.7269 |
| 0.4319 | 48.0 | 28320 | 0.5876 | 0.7269 |
| 0.4211 | 49.0 | 28910 | 0.5829 | 0.7312 |
| 0.4126 | 50.0 | 29500 | 0.6142 | 0.7232 |
| 0.4183 | 51.0 | 30090 | 0.5985 | 0.7251 |
| 0.4045 | 52.0 | 30680 | 0.6185 | 0.7211 |
| 0.4058 | 53.0 | 31270 | 0.6073 | 0.7336 |
| 0.402 | 54.0 | 31860 | 0.6035 | 0.7232 |
| 0.4031 | 55.0 | 32450 | 0.6014 | 0.7284 |
| 0.3964 | 56.0 | 33040 | 0.5933 | 0.7300 |
| 0.3932 | 57.0 | 33630 | 0.5683 | 0.7263 |
| 0.3954 | 58.0 | 34220 | 0.5942 | 0.7254 |
| 0.3898 | 59.0 | 34810 | 0.5832 | 0.7294 |
| 0.3842 | 60.0 | 35400 | 0.5859 | 0.7254 |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
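
For reproducibility, a quick way to confirm your environment matches the versions listed above (a convenience sketch, not part of the original card):

```python
# Convenience sketch: print installed versions to compare against the list above.
import datasets
import tokenizers
import torch
import transformers

print("Transformers:", transformers.__version__)  # expect 4.30.0
print("Pytorch:", torch.__version__)              # expect 2.0.1+cu117
print("Datasets:", datasets.__version__)          # expect 2.14.4
print("Tokenizers:", tokenizers.__version__)      # expect 0.13.3
```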