2_4e-3_1_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set (a usage sketch follows the results):

  • Loss: 0.5293
  • Accuracy: 0.7272
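
Below is a minimal inference sketch. The repository id Onutoa/2_4e-3_1_0.1 is taken from this card; the text-classification head and the example sentence pair are assumptions, since the card does not state which SuperGLUE task the checkpoint was trained on.

```python
# Minimal inference sketch (Transformers 4.30). The sequence-classification
# head and the sentence-pair input format are assumptions; the card does
# not say which SuperGLUE task this checkpoint targets.
from transformers import pipeline

classifier = pipeline("text-classification", model="Onutoa/2_4e-3_1_0.1")

# Most SuperGLUE tasks are sentence-pair problems, so a paired input is
# passed here; adjust to the actual task's input format.
print(classifier({"text": "The sky is blue.", "text_pair": "Is the sky blue?"}))
```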

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list for how they map onto TrainingArguments):

  • learning_rate: 0.004
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
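
For reference, here is a hedged sketch of how the values above map onto Transformers TrainingArguments; the output directory is a placeholder, and mapping train_batch_size to per_device_train_batch_size assumes single-device training.

```python
# Sketch only: the hyperparameters above expressed as TrainingArguments
# (Transformers 4.30). output_dir is a placeholder, not from this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="2_4e-3_1_0.1",       # placeholder
    learning_rate=4e-3,
    per_device_train_batch_size=16,  # assumes single-device training
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the
    # Transformers optimizer defaults, so no extra arguments are needed.
)
```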

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 0.8756        | 1.0   | 590   | 0.9984          | 0.6211   |
| 0.8309        | 2.0   | 1180  | 0.7494          | 0.6217   |
| 0.8162        | 3.0   | 1770  | 0.8910          | 0.3826   |
| 0.8025        | 4.0   | 2360  | 0.6504          | 0.6028   |
| 0.8059        | 5.0   | 2950  | 0.6535          | 0.5945   |
| 0.768         | 6.0   | 3540  | 0.6293          | 0.6291   |
| 0.7423        | 7.0   | 4130  | 0.9356          | 0.4339   |
| 0.7272        | 8.0   | 4720  | 0.7985          | 0.6220   |
| 0.7076        | 9.0   | 5310  | 0.6240          | 0.6541   |
| 0.6803        | 10.0  | 5900  | 0.6284          | 0.6639   |
| 0.6637        | 11.0  | 6490  | 0.6013          | 0.6691   |
| 0.6217        | 12.0  | 7080  | 0.5783          | 0.6725   |
| 0.6169        | 13.0  | 7670  | 0.5657          | 0.6841   |
| 0.5962        | 14.0  | 8260  | 0.6273          | 0.6618   |
| 0.5937        | 15.0  | 8850  | 0.5982          | 0.6725   |
| 0.5811        | 16.0  | 9440  | 0.6778          | 0.5997   |
| 0.5534        | 17.0  | 10030 | 0.5478          | 0.7028   |
| 0.5641        | 18.0  | 10620 | 0.5615          | 0.7034   |
| 0.5588        | 19.0  | 11210 | 0.5467          | 0.7076   |
| 0.5611        | 20.0  | 11800 | 0.5505          | 0.7058   |
| 0.5423        | 21.0  | 12390 | 0.5617          | 0.7086   |
| 0.5372        | 22.0  | 12980 | 0.5483          | 0.7003   |
| 0.5387        | 23.0  | 13570 | 0.5560          | 0.7113   |
| 0.5274        | 24.0  | 14160 | 0.5278          | 0.7131   |
| 0.5242        | 25.0  | 14750 | 0.5377          | 0.7150   |
| 0.5256        | 26.0  | 15340 | 0.5796          | 0.6856   |
| 0.5203        | 27.0  | 15930 | 0.5456          | 0.6976   |
| 0.5087        | 28.0  | 16520 | 0.5365          | 0.7199   |
| 0.5127        | 29.0  | 17110 | 0.5419          | 0.7049   |
| 0.5005        | 30.0  | 17700 | 0.5417          | 0.7257   |
| 0.5008        | 31.0  | 18290 | 0.5257          | 0.7116   |
| 0.4959        | 32.0  | 18880 | 0.5463          | 0.7232   |
| 0.4931        | 33.0  | 19470 | 0.5251          | 0.7260   |
| 0.4849        | 34.0  | 20060 | 0.5282          | 0.7217   |
| 0.4733        | 35.0  | 20650 | 0.5296          | 0.7199   |
| 0.4842        | 36.0  | 21240 | 0.5230          | 0.7229   |
| 0.4811        | 37.0  | 21830 | 0.5264          | 0.7232   |
| 0.4683        | 38.0  | 22420 | 0.5518          | 0.7058   |
| 0.4692        | 39.0  | 23010 | 0.5256          | 0.7300   |
| 0.4621        | 40.0  | 23600 | 0.5292          | 0.7303   |
| 0.4624        | 41.0  | 24190 | 0.5467          | 0.7110   |
| 0.4618        | 42.0  | 24780 | 0.5189          | 0.7324   |
| 0.465         | 43.0  | 25370 | 0.5285          | 0.7330   |
| 0.453         | 44.0  | 25960 | 0.5577          | 0.7113   |
| 0.4533        | 45.0  | 26550 | 0.5170          | 0.7343   |
| 0.4524        | 46.0  | 27140 | 0.5219          | 0.7223   |
| 0.4454        | 47.0  | 27730 | 0.5367          | 0.7257   |
| 0.4401        | 48.0  | 28320 | 0.5251          | 0.7339   |
| 0.4547        | 49.0  | 28910 | 0.5300          | 0.7254   |
| 0.4374        | 50.0  | 29500 | 0.5318          | 0.7278   |
| 0.444         | 51.0  | 30090 | 0.5317          | 0.7239   |
| 0.4363        | 52.0  | 30680 | 0.5309          | 0.7306   |
| 0.4381        | 53.0  | 31270 | 0.5206          | 0.7312   |
| 0.4314        | 54.0  | 31860 | 0.5283          | 0.7269   |
| 0.4334        | 55.0  | 32450 | 0.5254          | 0.7278   |
| 0.43          | 56.0  | 33040 | 0.5317          | 0.7278   |
| 0.4194        | 57.0  | 33630 | 0.5261          | 0.7272   |
| 0.4341        | 58.0  | 34220 | 0.5266          | 0.7300   |
| 0.4243        | 59.0  | 34810 | 0.5269          | 0.7275   |
| 0.4191        | 60.0  | 35400 | 0.5293          | 0.7272   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
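
As a quick sanity check, the pins above can be verified in Python; this is a sketch, and the torch version string includes the CUDA build suffix listed above.

```python
# Verify the local environment matches the framework versions listed above.
import datasets
import tokenizers
import torch
import transformers

assert transformers.__version__ == "4.30.0"
assert torch.__version__ == "2.0.1+cu117"
assert datasets.__version__ == "2.14.4"
assert tokenizers.__version__ == "0.13.3"
```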