2_1e-2_10_0.9

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8066
  • Accuracy: 0.7550
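
Since the card reports an accuracy metric, the checkpoint presumably carries a sequence-classification head. Below is a minimal usage sketch, assuming the hub ID Onutoa/2_1e-2_10_0.9 (taken from this card) and a generic sentence-pair input; the card does not document which SuperGLUE subset was used, so the real input format and label names may differ.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Onutoa/2_1e-2_10_0.9"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Placeholder sentence pair: the specific SuperGLUE task is not documented
# in this card, so treat this input format as an illustrative assumption.
inputs = tokenizer(
    "The city councilmen refused the demonstrators a permit.",
    "Did the councilmen grant a permit?",
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(dim=-1).item()])
```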

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.01
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
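
As a rough sketch, these settings map onto a transformers.TrainingArguments configuration along the following lines; the output directory and the per-epoch evaluation strategy are illustrative assumptions, while the remaining values mirror the list above.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="2_1e-2_10_0.9",   # hypothetical output path
    learning_rate=1e-2,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,               # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",  # assumption: the results table reports one eval per epoch
)
```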

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 4.8797        | 1.0   | 590   | 3.6447          | 0.6217   |
| 4.0161        | 2.0   | 1180  | 3.3996          | 0.6291   |
| 3.6981        | 3.0   | 1770  | 2.6674          | 0.6297   |
| 3.2078        | 4.0   | 2360  | 2.9843          | 0.5676   |
| 2.8064        | 5.0   | 2950  | 2.1131          | 0.6804   |
| 2.4341        | 6.0   | 3540  | 3.3843          | 0.6673   |
| 2.403         | 7.0   | 4130  | 1.8655          | 0.7043   |
| 2.3212        | 8.0   | 4720  | 1.8492          | 0.7055   |
| 2.2831        | 9.0   | 5310  | 1.5678          | 0.7024   |
| 2.1715        | 10.0  | 5900  | 1.6676          | 0.7193   |
| 2.0967        | 11.0  | 6490  | 1.5610          | 0.7174   |
| 1.9909        | 12.0  | 7080  | 1.3225          | 0.7122   |
| 1.9391        | 13.0  | 7670  | 1.3815          | 0.7180   |
| 1.852         | 14.0  | 8260  | 1.4632          | 0.7260   |
| 1.8568        | 15.0  | 8850  | 1.3623          | 0.7101   |
| 1.776         | 16.0  | 9440  | 1.3193          | 0.7015   |
| 1.6984        | 17.0  | 10030 | 1.3270          | 0.7208   |
| 1.6811        | 18.0  | 10620 | 1.3129          | 0.7055   |
| 1.6857        | 19.0  | 11210 | 1.3154          | 0.7382   |
| 1.6594        | 20.0  | 11800 | 1.2337          | 0.7352   |
| 1.5595        | 21.0  | 12390 | 1.2297          | 0.7404   |
| 1.6112        | 22.0  | 12980 | 1.1512          | 0.7450   |
| 1.5746        | 23.0  | 13570 | 1.1148          | 0.7208   |
| 1.5216        | 24.0  | 14160 | 1.1788          | 0.7373   |
| 1.5245        | 25.0  | 14750 | 1.0049          | 0.7361   |
| 1.4803        | 26.0  | 15340 | 1.5312          | 0.6890   |
| 1.5122        | 27.0  | 15930 | 1.0611          | 0.7187   |
| 1.4459        | 28.0  | 16520 | 1.5559          | 0.7431   |
| 1.4638        | 29.0  | 17110 | 1.3813          | 0.7450   |
| 1.3627        | 30.0  | 17700 | 1.0913          | 0.7456   |
| 1.3834        | 31.0  | 18290 | 1.1301          | 0.7113   |
| 1.3657        | 32.0  | 18880 | 1.2116          | 0.7560   |
| 1.373         | 33.0  | 19470 | 1.0198          | 0.7339   |
| 1.3113        | 34.0  | 20060 | 1.1041          | 0.7563   |
| 1.3327        | 35.0  | 20650 | 0.9885          | 0.7446   |
| 1.3544        | 36.0  | 21240 | 1.2174          | 0.7508   |
| 1.3198        | 37.0  | 21830 | 1.0094          | 0.7498   |
| 1.3           | 38.0  | 22420 | 0.9895          | 0.7306   |
| 1.2688        | 39.0  | 23010 | 1.0118          | 0.7471   |
| 1.3101        | 40.0  | 23600 | 1.1384          | 0.7517   |
| 1.2849        | 41.0  | 24190 | 1.1154          | 0.7520   |
| 1.2455        | 42.0  | 24780 | 0.9685          | 0.7431   |
| 1.2155        | 43.0  | 25370 | 1.0038          | 0.7498   |
| 1.2078        | 44.0  | 25960 | 0.9498          | 0.7382   |
| 1.2362        | 45.0  | 26550 | 0.9510          | 0.7413   |
| 1.2271        | 46.0  | 27140 | 0.9461          | 0.7514   |
| 1.2351        | 47.0  | 27730 | 0.9943          | 0.7272   |
| 1.2383        | 48.0  | 28320 | 0.9020          | 0.7422   |
| 1.1625        | 49.0  | 28910 | 0.9276          | 0.7385   |
| 1.1711        | 50.0  | 29500 | 0.9250          | 0.7352   |
| 1.1454        | 51.0  | 30090 | 0.9967          | 0.7483   |
| 1.1319        | 52.0  | 30680 | 0.9347          | 0.7309   |
| 1.1622        | 53.0  | 31270 | 0.9274          | 0.7456   |
| 1.1189        | 54.0  | 31860 | 1.0497          | 0.7483   |
| 1.1265        | 55.0  | 32450 | 0.9079          | 0.7462   |
| 1.0948        | 56.0  | 33040 | 0.9022          | 0.7477   |
| 1.0921        | 57.0  | 33630 | 0.8855          | 0.7385   |
| 1.0819        | 58.0  | 34220 | 0.8766          | 0.7327   |
| 1.0894        | 59.0  | 34810 | 0.8820          | 0.7462   |
| 1.0512        | 60.0  | 35400 | 0.8711          | 0.7428   |
| 1.075         | 61.0  | 35990 | 0.8970          | 0.7336   |
| 1.0505        | 62.0  | 36580 | 0.8912          | 0.7401   |
| 1.0612        | 63.0  | 37170 | 0.8774          | 0.7428   |
| 1.0458        | 64.0  | 37760 | 0.8675          | 0.7532   |
| 1.043         | 65.0  | 38350 | 1.0193          | 0.7554   |
| 1.1037        | 66.0  | 38940 | 0.8751          | 0.7367   |
| 1.0246        | 67.0  | 39530 | 0.8489          | 0.7514   |
| 1.0428        | 68.0  | 40120 | 0.8590          | 0.7373   |
| 1.0486        | 69.0  | 40710 | 0.8615          | 0.7514   |
| 1.0103        | 70.0  | 41300 | 0.9673          | 0.7596   |
| 1.0363        | 71.0  | 41890 | 0.8328          | 0.7440   |
| 1.0077        | 72.0  | 42480 | 0.8548          | 0.7489   |
| 1.0046        | 73.0  | 43070 | 0.9124          | 0.7407   |
| 0.9814        | 74.0  | 43660 | 0.8423          | 0.7508   |
| 0.9962        | 75.0  | 44250 | 1.0146          | 0.7532   |
| 0.9867        | 76.0  | 44840 | 0.8612          | 0.7517   |
| 0.9623        | 77.0  | 45430 | 0.8438          | 0.7563   |
| 0.9448        | 78.0  | 46020 | 0.8514          | 0.7505   |
| 0.961         | 79.0  | 46610 | 0.9149          | 0.7566   |
| 0.9521        | 80.0  | 47200 | 0.8576          | 0.7560   |
| 0.9835        | 81.0  | 47790 | 0.8314          | 0.7498   |
| 0.9777        | 82.0  | 48380 | 0.8524          | 0.7572   |
| 0.9259        | 83.0  | 48970 | 0.8440          | 0.7529   |
| 0.9246        | 84.0  | 49560 | 0.8429          | 0.7557   |
| 0.9222        | 85.0  | 50150 | 0.8880          | 0.7563   |
| 0.9152        | 86.0  | 50740 | 0.8348          | 0.7587   |
| 0.9218        | 87.0  | 51330 | 0.8254          | 0.7538   |
| 0.9379        | 88.0  | 51920 | 0.8099          | 0.7514   |
| 0.9387        | 89.0  | 52510 | 0.8407          | 0.7575   |
| 0.9154        | 90.0  | 53100 | 0.8735          | 0.7575   |
| 0.9331        | 91.0  | 53690 | 0.8920          | 0.7593   |
| 0.892         | 92.0  | 54280 | 0.8117          | 0.7566   |
| 0.9002        | 93.0  | 54870 | 0.8450          | 0.7569   |
| 0.9134        | 94.0  | 55460 | 0.7989          | 0.7569   |
| 0.8965        | 95.0  | 56050 | 0.8088          | 0.7541   |
| 0.8834        | 96.0  | 56640 | 0.8058          | 0.7529   |
| 0.9075        | 97.0  | 57230 | 0.8254          | 0.7557   |
| 0.8821        | 98.0  | 57820 | 0.8172          | 0.7547   |
| 0.9119        | 99.0  | 58410 | 0.8069          | 0.7550   |
| 0.9082        | 100.0 | 59000 | 0.8066          | 0.7550   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
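
To reproduce this environment, a pinned install along these lines should match; note that the exact CUDA 11.7 build of PyTorch comes from the PyTorch wheel index rather than plain PyPI.

```bash
pip install transformers==4.30.0 datasets==2.14.4 tokenizers==0.13.3
pip install torch==2.0.1 --index-url https://download.pytorch.org/whl/cu117
```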