1_5e-3_5_0.9

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0161
  • Accuracy: 0.7254
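
Since the card does not yet document usage, here is a minimal, hedged loading sketch. It assumes the checkpoint carries a sequence-classification head; the exact SuperGLUE sub-task is not stated in this card, so the paired input strings below are placeholders only.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Onutoa/1_5e-3_5_0.9"  # repo id from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Placeholder text pair; the real field layout depends on the SuperGLUE sub-task.
inputs = tokenizer(
    "Is the sky blue?",                      # e.g. question
    "On a clear day the sky appears blue.",  # e.g. passage
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label id
```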

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
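
For reference, these values map directly onto `transformers.TrainingArguments` fields, as in the sketch below. The `output_dir` is an assumption (taken from the repo name), and any field not listed above is left at its library default.

```python
from transformers import TrainingArguments

# Hedged mapping of the hyperparameters above onto TrainingArguments.
# output_dir is assumed; unlisted fields keep their defaults.
training_args = TrainingArguments(
    output_dir="1_5e-3_5_0.9",   # assumption: repo name as output directory
    learning_rate=5e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,              # optimizer: Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,           # and epsilon=1e-08
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
)
```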

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 3.6534 | 1.0 | 590 | 2.9136 | 0.6217 |
| 3.1534 | 2.0 | 1180 | 2.7899 | 0.5896 |
| 3.1737 | 3.0 | 1770 | 4.1075 | 0.4003 |
| 3.108 | 4.0 | 2360 | 2.7570 | 0.6263 |
| 2.796 | 5.0 | 2950 | 2.8853 | 0.6122 |
| 2.6961 | 6.0 | 3540 | 2.6092 | 0.6083 |
| 2.7012 | 7.0 | 4130 | 5.4272 | 0.3899 |
| 2.5904 | 8.0 | 4720 | 2.6163 | 0.6110 |
| 2.6187 | 9.0 | 5310 | 2.4947 | 0.6440 |
| 2.4748 | 10.0 | 5900 | 3.1599 | 0.6343 |
| 2.4977 | 11.0 | 6490 | 2.4600 | 0.6358 |
| 2.4255 | 12.0 | 7080 | 2.3595 | 0.6165 |
| 2.371 | 13.0 | 7670 | 2.2762 | 0.6505 |
| 2.3482 | 14.0 | 8260 | 2.3764 | 0.6572 |
| 2.3162 | 15.0 | 8850 | 2.1363 | 0.6489 |
| 2.1908 | 16.0 | 9440 | 3.3056 | 0.6407 |
| 2.0964 | 17.0 | 10030 | 2.3744 | 0.6489 |
| 2.063 | 18.0 | 10620 | 2.3019 | 0.6021 |
| 2.0119 | 19.0 | 11210 | 2.0892 | 0.6734 |
| 2.0429 | 20.0 | 11800 | 2.5552 | 0.6596 |
| 1.9324 | 21.0 | 12390 | 2.0537 | 0.6694 |
| 1.9379 | 22.0 | 12980 | 1.9183 | 0.6801 |
| 1.9294 | 23.0 | 13570 | 1.8407 | 0.6774 |
| 1.8366 | 24.0 | 14160 | 1.8770 | 0.6872 |
| 1.809 | 25.0 | 14750 | 2.0356 | 0.6761 |
| 1.804 | 26.0 | 15340 | 1.6646 | 0.6801 |
| 1.8059 | 27.0 | 15930 | 1.6864 | 0.6780 |
| 1.7665 | 28.0 | 16520 | 1.6191 | 0.6813 |
| 1.7034 | 29.0 | 17110 | 1.8237 | 0.6477 |
| 1.663 | 30.0 | 17700 | 1.5530 | 0.6911 |
| 1.619 | 31.0 | 18290 | 1.5786 | 0.6884 |
| 1.5861 | 32.0 | 18880 | 2.2685 | 0.6746 |
| 1.5504 | 33.0 | 19470 | 1.6077 | 0.6624 |
| 1.5419 | 34.0 | 20060 | 1.4337 | 0.6976 |
| 1.5614 | 35.0 | 20650 | 1.5165 | 0.6969 |
| 1.5039 | 36.0 | 21240 | 1.8150 | 0.6972 |
| 1.4848 | 37.0 | 21830 | 1.3947 | 0.7006 |
| 1.4697 | 38.0 | 22420 | 1.5730 | 0.6709 |
| 1.3728 | 39.0 | 23010 | 1.5815 | 0.7021 |
| 1.4163 | 40.0 | 23600 | 1.3688 | 0.7125 |
| 1.3908 | 41.0 | 24190 | 1.5884 | 0.7006 |
| 1.3566 | 42.0 | 24780 | 1.3154 | 0.7180 |
| 1.3155 | 43.0 | 25370 | 1.2954 | 0.7138 |
| 1.3059 | 44.0 | 25960 | 1.2546 | 0.7116 |
| 1.2942 | 45.0 | 26550 | 1.4254 | 0.7092 |
| 1.2492 | 46.0 | 27140 | 1.2366 | 0.7180 |
| 1.2493 | 47.0 | 27730 | 1.2187 | 0.7095 |
| 1.202 | 48.0 | 28320 | 1.2318 | 0.7183 |
| 1.2327 | 49.0 | 28910 | 1.4508 | 0.7083 |
| 1.215 | 50.0 | 29500 | 1.2490 | 0.7205 |
| 1.1485 | 51.0 | 30090 | 1.3040 | 0.7147 |
| 1.157 | 52.0 | 30680 | 1.1436 | 0.7180 |
| 1.1302 | 53.0 | 31270 | 1.1814 | 0.7147 |
| 1.1111 | 54.0 | 31860 | 1.3464 | 0.7150 |
| 1.1422 | 55.0 | 32450 | 1.3631 | 0.7144 |
| 1.0891 | 56.0 | 33040 | 1.1418 | 0.7214 |
| 1.0652 | 57.0 | 33630 | 1.2196 | 0.7202 |
| 1.0556 | 58.0 | 34220 | 1.2335 | 0.7235 |
| 1.0672 | 59.0 | 34810 | 1.1583 | 0.7128 |
| 1.0613 | 60.0 | 35400 | 1.1927 | 0.7061 |
| 1.0069 | 61.0 | 35990 | 1.0860 | 0.7226 |
| 1.0483 | 62.0 | 36580 | 1.1060 | 0.7245 |
| 1.0051 | 63.0 | 37170 | 1.1095 | 0.7150 |
| 0.9834 | 64.0 | 37760 | 1.0793 | 0.7196 |
| 0.9801 | 65.0 | 38350 | 1.1033 | 0.7196 |
| 0.9647 | 66.0 | 38940 | 1.0704 | 0.7214 |
| 0.9384 | 67.0 | 39530 | 1.0795 | 0.7196 |
| 0.9791 | 68.0 | 40120 | 1.1657 | 0.7245 |
| 0.9309 | 69.0 | 40710 | 1.1983 | 0.7263 |
| 0.9602 | 70.0 | 41300 | 1.1575 | 0.7284 |
| 0.9462 | 71.0 | 41890 | 1.0949 | 0.7165 |
| 0.9473 | 72.0 | 42480 | 1.1855 | 0.7266 |
| 0.9047 | 73.0 | 43070 | 1.1378 | 0.7266 |
| 0.8996 | 74.0 | 43660 | 1.0339 | 0.7226 |
| 0.9248 | 75.0 | 44250 | 1.1656 | 0.7309 |
| 0.9075 | 76.0 | 44840 | 1.0272 | 0.7208 |
| 0.9062 | 77.0 | 45430 | 1.1646 | 0.7327 |
| 0.8987 | 78.0 | 46020 | 1.0606 | 0.7202 |
| 0.8831 | 79.0 | 46610 | 1.0543 | 0.7291 |
| 0.8655 | 80.0 | 47200 | 1.0785 | 0.7312 |
| 0.8629 | 81.0 | 47790 | 1.0745 | 0.7284 |
| 0.8733 | 82.0 | 48380 | 1.0734 | 0.7242 |
| 0.8796 | 83.0 | 48970 | 1.0343 | 0.7266 |
| 0.8313 | 84.0 | 49560 | 1.0709 | 0.7294 |
| 0.835 | 85.0 | 50150 | 1.0230 | 0.7266 |
| 0.8425 | 86.0 | 50740 | 1.0049 | 0.7235 |
| 0.8486 | 87.0 | 51330 | 1.0971 | 0.7278 |
| 0.8361 | 88.0 | 51920 | 1.0212 | 0.7226 |
| 0.8171 | 89.0 | 52510 | 1.1451 | 0.7287 |
| 0.7994 | 90.0 | 53100 | 1.0329 | 0.7315 |
| 0.8268 | 91.0 | 53690 | 1.0968 | 0.7346 |
| 0.8289 | 92.0 | 54280 | 1.0031 | 0.7223 |
| 0.8082 | 93.0 | 54870 | 1.0499 | 0.7278 |
| 0.8188 | 94.0 | 55460 | 1.0121 | 0.7235 |
| 0.82 | 95.0 | 56050 | 1.0232 | 0.7242 |
| 0.8028 | 96.0 | 56640 | 1.0279 | 0.7229 |
| 0.7891 | 97.0 | 57230 | 1.0091 | 0.7260 |
| 0.7771 | 98.0 | 57820 | 1.0230 | 0.7248 |
| 0.7652 | 99.0 | 58410 | 1.0248 | 0.7257 |
| 0.7874 | 100.0 | 59000 | 1.0161 | 0.7254 |
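
The final row of the table matches the headline numbers (loss 1.0161, accuracy 0.7254). As a rough way to re-run that evaluation, here is a hedged sketch using the Trainer API. The SuperGLUE sub-task is not named in this card, so the "boolq" configuration and its question/passage fields below are assumptions; substitute the configuration actually used.

```python
import numpy as np
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer

model_id = "Onutoa/1_5e-3_5_0.9"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Assumption: the BoolQ sub-task; swap in the correct configuration.
dataset = load_dataset("super_glue", "boolq", split="validation")

def preprocess(batch):
    # BoolQ pairs a question with a passage; other sub-tasks use other fields.
    return tokenizer(batch["question"], batch["passage"], truncation=True)

encoded = dataset.map(preprocess, batched=True)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": (np.argmax(logits, axis=-1) == labels).mean()}

trainer = Trainer(model=model, tokenizer=tokenizer, compute_metrics=compute_metrics)
print(trainer.evaluate(encoded))  # reports eval_loss and eval_accuracy
```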

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3