
# 1_7e-3_5_0.1

This model is a fine-tuned version of [bert-large-uncased](https://huggingface.co/bert-large-uncased) on the [super_glue](https://huggingface.co/datasets/super_glue) dataset. It achieves the following results on the evaluation set:

- Loss: 0.9635
- Accuracy: 0.7382
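
The card does not say which SuperGLUE task the model was fine-tuned on or what its label mapping is, so the sketch below is only a hedged illustration of how a checkpoint like this is typically loaded; the sentence pair and the assumption of a sequence-classification head are placeholders, not documented facts about this model.

```python
# Minimal inference sketch. Assumes the checkpoint carries a sequence-
# classification head; the input pair and label semantics are hypothetical.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/1_7e-3_5_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer(
    "The cat sat on the mat.",   # first sequence
    "A cat is on a mat.",        # second sequence, if the task is pair classification
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label index
```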

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.007
- train_batch_size: 16
- eval_batch_size: 8
- seed: 11
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100.0
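
The training script itself is not published; the following is a sketch of a `TrainingArguments` configuration that mirrors the listed hyperparameters. The output directory and the per-epoch evaluation schedule (the results table reports one evaluation every 590 steps, i.e. once per epoch) are assumptions.

```python
# Hedged reconstruction of the listed hyperparameters as TrainingArguments
# (Transformers 4.30). Dataset, task, and metric wiring are not shown.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_7e-3_5_0.1",    # hypothetical output directory
    learning_rate=7e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,               # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",  # assumed: the table shows one eval per epoch
)
```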

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 1.4151 | 1.0 | 590 | 1.1624 | 0.6217 |
| 1.4557 | 2.0 | 1180 | 0.9521 | 0.4489 |
| 1.2723 | 3.0 | 1770 | 3.3480 | 0.3795 |
| 1.1923 | 4.0 | 2360 | 1.0321 | 0.4761 |
| 1.2283 | 5.0 | 2950 | 1.7063 | 0.6217 |
| 1.0486 | 6.0 | 3540 | 0.8079 | 0.6566 |
| 0.983 | 7.0 | 4130 | 2.7141 | 0.4119 |
| 1.061 | 8.0 | 4720 | 1.2305 | 0.6407 |
| 0.9617 | 9.0 | 5310 | 0.9103 | 0.6654 |
| 0.9218 | 10.0 | 5900 | 1.0764 | 0.5728 |
| 0.8804 | 11.0 | 6490 | 0.7290 | 0.7034 |
| 0.8314 | 12.0 | 7080 | 0.7770 | 0.7080 |
| 0.7805 | 13.0 | 7670 | 0.7321 | 0.7165 |
| 0.7474 | 14.0 | 8260 | 0.7924 | 0.6667 |
| 0.7693 | 15.0 | 8850 | 0.8842 | 0.7150 |
| 0.7532 | 16.0 | 9440 | 0.6981 | 0.7174 |
| 0.6803 | 17.0 | 10030 | 1.2782 | 0.6064 |
| 0.6888 | 18.0 | 10620 | 0.9639 | 0.7061 |
| 0.6432 | 19.0 | 11210 | 0.8320 | 0.7174 |
| 0.6091 | 20.0 | 11800 | 0.8192 | 0.7144 |
| 0.5904 | 21.0 | 12390 | 1.0849 | 0.7089 |
| 0.5754 | 22.0 | 12980 | 0.8291 | 0.6823 |
| 0.539 | 23.0 | 13570 | 1.1292 | 0.7128 |
| 0.525 | 24.0 | 14160 | 0.8724 | 0.6942 |
| 0.5346 | 25.0 | 14750 | 0.8999 | 0.7067 |
| 0.5164 | 26.0 | 15340 | 1.5764 | 0.5832 |
| 0.4874 | 27.0 | 15930 | 1.1817 | 0.6581 |
| 0.439 | 28.0 | 16520 | 1.0572 | 0.6719 |
| 0.4388 | 29.0 | 17110 | 0.9059 | 0.7376 |
| 0.4096 | 30.0 | 17700 | 0.8708 | 0.7028 |
| 0.4117 | 31.0 | 18290 | 0.9059 | 0.7379 |
| 0.401 | 32.0 | 18880 | 0.8226 | 0.7303 |
| 0.3763 | 33.0 | 19470 | 0.8717 | 0.7248 |
| 0.3629 | 34.0 | 20060 | 0.9393 | 0.7046 |
| 0.33 | 35.0 | 20650 | 0.8766 | 0.7248 |
| 0.3598 | 36.0 | 21240 | 1.0561 | 0.7315 |
| 0.3211 | 37.0 | 21830 | 0.9181 | 0.7021 |
| 0.3146 | 38.0 | 22420 | 0.8177 | 0.7303 |
| 0.322 | 39.0 | 23010 | 0.9637 | 0.7336 |
| 0.2963 | 40.0 | 23600 | 1.0769 | 0.7128 |
| 0.3265 | 41.0 | 24190 | 1.0980 | 0.7330 |
| 0.276 | 42.0 | 24780 | 0.8939 | 0.7422 |
| 0.2953 | 43.0 | 25370 | 1.0178 | 0.7303 |
| 0.2669 | 44.0 | 25960 | 1.0061 | 0.7150 |
| 0.2613 | 45.0 | 26550 | 1.0087 | 0.7076 |
| 0.257 | 46.0 | 27140 | 0.8887 | 0.7122 |
| 0.2586 | 47.0 | 27730 | 1.0173 | 0.7327 |
| 0.2492 | 48.0 | 28320 | 1.0005 | 0.7324 |
| 0.2572 | 49.0 | 28910 | 0.9586 | 0.7226 |
| 0.2388 | 50.0 | 29500 | 0.9336 | 0.7318 |
| 0.218 | 51.0 | 30090 | 1.0072 | 0.7220 |
| 0.2353 | 52.0 | 30680 | 0.8747 | 0.7343 |
| 0.2252 | 53.0 | 31270 | 0.9927 | 0.7361 |
| 0.2239 | 54.0 | 31860 | 0.9873 | 0.7281 |
| 0.2289 | 55.0 | 32450 | 1.0668 | 0.7098 |
| 0.2108 | 56.0 | 33040 | 0.8821 | 0.7306 |
| 0.197 | 57.0 | 33630 | 0.9667 | 0.7287 |
| 0.2045 | 58.0 | 34220 | 0.8937 | 0.7294 |
| 0.2092 | 59.0 | 34810 | 1.1175 | 0.7110 |
| 0.2115 | 60.0 | 35400 | 1.0294 | 0.7330 |
| 0.2051 | 61.0 | 35990 | 0.9363 | 0.7349 |
| 0.1947 | 62.0 | 36580 | 0.9427 | 0.7278 |
| 0.1918 | 63.0 | 37170 | 1.0344 | 0.7226 |
| 0.1911 | 64.0 | 37760 | 0.9883 | 0.7324 |
| 0.1875 | 65.0 | 38350 | 0.9878 | 0.7281 |
| 0.181 | 66.0 | 38940 | 1.0037 | 0.7306 |
| 0.1844 | 67.0 | 39530 | 1.0300 | 0.7309 |
| 0.172 | 68.0 | 40120 | 0.9785 | 0.7275 |
| 0.1728 | 69.0 | 40710 | 1.0590 | 0.7413 |
| 0.1756 | 70.0 | 41300 | 0.9992 | 0.7248 |
| 0.1671 | 71.0 | 41890 | 1.0583 | 0.7061 |
| 0.1824 | 72.0 | 42480 | 1.0114 | 0.7361 |
| 0.1638 | 73.0 | 43070 | 0.9866 | 0.7266 |
| 0.159 | 74.0 | 43660 | 1.0436 | 0.7242 |
| 0.168 | 75.0 | 44250 | 1.0963 | 0.7364 |
| 0.1637 | 76.0 | 44840 | 0.9260 | 0.7300 |
| 0.1583 | 77.0 | 45430 | 0.9472 | 0.7309 |
| 0.161 | 78.0 | 46020 | 0.9540 | 0.7300 |
| 0.1485 | 79.0 | 46610 | 0.9537 | 0.7294 |
| 0.1566 | 80.0 | 47200 | 1.0064 | 0.7248 |
| 0.1499 | 81.0 | 47790 | 0.9961 | 0.7358 |
| 0.1529 | 82.0 | 48380 | 0.9872 | 0.7410 |
| 0.1545 | 83.0 | 48970 | 1.0003 | 0.7309 |
| 0.1481 | 84.0 | 49560 | 0.9471 | 0.7349 |
| 0.1492 | 85.0 | 50150 | 0.9946 | 0.7235 |
| 0.1402 | 86.0 | 50740 | 1.0070 | 0.7394 |
| 0.1437 | 87.0 | 51330 | 0.9976 | 0.7379 |
| 0.1368 | 88.0 | 51920 | 0.9900 | 0.7355 |
| 0.1394 | 89.0 | 52510 | 1.0081 | 0.7333 |
| 0.1376 | 90.0 | 53100 | 0.9910 | 0.7349 |
| 0.1402 | 91.0 | 53690 | 0.9569 | 0.7358 |
| 0.1397 | 92.0 | 54280 | 0.9660 | 0.7346 |
| 0.1311 | 93.0 | 54870 | 0.9787 | 0.7291 |
| 0.1389 | 94.0 | 55460 | 0.9653 | 0.7343 |
| 0.1315 | 95.0 | 56050 | 0.9494 | 0.7346 |
| 0.1301 | 96.0 | 56640 | 0.9705 | 0.7333 |
| 0.133 | 97.0 | 57230 | 0.9615 | 0.7355 |
| 0.1293 | 98.0 | 57820 | 0.9686 | 0.7312 |
| 0.1332 | 99.0 | 58410 | 0.9759 | 0.7346 |
| 0.1306 | 100.0 | 59000 | 0.9635 | 0.7382 |

### Framework versions

- Transformers 4.30.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.4
- Tokenizers 0.13.3