
1_9e-3_1_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set (a usage sketch follows these results):

  • Loss: 0.8873
  • Accuracy: 0.7443
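
The card does not yet include a usage example, so here is a minimal inference sketch. It assumes the checkpoint is hosted on the Hub as Onutoa/1_9e-3_1_0.1 and carries a sequence-classification head; the exact SuperGLUE subtask (and therefore the expected input format) is not stated in this card, so the sentence pair below is purely illustrative.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Repo id as listed on the Hub page; the classification head is an assumption.
model_id = "Onutoa/1_9e-3_1_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Illustrative input only: the actual SuperGLUE subtask is not documented here.
inputs = tokenizer("Example premise.", "Example hypothesis.", return_tensors="pt")
logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```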

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.009
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
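
As a point of reference, the settings above map onto transformers.TrainingArguments (v4.30 API) roughly as follows; the output directory and the per-epoch evaluation strategy are assumptions (the latter inferred from the per-epoch results table below), not values stated in the card.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_9e-3_1_0.1",       # assumed; not stated in the card
    learning_rate=9e-3,              # 0.009
    per_device_train_batch_size=16,  # train_batch_size above
    per_device_eval_batch_size=8,    # eval_batch_size above
    seed=11,
    adam_beta1=0.9,                  # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",     # assumed from the per-epoch results table
)
```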

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 1.2619 | 1.0 | 590 | 0.6470 | 0.6217 |
| 1.0747 | 2.0 | 1180 | 0.6993 | 0.4211 |
| 0.8969 | 3.0 | 1770 | 0.6604 | 0.5719 |
| 0.8368 | 4.0 | 2360 | 0.7051 | 0.5043 |
| 0.8124 | 5.0 | 2950 | 0.7117 | 0.6294 |
| 0.7078 | 6.0 | 3540 | 0.6893 | 0.6557 |
| 0.6885 | 7.0 | 4130 | 1.0081 | 0.4541 |
| 0.648 | 8.0 | 4720 | 0.5951 | 0.6951 |
| 0.6353 | 9.0 | 5310 | 0.6077 | 0.6624 |
| 0.6037 | 10.0 | 5900 | 0.5867 | 0.6920 |
| 0.5823 | 11.0 | 6490 | 0.5554 | 0.7024 |
| 0.5648 | 12.0 | 7080 | 0.5959 | 0.6602 |
| 0.5628 | 13.0 | 7670 | 0.5532 | 0.6966 |
| 0.5323 | 14.0 | 8260 | 0.5416 | 0.7107 |
| 0.5218 | 15.0 | 8850 | 0.5633 | 0.6969 |
| 0.505 | 16.0 | 9440 | 0.5292 | 0.7110 |
| 0.4968 | 17.0 | 10030 | 0.5375 | 0.7235 |
| 0.4821 | 18.0 | 10620 | 0.6966 | 0.6667 |
| 0.4692 | 19.0 | 11210 | 0.5588 | 0.7254 |
| 0.4651 | 20.0 | 11800 | 0.5620 | 0.7177 |
| 0.4215 | 21.0 | 12390 | 0.5768 | 0.7306 |
| 0.4361 | 22.0 | 12980 | 0.5720 | 0.7278 |
| 0.4138 | 23.0 | 13570 | 0.6098 | 0.7321 |
| 0.3883 | 24.0 | 14160 | 0.5691 | 0.7315 |
| 0.3852 | 25.0 | 14750 | 0.5940 | 0.7315 |
| 0.3691 | 26.0 | 15340 | 0.7810 | 0.6657 |
| 0.3689 | 27.0 | 15930 | 0.6396 | 0.7220 |
| 0.3413 | 28.0 | 16520 | 0.6304 | 0.7385 |
| 0.3333 | 29.0 | 17110 | 0.6135 | 0.7343 |
| 0.3259 | 30.0 | 17700 | 0.6418 | 0.7242 |
| 0.3049 | 31.0 | 18290 | 0.6385 | 0.7327 |
| 0.3203 | 32.0 | 18880 | 0.7961 | 0.7275 |
| 0.2978 | 33.0 | 19470 | 0.6375 | 0.7260 |
| 0.2831 | 34.0 | 20060 | 0.7307 | 0.7116 |
| 0.2782 | 35.0 | 20650 | 0.7057 | 0.7422 |
| 0.2668 | 36.0 | 21240 | 0.6802 | 0.7391 |
| 0.2673 | 37.0 | 21830 | 0.7305 | 0.7260 |
| 0.2478 | 38.0 | 22420 | 0.7019 | 0.7367 |
| 0.2481 | 39.0 | 23010 | 0.7238 | 0.7465 |
| 0.2406 | 40.0 | 23600 | 0.8325 | 0.7300 |
| 0.2344 | 41.0 | 24190 | 0.8143 | 0.7367 |
| 0.2151 | 42.0 | 24780 | 0.8423 | 0.7413 |
| 0.226 | 43.0 | 25370 | 0.7901 | 0.7343 |
| 0.2141 | 44.0 | 25960 | 0.8760 | 0.7355 |
| 0.2062 | 45.0 | 26550 | 0.8387 | 0.7416 |
| 0.192 | 46.0 | 27140 | 0.7825 | 0.7413 |
| 0.2045 | 47.0 | 27730 | 0.8157 | 0.7211 |
| 0.1922 | 48.0 | 28320 | 0.8735 | 0.7190 |
| 0.1967 | 49.0 | 28910 | 0.7669 | 0.7416 |
| 0.1814 | 50.0 | 29500 | 0.7925 | 0.7401 |
| 0.1814 | 51.0 | 30090 | 0.8249 | 0.7367 |
| 0.1721 | 52.0 | 30680 | 0.8772 | 0.7352 |
| 0.1607 | 53.0 | 31270 | 0.8614 | 0.7355 |
| 0.162 | 54.0 | 31860 | 0.8165 | 0.7376 |
| 0.1745 | 55.0 | 32450 | 0.8330 | 0.7287 |
| 0.1644 | 56.0 | 33040 | 0.8343 | 0.7370 |
| 0.1478 | 57.0 | 33630 | 0.8965 | 0.7318 |
| 0.1571 | 58.0 | 34220 | 0.9214 | 0.7232 |
| 0.1506 | 59.0 | 34810 | 0.9052 | 0.7401 |
| 0.1469 | 60.0 | 35400 | 0.8536 | 0.7428 |
| 0.1472 | 61.0 | 35990 | 0.8885 | 0.7309 |
| 0.1408 | 62.0 | 36580 | 0.8733 | 0.7413 |
| 0.1356 | 63.0 | 37170 | 0.9329 | 0.7214 |
| 0.1445 | 64.0 | 37760 | 0.8954 | 0.7480 |
| 0.1398 | 65.0 | 38350 | 0.8575 | 0.7391 |
| 0.1389 | 66.0 | 38940 | 0.8679 | 0.7422 |
| 0.1278 | 67.0 | 39530 | 0.9074 | 0.7446 |
| 0.1337 | 68.0 | 40120 | 0.8901 | 0.7346 |
| 0.123 | 69.0 | 40710 | 0.9254 | 0.7453 |
| 0.1362 | 70.0 | 41300 | 0.8586 | 0.7388 |
| 0.1214 | 71.0 | 41890 | 0.9126 | 0.7321 |
| 0.1245 | 72.0 | 42480 | 0.8943 | 0.7394 |
| 0.1142 | 73.0 | 43070 | 0.9241 | 0.7349 |
| 0.1227 | 74.0 | 43660 | 0.9128 | 0.7391 |
| 0.1121 | 75.0 | 44250 | 0.8904 | 0.7373 |
| 0.1172 | 76.0 | 44840 | 0.9219 | 0.7404 |
| 0.1122 | 77.0 | 45430 | 0.9410 | 0.7486 |
| 0.1047 | 78.0 | 46020 | 0.8903 | 0.7379 |
| 0.1088 | 79.0 | 46610 | 0.9508 | 0.7330 |
| 0.1076 | 80.0 | 47200 | 0.8921 | 0.7416 |
| 0.0986 | 81.0 | 47790 | 0.8941 | 0.7327 |
| 0.1037 | 82.0 | 48380 | 0.9029 | 0.7343 |
| 0.0983 | 83.0 | 48970 | 0.8863 | 0.7370 |
| 0.104 | 84.0 | 49560 | 0.8850 | 0.7361 |
| 0.0996 | 85.0 | 50150 | 0.9146 | 0.7453 |
| 0.0994 | 86.0 | 50740 | 0.8958 | 0.7355 |
| 0.0905 | 87.0 | 51330 | 0.8989 | 0.7474 |
| 0.0953 | 88.0 | 51920 | 0.9067 | 0.7422 |
| 0.0952 | 89.0 | 52510 | 0.9108 | 0.7410 |
| 0.0947 | 90.0 | 53100 | 0.9015 | 0.7382 |
| 0.09 | 91.0 | 53690 | 0.8984 | 0.7431 |
| 0.0936 | 92.0 | 54280 | 0.8893 | 0.7339 |
| 0.0908 | 93.0 | 54870 | 0.8919 | 0.7367 |
| 0.0872 | 94.0 | 55460 | 0.9024 | 0.7450 |
| 0.0847 | 95.0 | 56050 | 0.9029 | 0.7364 |
| 0.0901 | 96.0 | 56640 | 0.9023 | 0.7385 |
| 0.085 | 97.0 | 57230 | 0.8978 | 0.7370 |
| 0.0852 | 98.0 | 57820 | 0.8812 | 0.7413 |
| 0.0887 | 99.0 | 58410 | 0.8885 | 0.7385 |
| 0.0855 | 100.0 | 59000 | 0.8873 | 0.7443 |

Framework versions

  • Transformers 4.30.0
  • PyTorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
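
For anyone reproducing the run, a quick environment check against these versions (a sketch; pinning the same versions with pip works just as well):

```python
# Verify the installed versions match the card (the cu117 suffix on the
# torch build is checked loosely, since CPU-only installs report plain 2.0.1).
import datasets
import tokenizers
import torch
import transformers

assert transformers.__version__ == "4.30.0"
assert torch.__version__.startswith("2.0.1")
assert datasets.__version__ == "2.14.4"
assert tokenizers.__version__ == "0.13.3"
```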