1_7e-3_10_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 0.9819
  • Accuracy: 0.7303
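
The card does not state which SuperGLUE task this checkpoint targets, so the label meanings are unknown. Below is a minimal usage sketch, assuming a sentence-pair classification task and the checkpoint id shown on this page (Onutoa/1_7e-3_10_0.1); the example inputs and the interpretation of the predicted class index are illustrative only.

```python
# Minimal usage sketch. Assumption: the fine-tuning task is a SuperGLUE
# sentence-pair classification task; the card does not say which one, so the
# inputs below and the meaning of the predicted class are illustrative only.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/1_7e-3_10_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Most SuperGLUE tasks pair two text segments (e.g. question and passage).
inputs = tokenizer(
    "Is the sky blue?",
    "The sky appears blue because of Rayleigh scattering.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```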

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 0.007
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
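
The training script itself is not included in the card. As a hedged sketch, assuming the standard Hugging Face Trainer was used, the listed values map onto the Transformers 4.30.0 TrainingArguments roughly as follows; the per-epoch evaluation cadence is inferred from the results table below, and dataset loading, preprocessing, and Trainer wiring are omitted because the card does not specify them.

```python
# Hedged sketch: reconstructs only the hyperparameters reported above as
# TrainingArguments (Transformers 4.30.0 API). Output directory name is taken
# from the model id; evaluation/logging cadence is an assumption inferred from
# the per-epoch results table.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_7e-3_10_0.1",
    learning_rate=7e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    num_train_epochs=100.0,
    lr_scheduler_type="linear",
    adam_beta1=0.9,               # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",  # assumption: matches the per-epoch table
    logging_strategy="epoch",     # assumption
)
```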

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 1.4686 | 1.0 | 590 | 2.1510 | 0.3798 |
| 1.4409 | 2.0 | 1180 | 1.6620 | 0.6214 |
| 1.3336 | 3.0 | 1770 | 2.9692 | 0.3789 |
| 1.3331 | 4.0 | 2360 | 0.9502 | 0.6306 |
| 1.1121 | 5.0 | 2950 | 1.0075 | 0.6294 |
| 1.1211 | 6.0 | 3540 | 0.8872 | 0.6612 |
| 1.0596 | 7.0 | 4130 | 2.2995 | 0.4128 |
| 0.9931 | 8.0 | 4720 | 0.9438 | 0.6810 |
| 0.9235 | 9.0 | 5310 | 0.8872 | 0.6581 |
| 0.9613 | 10.0 | 5900 | 1.2425 | 0.5847 |
| 0.9177 | 11.0 | 6490 | 0.8943 | 0.6862 |
| 0.7985 | 12.0 | 7080 | 0.8038 | 0.6884 |
| 0.7943 | 13.0 | 7670 | 0.8016 | 0.6924 |
| 0.7742 | 14.0 | 8260 | 0.7611 | 0.7162 |
| 0.7373 | 15.0 | 8850 | 0.8728 | 0.7128 |
| 0.7054 | 16.0 | 9440 | 0.7415 | 0.7116 |
| 0.6589 | 17.0 | 10030 | 0.7437 | 0.7070 |
| 0.6449 | 18.0 | 10620 | 1.1703 | 0.6303 |
| 0.5872 | 19.0 | 11210 | 0.7583 | 0.7217 |
| 0.6065 | 20.0 | 11800 | 0.8280 | 0.7196 |
| 0.5721 | 21.0 | 12390 | 0.8555 | 0.7012 |
| 0.5955 | 22.0 | 12980 | 0.8109 | 0.7147 |
| 0.5202 | 23.0 | 13570 | 0.7935 | 0.7245 |
| 0.5017 | 24.0 | 14160 | 0.8676 | 0.6976 |
| 0.4923 | 25.0 | 14750 | 0.9052 | 0.7346 |
| 0.4774 | 26.0 | 15340 | 1.5937 | 0.5976 |
| 0.4714 | 27.0 | 15930 | 0.8523 | 0.7220 |
| 0.4439 | 28.0 | 16520 | 0.8909 | 0.7278 |
| 0.4227 | 29.0 | 17110 | 0.9224 | 0.7321 |
| 0.4029 | 30.0 | 17700 | 0.8559 | 0.7245 |
| 0.4015 | 31.0 | 18290 | 0.9032 | 0.7309 |
| 0.3923 | 32.0 | 18880 | 0.9003 | 0.7327 |
| 0.3897 | 33.0 | 19470 | 0.9786 | 0.6966 |
| 0.354 | 34.0 | 20060 | 0.8606 | 0.7251 |
| 0.3508 | 35.0 | 20650 | 0.8788 | 0.7278 |
| 0.3293 | 36.0 | 21240 | 1.1236 | 0.7214 |
| 0.3336 | 37.0 | 21830 | 0.9196 | 0.7266 |
| 0.3407 | 38.0 | 22420 | 0.9319 | 0.7220 |
| 0.3338 | 39.0 | 23010 | 0.8982 | 0.7321 |
| 0.3065 | 40.0 | 23600 | 0.9969 | 0.7333 |
| 0.2972 | 41.0 | 24190 | 1.0879 | 0.7309 |
| 0.2904 | 42.0 | 24780 | 0.9547 | 0.7327 |
| 0.2883 | 43.0 | 25370 | 0.9553 | 0.7187 |
| 0.2889 | 44.0 | 25960 | 0.9805 | 0.7251 |
| 0.269 | 45.0 | 26550 | 0.9516 | 0.7321 |
| 0.2573 | 46.0 | 27140 | 0.9094 | 0.7242 |
| 0.2679 | 47.0 | 27730 | 0.9398 | 0.7217 |
| 0.2595 | 48.0 | 28320 | 1.0380 | 0.7064 |
| 0.2819 | 49.0 | 28910 | 0.9346 | 0.7324 |
| 0.247 | 50.0 | 29500 | 0.9272 | 0.7239 |
| 0.2482 | 51.0 | 30090 | 0.9673 | 0.7254 |
| 0.242 | 52.0 | 30680 | 1.0115 | 0.7217 |
| 0.2343 | 53.0 | 31270 | 0.9958 | 0.7226 |
| 0.2381 | 54.0 | 31860 | 0.9392 | 0.7263 |
| 0.2279 | 55.0 | 32450 | 0.9564 | 0.7284 |
| 0.2256 | 56.0 | 33040 | 1.0298 | 0.7239 |
| 0.2267 | 57.0 | 33630 | 1.0001 | 0.7263 |
| 0.2161 | 58.0 | 34220 | 0.9867 | 0.7248 |
| 0.214 | 59.0 | 34810 | 0.9574 | 0.7226 |
| 0.2148 | 60.0 | 35400 | 1.0306 | 0.7229 |
| 0.2128 | 61.0 | 35990 | 1.0751 | 0.7346 |
| 0.2081 | 62.0 | 36580 | 0.9656 | 0.7263 |
| 0.203 | 63.0 | 37170 | 1.0100 | 0.7263 |
| 0.204 | 64.0 | 37760 | 0.9536 | 0.7297 |
| 0.1988 | 65.0 | 38350 | 0.9686 | 0.7269 |
| 0.1976 | 66.0 | 38940 | 0.9927 | 0.7297 |
| 0.1943 | 67.0 | 39530 | 0.9987 | 0.7309 |
| 0.1941 | 68.0 | 40120 | 0.9876 | 0.7309 |
| 0.1862 | 69.0 | 40710 | 0.9646 | 0.7321 |
| 0.1986 | 70.0 | 41300 | 1.0332 | 0.7324 |
| 0.1872 | 71.0 | 41890 | 0.9861 | 0.7324 |
| 0.1898 | 72.0 | 42480 | 0.9831 | 0.7346 |
| 0.1793 | 73.0 | 43070 | 0.9901 | 0.7303 |
| 0.1843 | 74.0 | 43660 | 1.0411 | 0.7294 |
| 0.1757 | 75.0 | 44250 | 1.0355 | 0.7312 |
| 0.1814 | 76.0 | 44840 | 1.0320 | 0.7239 |
| 0.1764 | 77.0 | 45430 | 0.9895 | 0.7333 |
| 0.1779 | 78.0 | 46020 | 0.9944 | 0.7367 |
| 0.1752 | 79.0 | 46610 | 0.9581 | 0.7263 |
| 0.1734 | 80.0 | 47200 | 0.9525 | 0.7297 |
| 0.1718 | 81.0 | 47790 | 0.9693 | 0.7275 |
| 0.1722 | 82.0 | 48380 | 0.9876 | 0.7297 |
| 0.1719 | 83.0 | 48970 | 0.9838 | 0.7306 |
| 0.161 | 84.0 | 49560 | 0.9996 | 0.7281 |
| 0.1711 | 85.0 | 50150 | 0.9880 | 0.7291 |
| 0.1634 | 86.0 | 50740 | 1.0062 | 0.7306 |
| 0.1587 | 87.0 | 51330 | 1.0071 | 0.7318 |
| 0.156 | 88.0 | 51920 | 1.0271 | 0.7297 |
| 0.1574 | 89.0 | 52510 | 1.0062 | 0.7321 |
| 0.151 | 90.0 | 53100 | 0.9889 | 0.7263 |
| 0.1553 | 91.0 | 53690 | 0.9676 | 0.7324 |
| 0.1584 | 92.0 | 54280 | 0.9721 | 0.7321 |
| 0.1491 | 93.0 | 54870 | 0.9824 | 0.7349 |
| 0.1523 | 94.0 | 55460 | 0.9880 | 0.7306 |
| 0.1509 | 95.0 | 56050 | 0.9993 | 0.7327 |
| 0.1496 | 96.0 | 56640 | 0.9892 | 0.7318 |
| 0.1518 | 97.0 | 57230 | 0.9925 | 0.7339 |
| 0.149 | 98.0 | 57820 | 0.9845 | 0.7333 |
| 0.1449 | 99.0 | 58410 | 0.9832 | 0.7312 |
| 0.15 | 100.0 | 59000 | 0.9819 | 0.7303 |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3