1_5e-3_10_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9323
  • Accuracy: 0.7440

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
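
The linear schedule listed above can be illustrated with a little arithmetic. The sketch below is not taken from the training script: it assumes zero warmup steps (the card lists none) and a decay to zero at the last optimizer step, and it takes the total of 59000 steps (590 per epoch over 100 epochs) from the training results table. The function name `linear_lr` and its parameters are hypothetical.

```python
def linear_lr(step: int,
              base_lr: float = 0.005,
              total_steps: int = 59000,
              warmup_steps: int = 0) -> float:
    """Learning rate at a given optimizer step under linear warmup
    followed by linear decay to zero.

    Assumptions (not stated in the card): no warmup, decay ends at
    the final step.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the end of epochs 1, 50 and 100 (590 steps per epoch).
for epoch in (1, 50, 100):
    print(epoch, linear_lr(epoch * 590))
```

Under these assumptions the rate is half the initial value (about 0.0025) at the end of epoch 50 and reaches zero at step 59000.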

Training results

Training Loss  Epoch  Step   Validation Loss  Accuracy
1.4938         1.0    590    2.3680           0.6217
1.4628         2.0    1180   1.4346           0.6217
1.3133         3.0    1770   0.9694           0.6260
1.268          4.0    2360   1.1126           0.6217
1.0913         5.0    2950   0.9254           0.6587
1.0518         6.0    3540   0.8635           0.6593
1.0309         7.0    4130   1.3201           0.5049
0.9539         8.0    4720   0.8164           0.6801
0.9364         9.0    5310   1.3605           0.6254
0.9419         10.0   5900   0.7974           0.6844
0.9314         11.0   6490   1.3755           0.5486
0.8216         12.0   7080   0.7721           0.7012
0.8582         13.0   7670   0.7902           0.6902
0.7695         14.0   8260   0.7552           0.6945
0.7901         15.0   8850   0.8217           0.7144
0.7382         16.0   9440   0.8028           0.6844
0.7009         17.0   10030  0.7778           0.6994
0.6922         18.0   10620  0.9600           0.6688
0.6409         19.0   11210  0.8214           0.7104
0.6419         20.0   11800  1.1320           0.7031
0.6205         21.0   12390  0.7671           0.7232
0.6242         22.0   12980  0.8438           0.7208
0.5749         23.0   13570  1.0312           0.7214
0.5669         24.0   14160  0.7602           0.7242
0.5499         25.0   14750  0.8538           0.7294
0.5258         26.0   15340  1.5849           0.5807
0.5256         27.0   15930  0.8285           0.7306
0.491          28.0   16520  0.8039           0.7180
0.4844         29.0   17110  0.7899           0.7382
0.4584         30.0   17700  0.8144           0.7309
0.4602         31.0   18290  1.0077           0.7196
0.4467         32.0   18880  0.9234           0.7306
0.4267         33.0   19470  0.8644           0.7174
0.4031         34.0   20060  0.8536           0.7226
0.3862         35.0   20650  0.8552           0.7385
0.3811         36.0   21240  0.9266           0.7373
0.3814         37.0   21830  0.9688           0.7147
0.3613         38.0   22420  0.8678           0.7434
0.3528         39.0   23010  0.8885           0.7309
0.3563         40.0   23600  0.9239           0.7446
0.3507         41.0   24190  0.9006           0.7450
0.3437         42.0   24780  1.0086           0.7281
0.3138         43.0   25370  0.9287           0.7361
0.3208         44.0   25960  0.9420           0.7318
0.3214         45.0   26550  0.9205           0.7339
0.3013         46.0   27140  0.9259           0.7248
0.3066         47.0   27730  0.8718           0.7388
0.2987         48.0   28320  0.9665           0.7214
0.3116         49.0   28910  0.9426           0.7410
0.2766         50.0   29500  0.8971           0.7428
0.2683         51.0   30090  1.0176           0.7437
0.27           52.0   30680  0.9311           0.7382
0.2653         53.0   31270  0.9399           0.7336
0.2583         54.0   31860  0.8990           0.7281
0.2582         55.0   32450  0.9761           0.7419
0.2616         56.0   33040  0.8687           0.7480
0.2401         57.0   33630  0.9587           0.7266
0.2426         58.0   34220  0.9359           0.7474
0.2466         59.0   34810  0.9008           0.7385
0.2351         60.0   35400  0.9119           0.7462
0.237          61.0   35990  0.9495           0.7425
0.2329         62.0   36580  0.9731           0.7446
0.2235         63.0   37170  0.9495           0.7379
0.2251         64.0   37760  0.9236           0.7343
0.2235         65.0   38350  0.9289           0.7483
0.2237         66.0   38940  0.9300           0.7364
0.2159         67.0   39530  0.9430           0.7434
0.2201         68.0   40120  0.9144           0.7453
0.2075         69.0   40710  0.9126           0.7477
0.2195         70.0   41300  0.9387           0.7529
0.2036         71.0   41890  0.9798           0.7349
0.2116         72.0   42480  1.0175           0.7492
0.1953         73.0   43070  0.9082           0.7498
0.2003         74.0   43660  0.9919           0.7443
0.2016         75.0   44250  0.9649           0.7453
0.1997         76.0   44840  0.9454           0.7398
0.2065         77.0   45430  0.9424           0.7440
0.1983         78.0   46020  0.9516           0.7361
0.1937         79.0   46610  0.9370           0.7404
0.1826         80.0   47200  0.9395           0.7468
0.1816         81.0   47790  0.9566           0.7446
0.1931         82.0   48380  0.9800           0.7508
0.1866         83.0   48970  0.9390           0.7459
0.1831         84.0   49560  0.9383           0.7440
0.183          85.0   50150  0.9607           0.7422
0.1825         86.0   50740  0.9608           0.7468
0.178          87.0   51330  0.9820           0.7456
0.1787         88.0   51920  0.9598           0.7468
0.1745         89.0   52510  0.9419           0.7483
0.1739         90.0   53100  0.9796           0.7495
0.1752         91.0   53690  0.9373           0.7477
0.1773         92.0   54280  0.9310           0.7370
0.1699         93.0   54870  0.9641           0.7456
0.1738         94.0   55460  0.9418           0.7468
0.1698         95.0   56050  0.9586           0.7450
0.1658         96.0   56640  0.9488           0.7468
0.1658         97.0   57230  0.9450           0.7471
0.1641         98.0   57820  0.9307           0.7459
0.1695         99.0   58410  0.9487           0.7443
0.1653         100.0  59000  0.9323           0.7440
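
The reported evaluation results correspond to the final row (epoch 100), which is not the best row in the table: accuracy peaks at 0.7529 at epoch 70, and validation loss is lowest at 0.7552 at epoch 14. As a minimal sketch of how such a comparison could be made, the snippet below picks the best rows from a handful of entries copied from the table. The `EvalRow` type and variable names are hypothetical, and only six rows are included, not the full log.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    epoch: int
    step: int
    val_loss: float
    accuracy: float


# A few rows copied from the training results table (not the full log).
ROWS = [
    EvalRow(14, 8260, 0.7552, 0.6945),
    EvalRow(41, 24190, 0.9006, 0.7450),
    EvalRow(56, 33040, 0.8687, 0.7480),
    EvalRow(70, 41300, 0.9387, 0.7529),
    EvalRow(82, 48380, 0.9800, 0.7508),
    EvalRow(100, 59000, 0.9323, 0.7440),
]

# Select rows by the two usual criteria.
best_by_accuracy = max(ROWS, key=lambda r: r.accuracy)
best_by_loss = min(ROWS, key=lambda r: r.val_loss)
final = ROWS[-1]

print("best accuracy:", best_by_accuracy.epoch, best_by_accuracy.accuracy)
print("lowest val loss:", best_by_loss.epoch, best_by_loss.val_loss)
print("gap to final accuracy:",
      round(best_by_accuracy.accuracy - final.accuracy, 4))
```

The two criteria disagree here: validation loss bottoms out early while accuracy keeps improving as training loss falls, so the choice of checkpoint depends on which metric matters for the intended use.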

Framework versions

  • Transformers 4.30.0
  • PyTorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
