
1_1e-2_5_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set after the final (100th) training epoch:

  • Loss: 0.8567
  • Accuracy: 0.7480

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.01
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
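
As a rough illustration of what lr_scheduler_type: linear means for this run, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (the card lists none) and takes the total of 59,000 steps (590 per epoch) from the training results table; the function name `linear_lr` is hypothetical and not part of any library.

```python
# Sketch of a linear learning-rate decay, assuming zero warmup steps.
# base_lr comes from the hyperparameter list; total_steps (59,000) is
# taken from the last row of the training results table.

def linear_lr(step, base_lr=0.01, total_steps=59_000, warmup_steps=0):
    """Return the learning rate at a given optimizer step."""
    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / warmup_steps
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


STEPS_PER_EPOCH = 590  # 59,000 steps / 100 epochs

for epoch in (0, 25, 50, 75, 100):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:>3}: lr = {linear_lr(step):.6f}")
```

Under these assumptions the rate falls to half its initial value at epoch 50 and reaches zero at the final step.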

Training results

Training Loss   Epoch   Step   Validation Loss   Accuracy
1.682 1.0 590 2.1411 0.6208
1.4095 2.0 1180 1.3977 0.3817
1.1425 3.0 1770 0.8850 0.5963
1.1284 4.0 2360 0.8549 0.6333
0.9827 5.0 2950 0.8314 0.6511
1.4181 6.0 3540 1.6014 0.3835
1.0353 7.0 4130 1.5568 0.4235
0.8632 8.0 4720 0.9442 0.6394
0.8723 9.0 5310 0.7750 0.6905
0.8161 10.0 5900 0.7561 0.6957
0.7785 11.0 6490 0.7662 0.6752
0.7497 12.0 7080 0.7282 0.6966
0.7437 13.0 7670 0.7389 0.6798
0.7156 14.0 8260 0.7087 0.7043
0.6893 15.0 8850 0.7195 0.7034
0.6787 16.0 9440 0.6835 0.7174
0.6392 17.0 10030 0.6839 0.7162
0.6287 18.0 10620 0.8835 0.6587
0.6247 19.0 11210 0.6814 0.7248
0.5969 20.0 11800 0.7200 0.7119
0.5621 21.0 12390 0.6906 0.7284
0.5461 22.0 12980 0.7080 0.7202
0.5147 23.0 13570 0.7483 0.7281
0.5098 24.0 14160 0.7129 0.7177
0.4893 25.0 14750 0.7235 0.7346
0.4723 26.0 15340 1.1308 0.6437
0.4619 27.0 15930 0.7328 0.7254
0.438 28.0 16520 0.8303 0.7422
0.4216 29.0 17110 0.7223 0.7410
0.4079 30.0 17700 0.7778 0.7315
0.3803 31.0 18290 0.7576 0.7318
0.3871 32.0 18880 0.8276 0.7382
0.3846 33.0 19470 0.8631 0.7110
0.3561 34.0 20060 0.8310 0.7211
0.344 35.0 20650 0.7655 0.7364
0.3333 36.0 21240 0.7666 0.7404
0.3287 37.0 21830 0.8005 0.7315
0.3193 38.0 22420 0.8775 0.7443
0.3051 39.0 23010 0.8466 0.7428
0.3019 40.0 23600 0.8328 0.7394
0.2922 41.0 24190 0.8150 0.7382
0.3064 42.0 24780 0.8742 0.7376
0.2841 43.0 25370 0.7898 0.7361
0.2841 44.0 25960 0.8226 0.7401
0.2679 45.0 26550 0.8297 0.7318
0.2651 46.0 27140 0.8316 0.7388
0.2654 47.0 27730 0.8553 0.7364
0.2457 48.0 28320 0.8647 0.7327
0.2558 49.0 28910 0.8399 0.7376
0.2467 50.0 29500 0.8517 0.7391
0.2278 51.0 30090 0.8409 0.7275
0.2343 52.0 30680 0.9442 0.7214
0.2372 53.0 31270 0.8661 0.7300
0.2194 54.0 31860 0.8430 0.7407
0.2222 55.0 32450 0.9235 0.7242
0.2328 56.0 33040 0.8637 0.7367
0.2162 57.0 33630 0.9162 0.7211
0.215 58.0 34220 0.8886 0.7281
0.206 59.0 34810 0.9033 0.7193
0.2099 60.0 35400 0.8829 0.7361
0.2081 61.0 35990 0.8874 0.7367
0.2105 62.0 36580 0.8902 0.7361
0.1899 63.0 37170 0.8541 0.7376
0.1972 64.0 37760 0.8740 0.7437
0.191 65.0 38350 0.8897 0.7413
0.1908 66.0 38940 0.8672 0.7437
0.1894 67.0 39530 0.8892 0.7364
0.1887 68.0 40120 0.8750 0.7407
0.1757 69.0 40710 0.8887 0.7379
0.1791 70.0 41300 0.8757 0.7413
0.1848 71.0 41890 0.8498 0.7437
0.1878 72.0 42480 0.8647 0.7413
0.1811 73.0 43070 0.8715 0.7391
0.1681 74.0 43660 0.9104 0.7416
0.1693 75.0 44250 0.9140 0.7434
0.1778 76.0 44840 0.8656 0.7437
0.1671 77.0 45430 0.8830 0.7413
0.1698 78.0 46020 0.8819 0.7431
0.1641 79.0 46610 0.8667 0.7391
0.1572 80.0 47200 0.8677 0.7419
0.1552 81.0 47790 0.8704 0.7404
0.1543 82.0 48380 0.8640 0.7489
0.1576 83.0 48970 0.8897 0.7459
0.153 84.0 49560 0.8649 0.7465
0.1536 85.0 50150 0.8864 0.7437
0.1548 86.0 50740 0.9050 0.7468
0.144 87.0 51330 0.8696 0.7401
0.151 88.0 51920 0.8987 0.7446
0.1493 89.0 52510 0.8938 0.7431
0.1455 90.0 53100 0.8726 0.7431
0.1414 91.0 53690 0.8814 0.7416
0.1422 92.0 54280 0.8838 0.7419
0.1421 93.0 54870 0.8648 0.7465
0.1477 94.0 55460 0.8532 0.7450
0.1431 95.0 56050 0.8613 0.7465
0.1412 96.0 56640 0.8708 0.7471
0.1413 97.0 57230 0.8656 0.7468
0.1375 98.0 57820 0.8647 0.7468
0.1389 99.0 58410 0.8590 0.7483
0.1389 100.0 59000 0.8567 0.7480
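
The reported evaluation results match the final epoch, not the best one in the table. A minimal sketch of comparing checkpoints by validation loss and by accuracy follows; it uses only a handful of rows copied from the table above, and the card does not say which checkpoint was actually kept, so this is illustrative only.

```python
# Selected rows from the training results table:
# (epoch, training_loss, validation_loss, accuracy)
rows = [
    (10, 0.8161, 0.7561, 0.6957),
    (19, 0.6247, 0.6814, 0.7248),
    (28, 0.4380, 0.8303, 0.7422),
    (50, 0.2467, 0.8517, 0.7391),
    (82, 0.1543, 0.8640, 0.7489),
    (100, 0.1389, 0.8567, 0.7480),
]

# Checkpoint with the lowest validation loss among the selected rows.
best_by_loss = min(rows, key=lambda r: r[2])
# Checkpoint with the highest accuracy among the selected rows.
best_by_acc = max(rows, key=lambda r: r[3])

print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[2]:.4f})")
print(f"highest accuracy:       epoch {best_by_acc[0]} ({best_by_acc[3]:.4f})")
```

In the full table these are also the overall extremes: validation loss is lowest at epoch 19 (0.6814) and accuracy is highest at epoch 82 (0.7489), while training loss keeps falling through epoch 100.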

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3

Dataset used to train Onutoa/1_1e-2_5_0.1: super_glue