
1_5e-3_10_0.9

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9486
  • Accuracy: 0.7520
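
The checkpoint id below is taken from this card's repository name; the sequence-classification head and the question/passage input format are assumptions, since the card does not state which SuperGLUE task the model was fine-tuned on. Treat this as a minimal loading sketch, not a confirmed recipe:

```python
# Minimal inference sketch, assuming a sequence-classification head.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Onutoa/1_5e-3_10_0.9"  # repository id from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Hypothetical question/passage pair; the real input format depends on
# the (unstated) SuperGLUE task used for fine-tuning.
inputs = tokenizer(
    "is the sky blue during the day",
    "The sky appears blue in daylight because of Rayleigh scattering.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)
print(probs)
```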

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
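
As a reproduction aid, here is a rough sketch of how the values above map onto `transformers.TrainingArguments` (matching the Transformers 4.30.0 API listed under "Framework versions"). The output directory is hypothetical and the evaluation strategy is inferred from the per-epoch results table; every other value is copied from the list above:

```python
# Hyperparameter sketch; only the mirrored values come from this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_5e-3_10_0.9",      # hypothetical output path
    learning_rate=5e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    num_train_epochs=100.0,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",     # the table below reports one eval per epoch
)
```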

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 4.2553        | 1.0   | 590   | 3.4885          | 0.6217   |
| 3.8771        | 2.0   | 1180  | 5.2589          | 0.4156   |
| 3.8841        | 3.0   | 1770  | 3.1457          | 0.6217   |
| 3.4978        | 4.0   | 2360  | 3.6630          | 0.5073   |
| 3.4514        | 5.0   | 2950  | 2.8535          | 0.6538   |
| 2.8512        | 6.0   | 3540  | 4.5431          | 0.6401   |
| 2.8629        | 7.0   | 4130  | 2.9999          | 0.5774   |
| 2.7803        | 8.0   | 4720  | 4.0455          | 0.6440   |
| 2.3648        | 9.0   | 5310  | 3.4814          | 0.6618   |
| 2.3135        | 10.0  | 5900  | 1.8693          | 0.6985   |
| 2.2615        | 11.0  | 6490  | 1.7206          | 0.7095   |
| 1.938         | 12.0  | 7080  | 2.2772          | 0.6664   |
| 1.9168        | 13.0  | 7670  | 1.5057          | 0.7012   |
| 1.7411        | 14.0  | 8260  | 1.4510          | 0.7239   |
| 1.7184        | 15.0  | 8850  | 1.3241          | 0.7211   |
| 1.5774        | 16.0  | 9440  | 1.8563          | 0.7153   |
| 1.5229        | 17.0  | 10030 | 1.3243          | 0.7226   |
| 1.4652        | 18.0  | 10620 | 1.3866          | 0.7333   |
| 1.4321        | 19.0  | 11210 | 1.2208          | 0.7294   |
| 1.4205        | 20.0  | 11800 | 1.4391          | 0.7080   |
| 1.3537        | 21.0  | 12390 | 1.2900          | 0.7382   |
| 1.3302        | 22.0  | 12980 | 1.2322          | 0.7398   |
| 1.2616        | 23.0  | 13570 | 1.2189          | 0.7391   |
| 1.2586        | 24.0  | 14160 | 1.1687          | 0.7410   |
| 1.2259        | 25.0  | 14750 | 1.1797          | 0.7336   |
| 1.1804        | 26.0  | 15340 | 1.0929          | 0.7394   |
| 1.1907        | 27.0  | 15930 | 1.2820          | 0.7168   |
| 1.2066        | 28.0  | 16520 | 1.2464          | 0.7422   |
| 1.1128        | 29.0  | 17110 | 1.1798          | 0.7180   |
| 1.0889        | 30.0  | 17700 | 1.1373          | 0.7474   |
| 1.0637        | 31.0  | 18290 | 1.0453          | 0.7382   |
| 1.058         | 32.0  | 18880 | 1.1689          | 0.7446   |
| 1.0553        | 33.0  | 19470 | 1.0705          | 0.7321   |
| 1.0404        | 34.0  | 20060 | 1.0731          | 0.7425   |
| 1.014         | 35.0  | 20650 | 1.0481          | 0.7459   |
| 1.0166        | 36.0  | 21240 | 1.0434          | 0.7508   |
| 0.9983        | 37.0  | 21830 | 1.1358          | 0.7471   |
| 1.0144        | 38.0  | 22420 | 1.0030          | 0.7425   |
| 1.0236        | 39.0  | 23010 | 1.2874          | 0.7437   |
| 0.9749        | 40.0  | 23600 | 1.3199          | 0.7370   |
| 0.9592        | 41.0  | 24190 | 1.0072          | 0.7352   |
| 0.9467        | 42.0  | 24780 | 1.0282          | 0.7422   |
| 0.921         | 43.0  | 25370 | 1.3284          | 0.7446   |
| 0.9328        | 44.0  | 25960 | 0.9873          | 0.7364   |
| 0.9192        | 45.0  | 26550 | 1.3185          | 0.7425   |
| 0.8882        | 46.0  | 27140 | 0.9961          | 0.7453   |
| 0.8986        | 47.0  | 27730 | 0.9880          | 0.7373   |
| 0.8635        | 48.0  | 28320 | 1.0019          | 0.7480   |
| 0.8988        | 49.0  | 28910 | 1.1254          | 0.7498   |
| 0.865         | 50.0  | 29500 | 0.9619          | 0.7468   |
| 0.8575        | 51.0  | 30090 | 1.0854          | 0.7502   |
| 0.8654        | 52.0  | 30680 | 0.9466          | 0.7462   |
| 0.8482        | 53.0  | 31270 | 1.0722          | 0.7483   |
| 0.8547        | 54.0  | 31860 | 1.1340          | 0.7492   |
| 0.8424        | 55.0  | 32450 | 1.0683          | 0.7462   |
| 0.8078        | 56.0  | 33040 | 1.0285          | 0.7495   |
| 0.8163        | 57.0  | 33630 | 0.9779          | 0.7502   |
| 0.8175        | 58.0  | 34220 | 0.9461          | 0.7505   |
| 0.816         | 59.0  | 34810 | 0.9991          | 0.7443   |
| 0.8123        | 60.0  | 35400 | 0.9554          | 0.7443   |
| 0.7827        | 61.0  | 35990 | 0.9765          | 0.7492   |
| 0.8139        | 62.0  | 36580 | 1.1876          | 0.7547   |
| 0.7938        | 63.0  | 37170 | 0.9484          | 0.7541   |
| 0.7712        | 64.0  | 37760 | 0.9400          | 0.7508   |
| 0.7834        | 65.0  | 38350 | 0.9793          | 0.7532   |
| 0.781         | 66.0  | 38940 | 0.9480          | 0.7498   |
| 0.7639        | 67.0  | 39530 | 1.1188          | 0.7593   |
| 0.7838        | 68.0  | 40120 | 1.0215          | 0.7541   |
| 0.7527        | 69.0  | 40710 | 1.0855          | 0.7529   |
| 0.7626        | 70.0  | 41300 | 1.0755          | 0.7526   |
| 0.7683        | 71.0  | 41890 | 0.9553          | 0.7566   |
| 0.7588        | 72.0  | 42480 | 0.9822          | 0.7581   |
| 0.7377        | 73.0  | 43070 | 1.0359          | 0.7557   |
| 0.731         | 74.0  | 43660 | 0.9513          | 0.7505   |
| 0.7536        | 75.0  | 44250 | 1.1317          | 0.7505   |
| 0.7449        | 76.0  | 44840 | 0.9001          | 0.7532   |
| 0.7428        | 77.0  | 45430 | 1.0150          | 0.7538   |
| 0.7271        | 78.0  | 46020 | 0.9623          | 0.7563   |
| 0.7383        | 79.0  | 46610 | 0.9535          | 0.7584   |
| 0.7186        | 80.0  | 47200 | 0.9970          | 0.7581   |
| 0.6823        | 81.0  | 47790 | 1.0485          | 0.7563   |
| 0.7259        | 82.0  | 48380 | 0.9706          | 0.7526   |
| 0.7039        | 83.0  | 48970 | 0.9543          | 0.7480   |
| 0.7259        | 84.0  | 49560 | 0.9387          | 0.7508   |
| 0.7092        | 85.0  | 50150 | 0.9828          | 0.7538   |
| 0.7259        | 86.0  | 50740 | 0.9145          | 0.7459   |
| 0.7195        | 87.0  | 51330 | 0.9313          | 0.7495   |
| 0.696         | 88.0  | 51920 | 0.9467          | 0.7492   |
| 0.6885        | 89.0  | 52510 | 0.9671          | 0.7526   |
| 0.6874        | 90.0  | 53100 | 0.9387          | 0.7511   |
| 0.6911        | 91.0  | 53690 | 1.0279          | 0.7492   |
| 0.6968        | 92.0  | 54280 | 0.9268          | 0.7511   |
| 0.6833        | 93.0  | 54870 | 0.9886          | 0.7517   |
| 0.7096        | 94.0  | 55460 | 0.9693          | 0.7532   |
| 0.6911        | 95.0  | 56050 | 0.9503          | 0.7547   |
| 0.6754        | 96.0  | 56640 | 0.9451          | 0.7544   |
| 0.6823        | 97.0  | 57230 | 0.9427          | 0.7535   |
| 0.6547        | 98.0  | 57820 | 0.9500          | 0.7526   |
| 0.6433        | 99.0  | 58410 | 0.9280          | 0.7505   |
| 0.6722        | 100.0 | 59000 | 0.9486          | 0.7520   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3