
1_8e-3_5_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9097
  • Accuracy: 0.7502
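
The Accuracy figure is plain classification accuracy: the fraction of evaluation examples whose predicted class matches the gold label. A minimal sketch of that computation, using illustrative two-class logits and labels (not actual outputs of this model):

```python
def accuracy(logits, labels):
    """Fraction of examples where the argmax class matches the gold label."""
    correct = 0
    for row, label in zip(logits, labels):
        pred = max(range(len(row)), key=lambda i: row[i])  # argmax over classes
        correct += int(pred == label)
    return correct / len(labels)

# Illustrative logits for 4 examples (hypothetical values, 2 classes each):
logits = [[0.2, 1.3], [2.0, -0.5], [0.1, 0.4], [1.0, 0.9]]
labels = [1, 0, 1, 1]
print(accuracy(logits, labels))  # 3 of 4 predictions correct -> 0.75
```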

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.008
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
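
With `lr_scheduler_type: linear`, the learning rate decays linearly from 8e-3 toward zero over the full run (59,000 optimizer steps, per the results table). A sketch of that schedule, assuming zero warmup steps (the card does not list a warmup setting):

```python
BASE_LR = 8e-3        # learning_rate from the hyperparameter list
TOTAL_STEPS = 59_000  # final step count in the training results table

def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS):
    """Linear decay to zero; assumes no warmup (not listed on this card)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))       # 0.008 at the first step
print(linear_lr(29_500))  # 0.004 at the halfway point
print(linear_lr(59_000))  # 0.0 at the end of training
```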

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 2.7895        | 1.0   | 590   | 1.8785          | 0.6150   |
| 2.562         | 2.0   | 1180  | 2.8327          | 0.4046   |
| 2.4023        | 3.0   | 1770  | 2.0853          | 0.5217   |
| 2.3167        | 4.0   | 2360  | 1.5879          | 0.6505   |
| 2.161         | 5.0   | 2950  | 1.9917          | 0.4914   |
| 1.794         | 6.0   | 3540  | 2.5834          | 0.5110   |
| 1.9698        | 7.0   | 4130  | 3.1462          | 0.4927   |
| 1.5971        | 8.0   | 4720  | 1.6865          | 0.5966   |
| 1.5201        | 9.0   | 5310  | 3.4553          | 0.6413   |
| 1.5841        | 10.0  | 5900  | 3.1799          | 0.6327   |
| 1.5231        | 11.0  | 6490  | 1.1451          | 0.6933   |
| 1.3941        | 12.0  | 7080  | 1.1390          | 0.6884   |
| 1.3679        | 13.0  | 7670  | 1.4767          | 0.6902   |
| 1.2653        | 14.0  | 8260  | 1.5274          | 0.7028   |
| 1.2451        | 15.0  | 8850  | 1.6725          | 0.7073   |
| 1.255         | 16.0  | 9440  | 1.5284          | 0.7012   |
| 1.184         | 17.0  | 10030 | 1.0831          | 0.6979   |
| 1.1215        | 18.0  | 10620 | 2.0515          | 0.5755   |
| 1.0766        | 19.0  | 11210 | 1.1808          | 0.7263   |
| 1.1108        | 20.0  | 11800 | 1.0647          | 0.7190   |
| 1.0272        | 21.0  | 12390 | 1.2527          | 0.6654   |
| 1.036         | 22.0  | 12980 | 1.1910          | 0.6783   |
| 0.9735        | 23.0  | 13570 | 1.0311          | 0.7037   |
| 0.9167        | 24.0  | 14160 | 0.9997          | 0.7021   |
| 0.8494        | 25.0  | 14750 | 1.0338          | 0.7284   |
| 0.8461        | 26.0  | 15340 | 1.4642          | 0.6495   |
| 0.8466        | 27.0  | 15930 | 0.9877          | 0.7370   |
| 0.8498        | 28.0  | 16520 | 0.9401          | 0.7287   |
| 0.7851        | 29.0  | 17110 | 1.0208          | 0.7336   |
| 0.7796        | 30.0  | 17700 | 0.9350          | 0.7232   |
| 0.7725        | 31.0  | 18290 | 1.4097          | 0.7162   |
| 0.7599        | 32.0  | 18880 | 1.1313          | 0.7333   |
| 0.768         | 33.0  | 19470 | 1.0272          | 0.7379   |
| 0.7007        | 34.0  | 20060 | 0.9294          | 0.7364   |
| 0.6718        | 35.0  | 20650 | 0.9347          | 0.7330   |
| 0.6786        | 36.0  | 21240 | 1.0231          | 0.7416   |
| 0.6822        | 37.0  | 21830 | 0.9767          | 0.7413   |
| 0.6667        | 38.0  | 22420 | 0.9351          | 0.7272   |
| 0.6497        | 39.0  | 23010 | 0.9574          | 0.7355   |
| 0.638         | 40.0  | 23600 | 1.0610          | 0.7437   |
| 0.6468        | 41.0  | 24190 | 1.1462          | 0.7434   |
| 0.6046        | 42.0  | 24780 | 0.9750          | 0.7211   |
| 0.6079        | 43.0  | 25370 | 1.2040          | 0.7419   |
| 0.5806        | 44.0  | 25960 | 1.1603          | 0.7018   |
| 0.5753        | 45.0  | 26550 | 1.0639          | 0.7110   |
| 0.5693        | 46.0  | 27140 | 1.0966          | 0.7422   |
| 0.5757        | 47.0  | 27730 | 1.0137          | 0.7468   |
| 0.5692        | 48.0  | 28320 | 0.9476          | 0.7382   |
| 0.5732        | 49.0  | 28910 | 1.0004          | 0.7291   |
| 0.5563        | 50.0  | 29500 | 0.9870          | 0.7394   |
| 0.5217        | 51.0  | 30090 | 0.9681          | 0.7312   |
| 0.5239        | 52.0  | 30680 | 0.9812          | 0.7456   |
| 0.525         | 53.0  | 31270 | 1.0355          | 0.7196   |
| 0.5136        | 54.0  | 31860 | 0.9161          | 0.7385   |
| 0.5249        | 55.0  | 32450 | 1.0093          | 0.7382   |
| 0.5092        | 56.0  | 33040 | 1.0072          | 0.7428   |
| 0.4754        | 57.0  | 33630 | 1.0560          | 0.7425   |
| 0.4716        | 58.0  | 34220 | 0.9922          | 0.7425   |
| 0.4913        | 59.0  | 34810 | 1.0014          | 0.7480   |
| 0.4773        | 60.0  | 35400 | 0.9148          | 0.7352   |
| 0.4725        | 61.0  | 35990 | 0.9691          | 0.7474   |
| 0.4656        | 62.0  | 36580 | 0.9459          | 0.7453   |
| 0.4565        | 63.0  | 37170 | 0.9521          | 0.7388   |
| 0.4502        | 64.0  | 37760 | 1.0172          | 0.7474   |
| 0.4765        | 65.0  | 38350 | 0.9504          | 0.7327   |
| 0.4439        | 66.0  | 38940 | 0.9998          | 0.7443   |
| 0.4424        | 67.0  | 39530 | 1.0985          | 0.7498   |
| 0.4541        | 68.0  | 40120 | 0.9088          | 0.7446   |
| 0.4321        | 69.0  | 40710 | 0.9322          | 0.7379   |
| 0.4346        | 70.0  | 41300 | 1.0028          | 0.7495   |
| 0.4329        | 71.0  | 41890 | 0.8949          | 0.7385   |
| 0.4344        | 72.0  | 42480 | 0.9631          | 0.7544   |
| 0.4111        | 73.0  | 43070 | 0.9800          | 0.7272   |
| 0.4183        | 74.0  | 43660 | 1.1350          | 0.7541   |
| 0.4234        | 75.0  | 44250 | 0.9444          | 0.7511   |
| 0.4297        | 76.0  | 44840 | 0.9584          | 0.7526   |
| 0.4172        | 77.0  | 45430 | 0.9165          | 0.7413   |
| 0.4083        | 78.0  | 46020 | 0.9103          | 0.7401   |
| 0.4078        | 79.0  | 46610 | 0.9100          | 0.7468   |
| 0.3977        | 80.0  | 47200 | 0.9172          | 0.7480   |
| 0.3885        | 81.0  | 47790 | 0.9714          | 0.7523   |
| 0.4012        | 82.0  | 48380 | 1.0683          | 0.7547   |
| 0.3831        | 83.0  | 48970 | 0.9867          | 0.7575   |
| 0.3878        | 84.0  | 49560 | 0.9245          | 0.7541   |
| 0.3841        | 85.0  | 50150 | 0.9662          | 0.7327   |
| 0.3835        | 86.0  | 50740 | 0.9532          | 0.7505   |
| 0.3755        | 87.0  | 51330 | 0.9645          | 0.7492   |
| 0.379         | 88.0  | 51920 | 0.9183          | 0.7483   |
| 0.38          | 89.0  | 52510 | 0.9787          | 0.7523   |
| 0.37          | 90.0  | 53100 | 0.9205          | 0.7443   |
| 0.368         | 91.0  | 53690 | 0.9236          | 0.7446   |
| 0.3737        | 92.0  | 54280 | 0.9023          | 0.7419   |
| 0.3663        | 93.0  | 54870 | 0.9200          | 0.7514   |
| 0.3763        | 94.0  | 55460 | 0.9496          | 0.7517   |
| 0.3635        | 95.0  | 56050 | 0.9487          | 0.7508   |
| 0.3656        | 96.0  | 56640 | 0.9122          | 0.7502   |
| 0.3604        | 97.0  | 57230 | 0.9036          | 0.7498   |
| 0.3475        | 98.0  | 57820 | 0.9054          | 0.7474   |
| 0.3552        | 99.0  | 58410 | 0.9078          | 0.7471   |
| 0.3564        | 100.0 | 59000 | 0.9097          | 0.7502   |
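
The step counts above advance by 590 per epoch, which together with the train batch size bounds the size of the training set. A quick sanity check of that arithmetic, assuming one optimizer step per batch (no gradient accumulation is listed on the card):

```python
TOTAL_STEPS = 59_000  # final step in the table
EPOCHS = 100          # num_epochs from the hyperparameter list
BATCH_SIZE = 16       # train_batch_size from the hyperparameter list

steps_per_epoch = TOTAL_STEPS // EPOCHS
print(steps_per_epoch)  # 590, matching the per-epoch step deltas above

# With a partial final batch allowed, 590 batches per epoch means the
# training set holds between 589*16 + 1 and 590*16 examples.
low = (steps_per_epoch - 1) * BATCH_SIZE + 1
high = steps_per_epoch * BATCH_SIZE
print(low, high)  # 9425 9440
```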

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3