1_6e-3_5_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set (see the inference sketch after the list):

  • Loss: 0.8860
  • Accuracy: 0.7462
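
The card does not state which SuperGLUE task the model was fine-tuned on, so the minimal loading sketch below assumes a sequence-classification head and uses a placeholder input; format your text as the actual task expects.

```python
# Minimal inference sketch. Assumptions: a sequence-classification head
# (inferred by AutoModel from the checkpoint config) and a placeholder
# input text; the SuperGLUE subtask is not stated in this card.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "Onutoa/1_6e-3_5_0.5"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

# Replace with input formatted for the actual task.
inputs = tokenizer("Is this an example sentence?", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```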

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.006
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
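
As a rough illustration, these hyperparameters map onto transformers.TrainingArguments as sketched below. This is a reconstruction, not the author's actual training script; the output directory and evaluation strategy are assumptions.

```python
# Sketch of TrainingArguments matching the hyperparameters listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_6e-3_5_0.5",        # assumed output directory
    learning_rate=6e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",      # assumed: the table logs one eval per epoch
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the default optimizer.
)
```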

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 2.5324        | 1.0   | 590   | 2.6875          | 0.6217   |
| 2.4802        | 2.0   | 1180  | 3.4068          | 0.6214   |
| 2.6163        | 3.0   | 1770  | 3.8107          | 0.3841   |
| 2.2085        | 4.0   | 2360  | 2.0912          | 0.5021   |
| 2.1045        | 5.0   | 2950  | 1.6305          | 0.6394   |
| 1.7984        | 6.0   | 3540  | 1.8421          | 0.6352   |
| 1.7236        | 7.0   | 4130  | 1.3822          | 0.6550   |
| 1.6613        | 8.0   | 4720  | 1.3880          | 0.6939   |
| 1.5506        | 9.0   | 5310  | 2.7376          | 0.6498   |
| 1.6032        | 10.0  | 5900  | 1.9660          | 0.5471   |
| 1.4851        | 11.0  | 6490  | 1.2698          | 0.7015   |
| 1.3779        | 12.0  | 7080  | 1.1481          | 0.7070   |
| 1.315         | 13.0  | 7670  | 1.1203          | 0.6963   |
| 1.3238        | 14.0  | 8260  | 1.1089          | 0.7040   |
| 1.2662        | 15.0  | 8850  | 1.0526          | 0.7211   |
| 1.2489        | 16.0  | 9440  | 1.0878          | 0.6905   |
| 1.1504        | 17.0  | 10030 | 1.1004          | 0.7232   |
| 1.1289        | 18.0  | 10620 | 1.2881          | 0.6615   |
| 1.0159        | 19.0  | 11210 | 0.9890          | 0.7196   |
| 1.1298        | 20.0  | 11800 | 1.0623          | 0.7070   |
| 0.9891        | 21.0  | 12390 | 1.2508          | 0.7211   |
| 0.9865        | 22.0  | 12980 | 1.3142          | 0.6630   |
| 0.996         | 23.0  | 13570 | 1.0147          | 0.7125   |
| 0.9373        | 24.0  | 14160 | 1.0033          | 0.7281   |
| 0.9647        | 25.0  | 14750 | 2.0608          | 0.6920   |
| 0.8803        | 26.0  | 15340 | 0.9517          | 0.7312   |
| 0.8541        | 27.0  | 15930 | 0.9624          | 0.7266   |
| 0.8476        | 28.0  | 16520 | 0.9491          | 0.7239   |
| 0.8058        | 29.0  | 17110 | 0.9725          | 0.7385   |
| 0.8055        | 30.0  | 17700 | 0.9748          | 0.7248   |
| 0.788         | 31.0  | 18290 | 1.0021          | 0.7333   |
| 0.7576        | 32.0  | 18880 | 0.9257          | 0.7358   |
| 0.7698        | 33.0  | 19470 | 1.1881          | 0.6872   |
| 0.7371        | 34.0  | 20060 | 0.9496          | 0.7303   |
| 0.7355        | 35.0  | 20650 | 0.9241          | 0.7306   |
| 0.7062        | 36.0  | 21240 | 0.9682          | 0.7336   |
| 0.6691        | 37.0  | 21830 | 0.9349          | 0.7358   |
| 0.6613        | 38.0  | 22420 | 0.9785          | 0.7437   |
| 0.7068        | 39.0  | 23010 | 0.9227          | 0.7416   |
| 0.6189        | 40.0  | 23600 | 1.1750          | 0.7419   |
| 0.6352        | 41.0  | 24190 | 1.1787          | 0.7394   |
| 0.63          | 42.0  | 24780 | 0.9740          | 0.7422   |
| 0.6166        | 43.0  | 25370 | 1.2322          | 0.7376   |
| 0.6076        | 44.0  | 25960 | 0.9889          | 0.7260   |
| 0.6081        | 45.0  | 26550 | 1.2527          | 0.6783   |
| 0.5942        | 46.0  | 27140 | 0.9813          | 0.7214   |
| 0.5892        | 47.0  | 27730 | 0.9268          | 0.7391   |
| 0.5552        | 48.0  | 28320 | 0.9250          | 0.7425   |
| 0.5875        | 49.0  | 28910 | 0.9149          | 0.7306   |
| 0.5532        | 50.0  | 29500 | 0.9487          | 0.7272   |
| 0.5467        | 51.0  | 30090 | 0.9219          | 0.7355   |
| 0.5536        | 52.0  | 30680 | 0.9884          | 0.7431   |
| 0.5306        | 53.0  | 31270 | 1.0661          | 0.7165   |
| 0.5382        | 54.0  | 31860 | 0.9046          | 0.7379   |
| 0.5506        | 55.0  | 32450 | 1.0618          | 0.7150   |
| 0.5427        | 56.0  | 33040 | 0.9165          | 0.7434   |
| 0.513         | 57.0  | 33630 | 1.2612          | 0.7358   |
| 0.5008        | 58.0  | 34220 | 0.9674          | 0.7388   |
| 0.4962        | 59.0  | 34810 | 0.9219          | 0.7346   |
| 0.5079        | 60.0  | 35400 | 0.9093          | 0.7413   |
| 0.4973        | 61.0  | 35990 | 0.9088          | 0.7343   |
| 0.4938        | 62.0  | 36580 | 0.8926          | 0.7404   |
| 0.4984        | 63.0  | 37170 | 1.0869          | 0.7080   |
| 0.4907        | 64.0  | 37760 | 0.9026          | 0.7343   |
| 0.4727        | 65.0  | 38350 | 0.8803          | 0.7410   |
| 0.4667        | 66.0  | 38940 | 0.9391          | 0.7404   |
| 0.4706        | 67.0  | 39530 | 0.9321          | 0.7343   |
| 0.4696        | 68.0  | 40120 | 0.9011          | 0.7446   |
| 0.4471        | 69.0  | 40710 | 0.9192          | 0.7450   |
| 0.4535        | 70.0  | 41300 | 1.1121          | 0.7483   |
| 0.4664        | 71.0  | 41890 | 0.8832          | 0.7346   |
| 0.4462        | 72.0  | 42480 | 0.8937          | 0.7413   |
| 0.4247        | 73.0  | 43070 | 0.9067          | 0.7419   |
| 0.4218        | 74.0  | 43660 | 0.9289          | 0.7416   |
| 0.4553        | 75.0  | 44250 | 0.9095          | 0.7453   |
| 0.4485        | 76.0  | 44840 | 0.9062          | 0.7477   |
| 0.432         | 77.0  | 45430 | 0.8999          | 0.7394   |
| 0.4325        | 78.0  | 46020 | 0.8833          | 0.7523   |
| 0.4293        | 79.0  | 46610 | 0.9077          | 0.7495   |
| 0.4259        | 80.0  | 47200 | 0.9243          | 0.7440   |
| 0.4056        | 81.0  | 47790 | 0.9145          | 0.7431   |
| 0.424         | 82.0  | 48380 | 0.9100          | 0.7450   |
| 0.418         | 83.0  | 48970 | 0.9334          | 0.7532   |
| 0.4122        | 84.0  | 49560 | 0.9404          | 0.7511   |
| 0.4023        | 85.0  | 50150 | 0.9007          | 0.7443   |
| 0.4066        | 86.0  | 50740 | 0.9115          | 0.7474   |
| 0.4065        | 87.0  | 51330 | 0.9344          | 0.7443   |
| 0.4098        | 88.0  | 51920 | 0.9139          | 0.7453   |
| 0.3902        | 89.0  | 52510 | 0.9120          | 0.7398   |
| 0.3926        | 90.0  | 53100 | 0.9105          | 0.7425   |
| 0.3994        | 91.0  | 53690 | 0.9182          | 0.7394   |
| 0.3998        | 92.0  | 54280 | 0.8989          | 0.7446   |
| 0.3961        | 93.0  | 54870 | 0.9133          | 0.7446   |
| 0.3982        | 94.0  | 55460 | 0.8877          | 0.7428   |
| 0.3855        | 95.0  | 56050 | 0.9050          | 0.7480   |
| 0.3785        | 96.0  | 56640 | 0.8889          | 0.7456   |
| 0.3816        | 97.0  | 57230 | 0.8830          | 0.7431   |
| 0.377         | 98.0  | 57820 | 0.8847          | 0.7440   |
| 0.367         | 99.0  | 58410 | 0.8872          | 0.7456   |
| 0.3799        | 100.0 | 59000 | 0.8860          | 0.7462   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3