1_9e-3_1_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set (the values match the final epoch, 100, in the training results table):

  • Loss: 0.5179
  • Accuracy: 0.7373

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.009
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
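
As a rough illustration of how these settings interact, the sketch below computes the learning rate a linear schedule would yield at a given optimizer step. It assumes no warmup (the card lists none), a decay to zero at the final step, and 590 optimizer steps per epoch as reported in the Step column of the training results. The name `linear_lr` is hypothetical and not part of any library.

```python
# Hyperparameters copied from the list above.
LEARNING_RATE = 0.009
NUM_EPOCHS = 100

# Assumption: 590 optimizer steps per epoch, taken from the Step column
# of the training results (step 590 is reached at epoch 1.0).
STEPS_PER_EPOCH = 590
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 59000


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    Assumption: warmup_steps defaults to 0 because the card lists no warmup.
    """
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - warmup_steps)


if __name__ == "__main__":
    # Print the scheduled learning rate at a few epoch boundaries.
    for epoch in (0, 25, 50, 75, 100):
        print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the rate starts at 0.009, is halved by the end of epoch 50, and reaches zero at step 59000.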

Training results

Training Loss Epoch Step Validation Loss Accuracy
1.1001 1.0 590 0.5777 0.6220
1.2048 2.0 1180 1.0112 0.3795
0.9325 3.0 1770 1.0442 0.6217
0.9146 4.0 2360 0.8508 0.4055
0.9521 5.0 2950 1.0733 0.6217
0.8385 6.0 3540 0.5157 0.6554
0.7334 7.0 4130 0.8676 0.4651
0.7376 8.0 4720 0.4932 0.6859
0.7303 9.0 5310 1.0567 0.6275
0.7214 10.0 5900 0.9606 0.6437
0.7229 11.0 6490 0.4985 0.6905
0.644 12.0 7080 0.5756 0.7021
0.6076 13.0 7670 0.6728 0.6657
0.5421 14.0 8260 0.4747 0.7095
0.5752 15.0 8850 0.5458 0.7116
0.5479 16.0 9440 0.5150 0.6618
0.5135 17.0 10030 0.4700 0.6869
0.4899 18.0 10620 0.4723 0.7061
0.4334 19.0 11210 0.5559 0.7165
0.4842 20.0 11800 0.5299 0.6691
0.4313 21.0 12390 0.5249 0.6716
0.407 22.0 12980 0.6987 0.6437
0.3823 23.0 13570 0.4630 0.7159
0.3595 24.0 14160 0.6790 0.7208
0.3512 25.0 14750 0.5064 0.7287
0.3337 26.0 15340 0.5864 0.6780
0.3325 27.0 15930 0.5088 0.7330
0.3344 28.0 16520 0.5736 0.6972
0.2922 29.0 17110 0.5337 0.7352
0.2869 30.0 17700 0.4824 0.7199
0.3026 31.0 18290 0.6410 0.6654
0.2685 32.0 18880 0.4831 0.7346
0.299 33.0 19470 0.8747 0.6297
0.262 34.0 20060 0.8211 0.6468
0.2678 35.0 20650 0.5408 0.7046
0.2416 36.0 21240 0.5116 0.7358
0.2587 37.0 21830 0.5482 0.7343
0.2598 38.0 22420 0.5146 0.7214
0.2278 39.0 23010 0.5172 0.7339
0.2255 40.0 23600 0.5711 0.7330
0.2483 41.0 24190 1.0653 0.6945
0.2219 42.0 24780 0.4959 0.7398
0.2278 43.0 25370 0.5879 0.7416
0.2082 44.0 25960 0.5285 0.7352
0.209 45.0 26550 0.6709 0.6780
0.1926 46.0 27140 0.6806 0.7330
0.2045 47.0 27730 0.5625 0.7272
0.1896 48.0 28320 0.6054 0.6994
0.2005 49.0 28910 0.5168 0.7235
0.1905 50.0 29500 0.5397 0.7281
0.1846 51.0 30090 0.5445 0.7309
0.1935 52.0 30680 0.5455 0.7422
0.1837 53.0 31270 0.6356 0.7398
0.1872 54.0 31860 0.5233 0.7431
0.1832 55.0 32450 0.5472 0.7321
0.192 56.0 33040 0.5430 0.7425
0.1704 57.0 33630 0.5549 0.7343
0.1714 58.0 34220 0.6204 0.7401
0.1693 59.0 34810 0.5923 0.7428
0.1781 60.0 35400 0.5394 0.7379
0.1672 61.0 35990 0.5550 0.7385
0.1721 62.0 36580 0.5416 0.7385
0.1644 63.0 37170 0.5342 0.7300
0.1656 64.0 37760 0.5541 0.7303
0.1635 65.0 38350 0.5548 0.7352
0.1603 66.0 38940 0.5550 0.7394
0.1581 67.0 39530 0.5891 0.7416
0.1552 68.0 40120 0.5385 0.7260
0.1527 69.0 40710 0.5636 0.7272
0.1501 70.0 41300 0.5427 0.7333
0.1584 71.0 41890 0.5466 0.7407
0.1507 72.0 42480 0.6263 0.7404
0.1404 73.0 43070 0.5403 0.7370
0.1423 74.0 43660 0.5633 0.7391
0.1517 75.0 44250 0.5960 0.7416
0.1493 76.0 44840 0.6246 0.7413
0.1416 77.0 45430 0.5413 0.7413
0.1446 78.0 46020 0.5421 0.7401
0.1404 79.0 46610 0.5650 0.7367
0.1425 80.0 47200 0.5943 0.7434
0.1338 81.0 47790 0.5297 0.7324
0.1323 82.0 48380 0.5296 0.7376
0.1431 83.0 48970 0.5224 0.7352
0.1335 84.0 49560 0.5220 0.7379
0.1337 85.0 50150 0.5239 0.7358
0.1337 86.0 50740 0.5371 0.7349
0.1307 87.0 51330 0.5485 0.7391
0.1299 88.0 51920 0.5426 0.7352
0.1343 89.0 52510 0.5219 0.7376
0.1293 90.0 53100 0.5667 0.7388
0.1306 91.0 53690 0.5384 0.7385
0.1301 92.0 54280 0.5179 0.7336
0.1263 93.0 54870 0.5233 0.7376
0.1269 94.0 55460 0.5338 0.7370
0.1249 95.0 56050 0.5242 0.7379
0.1215 96.0 56640 0.5158 0.7364
0.1248 97.0 57230 0.5197 0.7382
0.1203 98.0 57820 0.5132 0.7373
0.1209 99.0 58410 0.5176 0.7370
0.1222 100.0 59000 0.5179 0.7373
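
The final epoch is not necessarily the best one by validation accuracy. The following is a minimal sketch of parsing rows in the layout above and selecting the row with the highest accuracy. It uses only a handful of rows copied from the table, and the names `EvalRow`, `parse_rows`, and `best_by_accuracy` are hypothetical helpers, not part of any library.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float


# A subset of rows copied verbatim from the table above
# (columns: Training Loss, Epoch, Step, Validation Loss, Accuracy).
RAW_ROWS = """\
1.1001 1.0 590 0.5777 0.6220
0.1872 54.0 31860 0.5233 0.7431
0.1425 80.0 47200 0.5943 0.7434
0.1203 98.0 57820 0.5132 0.7373
0.1222 100.0 59000 0.5179 0.7373
"""


def parse_rows(text: str) -> List[EvalRow]:
    """Parse whitespace-separated rows into typed records, skipping blanks."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        train_loss, epoch, step, val_loss, accuracy = line.split()
        rows.append(
            EvalRow(float(train_loss), float(epoch), int(step),
                    float(val_loss), float(accuracy))
        )
    return rows


def best_by_accuracy(rows: List[EvalRow]) -> EvalRow:
    """Return the row with the highest validation accuracy."""
    return max(rows, key=lambda r: r.accuracy)


if __name__ == "__main__":
    best = best_by_accuracy(parse_rows(RAW_ROWS))
    print(best.epoch, best.step, best.accuracy)
```

Across the full table, the highest accuracy is 0.7434 at epoch 80 (step 47200), slightly above the final-epoch value of 0.7373.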

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3

Dataset used to train Onutoa/1_9e-3_1_0.5