
1_6e-3_5_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9459
  • Accuracy: 0.7379
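The card does not document usage. As a hedged illustration, the sketch below loads the checkpoint for inference, assuming it is published under the repository id Onutoa/1_6e-3_5_0.1 (this card's repository) with a standard sequence-classification head; the example sentence pair and the meaning of the predicted label are hypothetical, since the card does not name the SuperGLUE subtask.

```python
# Minimal inference sketch; an illustration, not the author's documented usage.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

repo_id = "Onutoa/1_6e-3_5_0.1"  # repository id from this model page
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)

# Hypothetical input pair; the actual SuperGLUE subtask (and hence the
# expected input format and label meanings) is not documented here.
inputs = tokenizer("Example passage.", "Example question?", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```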

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
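The card does not say which SuperGLUE subtask was used. For orientation only: the results table below logs 590 optimizer steps per epoch at a train batch size of 16, i.e. roughly 9,440 examples, which is consistent with BoolQ's 9,427-example training split; that is an inference, not something the card states. A loading sketch under that assumption:

```python
# Sketch of loading a SuperGLUE subset with the datasets library.
# "boolq" is a guess inferred from the per-epoch step count; substitute
# the actual subtask if it is ever documented.
from datasets import load_dataset

dataset = load_dataset("super_glue", "boolq")
print(dataset)  # DatasetDict with train / validation / test splits
```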

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 0.006
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
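
A minimal sketch of how these settings map onto transformers.TrainingArguments (Transformers 4.30.0, per the framework list below); output_dir and the evaluation/logging strategies are assumptions inferred from the per-epoch rows in the results table:

```python
from transformers import TrainingArguments

# Hedged reconstruction: only the values listed above come from the card;
# everything marked "assumption" is filled in for illustration.
training_args = TrainingArguments(
    output_dir="1_6e-3_5_0.1",     # assumption: named after the run
    learning_rate=6e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",   # assumption: one eval per epoch, as in the table
    logging_strategy="epoch",      # assumption: one training-loss log per epoch
)
```

Note that 6e-3 is far above the 2e-5 to 5e-5 range usually recommended for fine-tuning BERT, which may explain the unstable early epochs in the results table (for example, the validation-loss spike at epoch 7).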

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|---------------|-------|-------|-----------------|----------|
| 1.3359        | 1.0   | 590   | 1.9035          | 0.3798   |
| 1.5253        | 2.0   | 1180  | 0.9944          | 0.6217   |
| 1.219         | 3.0   | 1770  | 0.8943          | 0.6190   |
| 1.1715        | 4.0   | 2360  | 0.9268          | 0.6205   |
| 1.0411        | 5.0   | 2950  | 0.8576          | 0.6220   |
| 1.0356        | 6.0   | 3540  | 0.9342          | 0.6067   |
| 0.9697        | 7.0   | 4130  | 1.9873          | 0.4131   |
| 0.9799        | 8.0   | 4720  | 1.7366          | 0.4492   |
| 0.9846        | 9.0   | 5310  | 1.3262          | 0.6330   |
| 0.9154        | 10.0  | 5900  | 1.0899          | 0.5697   |
| 0.8903        | 11.0  | 6490  | 0.8476          | 0.6242   |
| 0.8245        | 12.0  | 7080  | 0.9154          | 0.6902   |
| 0.8927        | 13.0  | 7670  | 0.7204          | 0.6930   |
| 0.7654        | 14.0  | 8260  | 0.8502          | 0.6908   |
| 0.7533        | 15.0  | 8850  | 0.9376          | 0.6398   |
| 0.8225        | 16.0  | 9440  | 0.7376          | 0.7073   |
| 0.6919        | 17.0  | 10030 | 1.2361          | 0.5688   |
| 0.6861        | 18.0  | 10620 | 1.1219          | 0.6116   |
| 0.6514        | 19.0  | 11210 | 0.7409          | 0.7073   |
| 0.669         | 20.0  | 11800 | 1.1160          | 0.6379   |
| 0.6611        | 21.0  | 12390 | 0.8790          | 0.7156   |
| 0.6422        | 22.0  | 12980 | 0.9649          | 0.6550   |
| 0.5883        | 23.0  | 13570 | 1.1373          | 0.6324   |
| 0.5804        | 24.0  | 14160 | 1.2809          | 0.6156   |
| 0.5509        | 25.0  | 14750 | 0.8749          | 0.7229   |
| 0.5318        | 26.0  | 15340 | 0.8741          | 0.6969   |
| 0.5223        | 27.0  | 15930 | 0.7777          | 0.7168   |
| 0.4971        | 28.0  | 16520 | 0.8501          | 0.6985   |
| 0.4599        | 29.0  | 17110 | 0.8999          | 0.7156   |
| 0.4617        | 30.0  | 17700 | 0.8970          | 0.7297   |
| 0.4523        | 31.0  | 18290 | 0.9297          | 0.7171   |
| 0.4334        | 32.0  | 18880 | 0.9673          | 0.7315   |
| 0.4215        | 33.0  | 19470 | 0.8755          | 0.7263   |
| 0.4088        | 34.0  | 20060 | 0.9157          | 0.6988   |
| 0.3842        | 35.0  | 20650 | 1.0157          | 0.7349   |
| 0.3913        | 36.0  | 21240 | 0.8419          | 0.7300   |
| 0.3737        | 37.0  | 21830 | 0.7792          | 0.7266   |
| 0.373         | 38.0  | 22420 | 0.8775          | 0.7257   |
| 0.3718        | 39.0  | 23010 | 0.8662          | 0.7309   |
| 0.3449        | 40.0  | 23600 | 0.9173          | 0.7257   |
| 0.3585        | 41.0  | 24190 | 0.8719          | 0.7339   |
| 0.3299        | 42.0  | 24780 | 0.9434          | 0.7208   |
| 0.3137        | 43.0  | 25370 | 0.9660          | 0.7324   |
| 0.3228        | 44.0  | 25960 | 0.8873          | 0.7266   |
| 0.3134        | 45.0  | 26550 | 0.8953          | 0.7202   |
| 0.2873        | 46.0  | 27140 | 0.8243          | 0.7297   |
| 0.301         | 47.0  | 27730 | 0.8633          | 0.7324   |
| 0.271         | 48.0  | 28320 | 0.9646          | 0.7217   |
| 0.2907        | 49.0  | 28910 | 0.9321          | 0.7318   |
| 0.2785        | 50.0  | 29500 | 0.8440          | 0.7407   |
| 0.2554        | 51.0  | 30090 | 1.0258          | 0.7116   |
| 0.2715        | 52.0  | 30680 | 0.9458          | 0.7223   |
| 0.2556        | 53.0  | 31270 | 0.8895          | 0.7450   |
| 0.2488        | 54.0  | 31860 | 0.8865          | 0.7410   |
| 0.2528        | 55.0  | 32450 | 0.9360          | 0.7330   |
| 0.2444        | 56.0  | 33040 | 1.0095          | 0.7373   |
| 0.2391        | 57.0  | 33630 | 0.9704          | 0.7428   |
| 0.2386        | 58.0  | 34220 | 0.9717          | 0.7401   |
| 0.2193        | 59.0  | 34810 | 0.9480          | 0.7434   |
| 0.2338        | 60.0  | 35400 | 1.0054          | 0.7315   |
| 0.229         | 61.0  | 35990 | 0.8469          | 0.7361   |
| 0.2187        | 62.0  | 36580 | 0.8841          | 0.7324   |
| 0.2127        | 63.0  | 37170 | 0.9744          | 0.7260   |
| 0.2142        | 64.0  | 37760 | 0.9097          | 0.7407   |
| 0.2138        | 65.0  | 38350 | 0.9503          | 0.7281   |
| 0.2078        | 66.0  | 38940 | 0.8941          | 0.7379   |
| 0.2027        | 67.0  | 39530 | 0.8893          | 0.7379   |
| 0.2019        | 68.0  | 40120 | 0.9128          | 0.7333   |
| 0.1911        | 69.0  | 40710 | 0.9662          | 0.7382   |
| 0.2022        | 70.0  | 41300 | 1.0329          | 0.7388   |
| 0.1882        | 71.0  | 41890 | 0.9666          | 0.7232   |
| 0.2163        | 72.0  | 42480 | 0.9655          | 0.7333   |
| 0.1884        | 73.0  | 43070 | 0.9855          | 0.7254   |
| 0.1947        | 74.0  | 43660 | 0.9542          | 0.7324   |
| 0.1861        | 75.0  | 44250 | 0.9777          | 0.7413   |
| 0.1833        | 76.0  | 44840 | 0.9576          | 0.7388   |
| 0.1861        | 77.0  | 45430 | 0.9108          | 0.7404   |
| 0.1838        | 78.0  | 46020 | 0.9292          | 0.7352   |
| 0.1764        | 79.0  | 46610 | 0.9273          | 0.7413   |
| 0.1752        | 80.0  | 47200 | 0.9498          | 0.7355   |
| 0.1709        | 81.0  | 47790 | 0.9724          | 0.7343   |
| 0.1722        | 82.0  | 48380 | 0.8921          | 0.7364   |
| 0.1701        | 83.0  | 48970 | 1.0262          | 0.7398   |
| 0.168         | 84.0  | 49560 | 0.9239          | 0.7346   |
| 0.1633        | 85.0  | 50150 | 0.9714          | 0.7349   |
| 0.1666        | 86.0  | 50740 | 0.9723          | 0.7398   |
| 0.1634        | 87.0  | 51330 | 0.9497          | 0.7419   |
| 0.1657        | 88.0  | 51920 | 0.9417          | 0.7358   |
| 0.1481        | 89.0  | 52510 | 0.9709          | 0.7419   |
| 0.1557        | 90.0  | 53100 | 0.9928          | 0.7312   |
| 0.1567        | 91.0  | 53690 | 0.9443          | 0.7388   |
| 0.1568        | 92.0  | 54280 | 0.9285          | 0.7367   |
| 0.1579        | 93.0  | 54870 | 0.9201          | 0.7376   |
| 0.157         | 94.0  | 55460 | 0.9334          | 0.7376   |
| 0.1515        | 95.0  | 56050 | 0.9646          | 0.7394   |
| 0.1486        | 96.0  | 56640 | 0.9589          | 0.7385   |
| 0.1468        | 97.0  | 57230 | 0.9423          | 0.7379   |
| 0.1453        | 98.0  | 57820 | 0.9497          | 0.7382   |
| 0.1406        | 99.0  | 58410 | 0.9602          | 0.7373   |
| 0.1467        | 100.0 | 59000 | 0.9459          | 0.7379   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
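
To reproduce this environment, pinning the listed versions should suffice (a sketch; pick the torch build, e.g. the cu117 wheel, that matches your CUDA setup):

```
transformers==4.30.0
torch==2.0.1
datasets==2.14.4
tokenizers==0.13.3
```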