1_8e-3_5_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9466
  • Accuracy: 0.7446
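
The snippet below is a minimal usage sketch, not taken from this card: the repo id `Onutoa/1_8e-3_5_0.1` comes from the card itself, but the specific SuperGLUE task is not stated here, so the sequence-classification head and the example sentence pair are assumptions.

```python
# Minimal inference sketch. Assumes the checkpoint carries a
# sequence-classification head; the SuperGLUE subtask is not stated
# in this card, so the example inputs are placeholders.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/1_8e-3_5_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Replace with inputs matching the task the model was fine-tuned on.
inputs = tokenizer(
    "Is the sky blue?",
    "The sky is blue on a clear day.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```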

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.008
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
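
As a rough guide, the hyperparameters above can be mapped onto the `transformers` `Trainer` API roughly as sketched below. This is an illustrative reconstruction, not the training script used for this model; the output directory and evaluation strategy are assumptions (the results table below logs metrics once per epoch).

```python
# Hedged sketch mapping the listed hyperparameters onto
# TrainingArguments (Transformers 4.30). Dataset loading,
# preprocessing, and the SuperGLUE subtask are omitted/assumed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_8e-3_5_0.1",       # hypothetical output directory
    learning_rate=8e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",     # per-epoch metrics, as in the table below
)
# The Adam settings listed above (betas=(0.9, 0.999), epsilon=1e-08)
# match the defaults of the AdamW optimizer used by Trainer.
```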

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 1.6159        | 1.0   | 590   | 1.3915          | 0.6214   |
| 1.4042        | 2.0   | 1180  | 1.2318          | 0.3810   |
| 1.2048        | 3.0   | 1770  | 0.9197          | 0.5642   |
| 1.2385        | 4.0   | 2360  | 0.9595          | 0.6220   |
| 1.1978        | 5.0   | 2950  | 1.2082          | 0.6220   |
| 1.1014        | 6.0   | 3540  | 1.3630          | 0.4590   |
| 1.0282        | 7.0   | 4130  | 1.1057          | 0.5538   |
| 0.9517        | 8.0   | 4720  | 0.9745          | 0.6789   |
| 0.9333        | 9.0   | 5310  | 0.7981          | 0.7040   |
| 0.8832        | 10.0  | 5900  | 0.7960          | 0.6979   |
| 0.8637        | 11.0  | 6490  | 0.7471          | 0.6920   |
| 0.8329        | 12.0  | 7080  | 0.7465          | 0.7104   |
| 0.7866        | 13.0  | 7670  | 0.7123          | 0.7034   |
| 0.7031        | 14.0  | 8260  | 0.8286          | 0.7089   |
| 0.6925        | 15.0  | 8850  | 0.7817          | 0.7061   |
| 0.6896        | 16.0  | 9440  | 0.7579          | 0.6963   |
| 0.6103        | 17.0  | 10030 | 0.8758          | 0.6563   |
| 0.6307        | 18.0  | 10620 | 1.1495          | 0.6211   |
| 0.5815        | 19.0  | 11210 | 0.7249          | 0.7315   |
| 0.554         | 20.0  | 11800 | 1.1488          | 0.6862   |
| 0.5376        | 21.0  | 12390 | 0.8074          | 0.7303   |
| 0.4969        | 22.0  | 12980 | 0.8280          | 0.6969   |
| 0.4813        | 23.0  | 13570 | 0.7972          | 0.7235   |
| 0.457         | 24.0  | 14160 | 0.8829          | 0.6807   |
| 0.4489        | 25.0  | 14750 | 0.7627          | 0.7303   |
| 0.4306        | 26.0  | 15340 | 0.9458          | 0.6945   |
| 0.4171        | 27.0  | 15930 | 1.0878          | 0.6823   |
| 0.4069        | 28.0  | 16520 | 0.8638          | 0.7125   |
| 0.3713        | 29.0  | 17110 | 0.9637          | 0.7306   |
| 0.3471        | 30.0  | 17700 | 0.8357          | 0.7205   |
| 0.341         | 31.0  | 18290 | 0.8430          | 0.7355   |
| 0.3677        | 32.0  | 18880 | 0.8911          | 0.7199   |
| 0.329         | 33.0  | 19470 | 1.0170          | 0.7000   |
| 0.3019        | 34.0  | 20060 | 0.8981          | 0.7214   |
| 0.2912        | 35.0  | 20650 | 0.8809          | 0.7306   |
| 0.2962        | 36.0  | 21240 | 0.9446          | 0.7327   |
| 0.3018        | 37.0  | 21830 | 0.9218          | 0.7254   |
| 0.2793        | 38.0  | 22420 | 0.8054          | 0.7327   |
| 0.2786        | 39.0  | 23010 | 0.9709          | 0.7180   |
| 0.2608        | 40.0  | 23600 | 1.0428          | 0.7407   |
| 0.2705        | 41.0  | 24190 | 1.2935          | 0.7266   |
| 0.2551        | 42.0  | 24780 | 0.8896          | 0.7294   |
| 0.2383        | 43.0  | 25370 | 0.9849          | 0.7361   |
| 0.2306        | 44.0  | 25960 | 0.9547          | 0.7278   |
| 0.23          | 45.0  | 26550 | 0.9607          | 0.7373   |
| 0.2192        | 46.0  | 27140 | 0.9475          | 0.7248   |
| 0.2276        | 47.0  | 27730 | 0.9442          | 0.7333   |
| 0.2129        | 48.0  | 28320 | 0.9928          | 0.7294   |
| 0.2245        | 49.0  | 28910 | 0.9539          | 0.7324   |
| 0.2229        | 50.0  | 29500 | 0.9369          | 0.7245   |
| 0.2036        | 51.0  | 30090 | 1.0106          | 0.7239   |
| 0.206         | 52.0  | 30680 | 0.9619          | 0.7410   |
| 0.2056        | 53.0  | 31270 | 0.9298          | 0.7376   |
| 0.2007        | 54.0  | 31860 | 0.9451          | 0.7333   |
| 0.1953        | 55.0  | 32450 | 0.9762          | 0.7223   |
| 0.1992        | 56.0  | 33040 | 0.9447          | 0.7416   |
| 0.1806        | 57.0  | 33630 | 0.9956          | 0.7440   |
| 0.1859        | 58.0  | 34220 | 1.0206          | 0.7391   |
| 0.191         | 59.0  | 34810 | 0.9121          | 0.7385   |
| 0.1729        | 60.0  | 35400 | 0.9958          | 0.7278   |
| 0.1773        | 61.0  | 35990 | 0.9859          | 0.7428   |
| 0.1738        | 62.0  | 36580 | 0.9922          | 0.7398   |
| 0.1709        | 63.0  | 37170 | 0.9094          | 0.7419   |
| 0.1734        | 64.0  | 37760 | 0.9329          | 0.7431   |
| 0.1698        | 65.0  | 38350 | 0.9349          | 0.7391   |
| 0.1614        | 66.0  | 38940 | 1.0098          | 0.7327   |
| 0.1609        | 67.0  | 39530 | 0.9705          | 0.7269   |
| 0.1606        | 68.0  | 40120 | 0.9001          | 0.7425   |
| 0.1564        | 69.0  | 40710 | 0.9798          | 0.7407   |
| 0.1588        | 70.0  | 41300 | 0.9898          | 0.7382   |
| 0.1585        | 71.0  | 41890 | 0.9410          | 0.7410   |
| 0.1554        | 72.0  | 42480 | 0.9762          | 0.7404   |
| 0.1471        | 73.0  | 43070 | 0.9262          | 0.7401   |
| 0.1474        | 74.0  | 43660 | 0.8916          | 0.7410   |
| 0.1504        | 75.0  | 44250 | 0.9635          | 0.7385   |
| 0.1482        | 76.0  | 44840 | 0.9420          | 0.7413   |
| 0.1538        | 77.0  | 45430 | 0.9594          | 0.7413   |
| 0.1426        | 78.0  | 46020 | 0.9633          | 0.7440   |
| 0.1419        | 79.0  | 46610 | 0.9489          | 0.7437   |
| 0.1452        | 80.0  | 47200 | 0.9420          | 0.7398   |
| 0.1437        | 81.0  | 47790 | 0.9826          | 0.7410   |
| 0.1489        | 82.0  | 48380 | 0.9691          | 0.7453   |
| 0.1386        | 83.0  | 48970 | 0.9704          | 0.7398   |
| 0.1341        | 84.0  | 49560 | 0.8968          | 0.7398   |
| 0.1345        | 85.0  | 50150 | 0.9537          | 0.7367   |
| 0.1314        | 86.0  | 50740 | 0.9844          | 0.7453   |
| 0.1291        | 87.0  | 51330 | 0.9527          | 0.7379   |
| 0.1286        | 88.0  | 51920 | 0.9672          | 0.7419   |
| 0.1261        | 89.0  | 52510 | 0.9531          | 0.7379   |
| 0.1274        | 90.0  | 53100 | 0.9543          | 0.7419   |
| 0.1227        | 91.0  | 53690 | 0.9765          | 0.7422   |
| 0.1276        | 92.0  | 54280 | 0.9331          | 0.7388   |
| 0.1222        | 93.0  | 54870 | 0.9318          | 0.7425   |
| 0.1287        | 94.0  | 55460 | 0.9397          | 0.7437   |
| 0.1207        | 95.0  | 56050 | 0.9776          | 0.7410   |
| 0.1213        | 96.0  | 56640 | 0.9470          | 0.7462   |
| 0.1243        | 97.0  | 57230 | 0.9408          | 0.7428   |
| 0.1197        | 98.0  | 57820 | 0.9454          | 0.7450   |
| 0.1251        | 99.0  | 58410 | 0.9556          | 0.7428   |
| 0.1173        | 100.0 | 59000 | 0.9466          | 0.7446   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3