1_1e-2_1_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4701
  • Accuracy: 0.7431

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.01
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
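
As a rough illustration of how the settings above fit together, the sketch below collects them under the argument names used by `transformers.TrainingArguments` and computes the learning rate a linear schedule would produce at a given step. It assumes no warmup (the card lists none) and takes the total of 59,000 optimizer steps from the last row of the training results. `TRAINING_KWARGS`, `TOTAL_STEPS`, and `linear_lr` are illustrative names, not part of the original training script.

```python
# Hyperparameters from the list above, keyed by the corresponding
# `transformers.TrainingArguments` argument names. The mapping is an
# assumption about how the run was configured.
TRAINING_KWARGS = {
    "learning_rate": 1e-2,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 8,
    "seed": 11,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 100.0,
}

TOTAL_STEPS = 59_000  # last step reported in the training results


def linear_lr(step, base_lr=TRAINING_KWARGS["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the scheduled learning rate at a few points in training.
    for step in (0, 590, 29_500, 59_000):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 0.01, reaches half that value at step 29,500, and decays to zero at step 59,000.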

Training results

Training Loss  Epoch  Step   Validation Loss  Accuracy
1.2311         1.0    590    1.5093           0.6217
1.0444         2.0    1180   0.5788           0.6196
0.9287         3.0    1770   1.3468           0.6217
0.8066         4.0    2360   0.7094           0.6217
0.6756         5.0    2950   0.5829           0.6486
0.5869         6.0    3540   0.5398           0.6670
0.5733         7.0    4130   0.6279           0.5716
0.5229         8.0    4720   0.4543           0.7061
0.4998         9.0    5310   0.4906           0.6685
0.476          10.0   5900   0.5972           0.6927
0.4498         11.0   6490   0.4602           0.7049
0.4082         12.0   7080   0.4432           0.7012
0.4072         13.0   7670   0.4585           0.6963
0.3746         14.0   8260   0.4281           0.7312
0.3652         15.0   8850   0.4691           0.7294
0.3505         16.0   9440   0.4156           0.7303
0.3375         17.0   10030  0.4299           0.7275
0.3298         18.0   10620  0.4948           0.7
0.3056         19.0   11210  0.4208           0.7275
0.2956         20.0   11800  0.4474           0.7324
0.2859         21.0   12390  0.5893           0.6746
0.2807         22.0   12980  0.4613           0.7291
0.2566         23.0   13570  0.4610           0.7235
0.249          24.0   14160  0.5434           0.7413
0.2391         25.0   14750  0.5110           0.7333
0.2421         26.0   15340  0.6915           0.6465
0.2556         27.0   15930  0.4759           0.7306
0.2271         28.0   16520  0.4690           0.7321
0.2295         29.0   17110  0.5012           0.7376
0.2283         30.0   17700  0.5150           0.7128
0.2054         31.0   18290  0.4737           0.7343
0.2157         32.0   18880  0.6032           0.7327
0.215          33.0   19470  0.4818           0.7297
0.196          34.0   20060  0.4894           0.7147
0.2001         35.0   20650  0.5326           0.7193
0.1955         36.0   21240  0.4826           0.7413
0.1947         37.0   21830  0.4625           0.7385
0.1912         38.0   22420  0.4764           0.7492
0.1946         39.0   23010  0.5615           0.7443
0.1898         40.0   23600  0.4870           0.7413
0.1789         41.0   24190  0.5526           0.7462
0.1803         42.0   24780  0.5021           0.7217
0.1708         43.0   25370  0.4751           0.7379
0.1835         44.0   25960  0.4738           0.7355
0.1738         45.0   26550  0.4759           0.7336
0.1726         46.0   27140  0.4928           0.7367
0.1756         47.0   27730  0.5380           0.7193
0.1617         48.0   28320  0.5119           0.7327
0.1725         49.0   28910  0.4884           0.7431
0.1643         50.0   29500  0.4968           0.7382
0.1593         51.0   30090  0.4708           0.7281
0.1645         52.0   30680  0.4943           0.7364
0.1566         53.0   31270  0.4820           0.7446
0.1555         54.0   31860  0.5117           0.7376
0.1584         55.0   32450  0.5269           0.7410
0.1587         56.0   33040  0.4650           0.7394
0.1527         57.0   33630  0.5007           0.7431
0.157          58.0   34220  0.4689           0.7413
0.1527         59.0   34810  0.4960           0.7306
0.1461         60.0   35400  0.5033           0.7416
0.1506         61.0   35990  0.4817           0.7459
0.153          62.0   36580  0.4782           0.7422
0.1417         63.0   37170  0.4808           0.7410
0.1477         64.0   37760  0.5090           0.7358
0.1467         65.0   38350  0.5180           0.7419
0.1416         66.0   38940  0.5055           0.7483
0.1407         67.0   39530  0.4779           0.7416
0.1407         68.0   40120  0.4661           0.7401
0.1379         69.0   40710  0.5172           0.7450
0.1432         70.0   41300  0.4883           0.7422
0.1455         71.0   41890  0.4853           0.7382
0.1348         72.0   42480  0.4934           0.7465
0.134          73.0   43070  0.4773           0.7462
0.1323         74.0   43660  0.5033           0.7428
0.1356         75.0   44250  0.5184           0.7483
0.1321         76.0   44840  0.4860           0.7382
0.1328         77.0   45430  0.4800           0.7422
0.1334         78.0   46020  0.4668           0.7489
0.128          79.0   46610  0.4930           0.7498
0.1315         80.0   47200  0.4808           0.7410
0.1236         81.0   47790  0.4718           0.7456
0.1286         82.0   48380  0.4723           0.7413
0.1264         83.0   48970  0.4987           0.7480
0.1273         84.0   49560  0.4582           0.7492
0.1243         85.0   50150  0.4713           0.7471
0.1286         86.0   50740  0.4913           0.7437
0.1186         87.0   51330  0.4953           0.7495
0.1194         88.0   51920  0.4805           0.7486
0.118          89.0   52510  0.4799           0.7474
0.1236         90.0   53100  0.4829           0.7471
0.1201         91.0   53690  0.4736           0.7474
0.1235         92.0   54280  0.4695           0.7431
0.1214         93.0   54870  0.4781           0.7446
0.1188         94.0   55460  0.4701           0.7456
0.1191         95.0   56050  0.4681           0.7456
0.1144         96.0   56640  0.4737           0.7453
0.1212         97.0   57230  0.4736           0.7446
0.1152         98.0   57820  0.4668           0.7410
0.1153         99.0   58410  0.4743           0.7437
0.1194         100.0  59000  0.4701           0.7431
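
The evaluation results reported at the top of this card match the final epoch, which is not necessarily the best row in the log. A minimal sketch of how one might scan rows in the format above for the strongest checkpoint follows. The `Row` type and `parse_rows` helper are hypothetical, and only a few rows copied from the table are included for brevity.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float


def parse_rows(text: str) -> List[Row]:
    """Parse whitespace-separated log rows; skip lines that are not 5 numbers."""
    rows = []
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # e.g. the header line, which splits into more tokens
        try:
            tl, ep, st, vl, acc = (float(p) for p in parts)
        except ValueError:
            continue  # non-numeric line
        rows.append(Row(tl, ep, int(st), vl, acc))
    return rows


# A handful of rows copied from the table above (not the full log).
SAMPLE = """
Training Loss Epoch Step Validation Loss Accuracy
1.2311 1.0 590 1.5093 0.6217
0.3505 16.0 9440 0.4156 0.7303
0.128 79.0 46610 0.4930 0.7498
0.1194 100.0 59000 0.4701 0.7431
"""

rows = parse_rows(SAMPLE)
best_acc = max(rows, key=lambda r: r.accuracy)   # highest validation accuracy
best_loss = min(rows, key=lambda r: r.val_loss)  # lowest validation loss

if __name__ == "__main__":
    print("best accuracy:", best_acc)
    print("lowest validation loss:", best_loss)
```

Applied to the full table, the same selection picks epoch 79 (accuracy 0.7498) by accuracy and epoch 16 (validation loss 0.4156) by validation loss, while training loss keeps falling through epoch 100.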

Framework versions

  • Transformers 4.30.0
  • PyTorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3