1_7e-3_10_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set (these match the final, epoch-100 row of the training results table):

  • Loss: 0.9382
  • Accuracy: 0.7557

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.007
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
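
To make the linear schedule concrete, here is a minimal sketch of how the learning rate would decay over this run. It assumes no warmup (none is listed above) and takes the total of 59,000 optimizer steps from the training results table (590 steps per epoch over 100 epochs). The function name linear_lr and its parameters are hypothetical and written for illustration only; this is not the training code itself.

```python
# Sketch of a linear learning-rate decay under stated assumptions:
# no warmup (an assumption; the card lists none) and 59,000 total steps.
BASE_LR = 0.007          # learning_rate from the card
STEPS_PER_EPOCH = 590    # step count after epoch 1 in the results table
NUM_EPOCHS = 100         # num_epochs from the card
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 59,000


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Return the learning rate at a given optimizer step.

    Ramps up linearly during warmup (if any), then decays linearly to zero
    at total_steps.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in (0, 50, 100):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:3d}  step {step:5d}  lr {linear_lr(step):.6f}")
    # epoch   0  step     0  lr 0.007000
    # epoch  50  step 29500  lr 0.003500
    # epoch 100  step 59000  lr 0.000000
```

Under these assumptions the rate starts at 0.007, is halved by epoch 50, and reaches zero at the final step.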

Training results

Training Loss   Epoch   Step   Validation Loss   Accuracy
2.7912 1.0 590 2.5545 0.3872
3.233 2.0 1180 2.8480 0.6217
2.7249 3.0 1770 2.7584 0.4037
2.5026 4.0 2360 1.8755 0.6113
2.235 5.0 2950 1.6668 0.6661
1.9303 6.0 3540 1.6441 0.6346
1.9491 7.0 4130 2.1352 0.5789
1.6294 8.0 4720 2.2811 0.6572
1.6591 9.0 5310 1.5834 0.6896
1.5251 10.0 5900 1.7600 0.6716
1.5112 11.0 6490 1.2400 0.6905
1.3972 12.0 7080 1.2023 0.7165
1.3804 13.0 7670 1.1972 0.7009
1.3085 14.0 8260 1.6154 0.7101
1.2559 15.0 8850 1.1741 0.7
1.2292 16.0 9440 1.1551 0.7028
1.1711 17.0 10030 1.9400 0.6242
1.1356 18.0 10620 1.1234 0.7165
1.0466 19.0 11210 1.0939 0.7312
1.1043 20.0 11800 1.2564 0.7183
0.9875 21.0 12390 1.1273 0.7135
0.9788 22.0 12980 1.0513 0.7187
0.9086 23.0 13570 1.0497 0.7312
0.9327 24.0 14160 1.1127 0.7046
0.8835 25.0 14750 1.3732 0.7235
0.8652 26.0 15340 1.6447 0.6511
0.843 27.0 15930 1.1686 0.7425
0.8072 28.0 16520 1.0110 0.7446
0.7735 29.0 17110 1.1610 0.7401
0.7717 30.0 17700 0.9851 0.7352
0.7746 31.0 18290 1.4960 0.7223
0.7439 32.0 18880 0.9772 0.7358
0.7534 33.0 19470 1.0034 0.7456
0.6874 34.0 20060 0.9894 0.7407
0.6877 35.0 20650 1.4460 0.6771
0.6816 36.0 21240 1.0221 0.7489
0.7158 37.0 21830 1.3579 0.7425
0.6694 38.0 22420 1.1472 0.7517
0.6586 39.0 23010 1.0499 0.7523
0.6418 40.0 23600 1.0344 0.7459
0.6366 41.0 24190 1.2582 0.7422
0.6289 42.0 24780 0.9833 0.7370
0.6065 43.0 25370 1.0209 0.7529
0.6053 44.0 25960 1.0147 0.7287
0.5958 45.0 26550 0.9454 0.7456
0.5637 46.0 27140 0.9789 0.7535
0.5818 47.0 27730 1.0014 0.7529
0.5743 48.0 28320 0.9380 0.7526
0.592 49.0 28910 0.9494 0.7385
0.5591 50.0 29500 0.9728 0.7523
0.5431 51.0 30090 0.9528 0.7502
0.5537 52.0 30680 0.9995 0.7410
0.5444 53.0 31270 0.9815 0.7538
0.5372 54.0 31860 0.9556 0.7517
0.5491 55.0 32450 0.9824 0.7459
0.5294 56.0 33040 0.9625 0.7391
0.5074 57.0 33630 0.9761 0.7538
0.5127 58.0 34220 1.1065 0.7587
0.5095 59.0 34810 0.9373 0.7434
0.5079 60.0 35400 0.9822 0.7532
0.4886 61.0 35990 1.0654 0.7627
0.5143 62.0 36580 0.9688 0.7520
0.4822 63.0 37170 0.9816 0.7373
0.4956 64.0 37760 0.9746 0.7477
0.4953 65.0 38350 0.9493 0.7544
0.4794 66.0 38940 1.0795 0.7532
0.4794 67.0 39530 0.9915 0.7575
0.48 68.0 40120 0.9385 0.7498
0.4633 69.0 40710 1.0949 0.7526
0.4749 70.0 41300 1.0207 0.7557
0.4657 71.0 41890 0.9383 0.7428
0.465 72.0 42480 1.0948 0.7581
0.4558 73.0 43070 0.9506 0.7492
0.4516 74.0 43660 1.0518 0.7606
0.4577 75.0 44250 1.0124 0.7575
0.4642 76.0 44840 0.9293 0.7526
0.4497 77.0 45430 0.9862 0.7541
0.4614 78.0 46020 0.9403 0.7566
0.4442 79.0 46610 0.9599 0.7581
0.4483 80.0 47200 0.9766 0.7593
0.4223 81.0 47790 0.9297 0.7547
0.4416 82.0 48380 0.9614 0.7587
0.4279 83.0 48970 0.9403 0.7587
0.4159 84.0 49560 1.0827 0.7569
0.4319 85.0 50150 0.9250 0.7505
0.427 86.0 50740 0.9475 0.7517
0.427 87.0 51330 0.9429 0.7523
0.4233 88.0 51920 0.9721 0.7581
0.4167 89.0 52510 0.9387 0.7557
0.4162 90.0 53100 0.9282 0.7544
0.4163 91.0 53690 0.9785 0.7566
0.4214 92.0 54280 0.9217 0.7517
0.4038 93.0 54870 0.9470 0.7584
0.4258 94.0 55460 0.9254 0.7550
0.4206 95.0 56050 0.9380 0.7569
0.4086 96.0 56640 0.9379 0.7578
0.3973 97.0 57230 0.9425 0.7557
0.3971 98.0 57820 0.9461 0.7572
0.3899 99.0 58410 0.9388 0.7557
0.4033 100.0 59000 0.9382 0.7557
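
The reported evaluation results correspond to the last row of the table, but the table also shows a higher accuracy at epoch 61 (0.7627) and a lower validation loss at epoch 92 (0.9217). The sketch below shows one way to pick the best row by either metric. It uses a hand-copied subset of rows from the table, and the names EvalRow and ROWS are hypothetical. The card does not state which checkpoint was actually saved, so this is only an illustration of how to read the table.

```python
# Sketch: compare the final epoch against the best rows in the table.
# ROWS is a hand-copied subset of the training results, not the full log.
from typing import NamedTuple


class EvalRow(NamedTuple):
    epoch: int
    step: int
    val_loss: float
    accuracy: float


ROWS = [
    EvalRow(1, 590, 2.5545, 0.3872),
    EvalRow(28, 16520, 1.0110, 0.7446),
    EvalRow(61, 35990, 1.0654, 0.7627),
    EvalRow(74, 43660, 1.0518, 0.7606),
    EvalRow(85, 50150, 0.9250, 0.7505),
    EvalRow(92, 54280, 0.9217, 0.7517),
    EvalRow(100, 59000, 0.9382, 0.7557),
]

best_acc = max(ROWS, key=lambda r: r.accuracy)   # highest accuracy
best_loss = min(ROWS, key=lambda r: r.val_loss)  # lowest validation loss
final = ROWS[-1]                                 # last epoch

if __name__ == "__main__":
    print(f"best accuracy:  epoch {best_acc.epoch} ({best_acc.accuracy:.4f})")
    print(f"lowest loss:    epoch {best_loss.epoch} ({best_loss.val_loss:.4f})")
    print(f"final epoch:    epoch {final.epoch} ({final.accuracy:.4f})")
    print(f"accuracy gap:   {best_acc.accuracy - final.accuracy:+.4f}")
```

The two criteria select different epochs here, so which row counts as "best" depends on the metric chosen.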

Framework versions

  • Transformers 4.30.0
  • PyTorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3

Dataset used to train Onutoa/1_7e-3_10_0.5