
1_9e-3_5_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8603
  • Accuracy: 0.7489
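
For a quick sanity check, the checkpoint can be loaded with the standard transformers API. The snippet below is a minimal sketch, assuming the repository id Onutoa/1_9e-3_5_0.5 and a sequence-classification head; the specific SuperGLUE task and label mapping are not stated in this card, so the example inputs and the interpretation of the output classes are illustrative only.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Assumption: the checkpoint id matches this repository and exposes a
# sequence-classification head; the exact SuperGLUE task is not stated here.
model_id = "Onutoa/1_9e-3_5_0.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Illustrative sentence-pair input; replace with inputs for the actual task.
inputs = tokenizer(
    "Is the sky blue?",
    "The sky appears blue on a clear day.",
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # class probabilities (labels are task-specific)
```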

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch reproducing them appears after this list):

  • learning_rate: 0.009
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
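
These settings map directly onto transformers.TrainingArguments. The sketch below is a hedged reconstruction, not the original training script: only the values listed above come from this card, while the output path and per-epoch evaluation strategy are illustrative assumptions (the latter is consistent with the epoch-level results table below). The Adam betas and epsilon shown are the Transformers defaults, matching the listed values.

```python
from transformers import TrainingArguments

# Hedged reconstruction of the listed hyperparameters; output_dir and
# evaluation_strategy are assumptions, not taken from the card.
training_args = TrainingArguments(
    output_dir="1_9e-3_5_0.5",           # assumed output path
    learning_rate=9e-3,
    per_device_train_batch_size=16,      # card's train_batch_size
    per_device_eval_batch_size=8,        # card's eval_batch_size
    seed=11,
    adam_beta1=0.9,                      # default Adam betas/epsilon,
    adam_beta2=0.999,                    # matching the values listed above
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",         # assumption: evaluate once per epoch
)
```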

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 2.7616        | 1.0   | 590   | 2.7583          | 0.3798   |
| 2.2507        | 2.0   | 1180  | 1.8432          | 0.6294   |
| 2.5953        | 3.0   | 1770  | 3.4928          | 0.4532   |
| 2.3305        | 4.0   | 2360  | 1.5737          | 0.6486   |
| 1.9577        | 5.0   | 2950  | 2.6604          | 0.6263   |
| 1.7557        | 6.0   | 3540  | 1.2734          | 0.6761   |
| 1.6227        | 7.0   | 4130  | 3.4140          | 0.5119   |
| 1.4961        | 8.0   | 4720  | 1.2029          | 0.7043   |
| 1.3331        | 9.0   | 5310  | 1.2170          | 0.7092   |
| 1.3007        | 10.0  | 5900  | 1.7625          | 0.6725   |
| 1.2049        | 11.0  | 6490  | 1.0667          | 0.7070   |
| 1.1087        | 12.0  | 7080  | 0.9915          | 0.7156   |
| 1.1023        | 13.0  | 7670  | 1.0683          | 0.6924   |
| 1.0404        | 14.0  | 8260  | 1.1711          | 0.7248   |
| 1.0287        | 15.0  | 8850  | 1.0966          | 0.7297   |
| 0.9405        | 16.0  | 9440  | 0.9352          | 0.7107   |
| 0.8558        | 17.0  | 10030 | 0.9269          | 0.7205   |
| 0.8273        | 18.0  | 10620 | 0.9574          | 0.7235   |
| 0.7798        | 19.0  | 11210 | 0.9598          | 0.7385   |
| 0.7646        | 20.0  | 11800 | 0.9004          | 0.7287   |
| 0.7505        | 21.0  | 12390 | 0.9389          | 0.7174   |
| 0.7273        | 22.0  | 12980 | 0.9234          | 0.7358   |
| 0.6971        | 23.0  | 13570 | 0.9055          | 0.7315   |
| 0.6815        | 24.0  | 14160 | 0.8711          | 0.7352   |
| 0.6729        | 25.0  | 14750 | 1.0923          | 0.7437   |
| 0.6151        | 26.0  | 15340 | 0.8950          | 0.7254   |
| 0.6291        | 27.0  | 15930 | 1.1086          | 0.6945   |
| 0.6243        | 28.0  | 16520 | 0.9179          | 0.7410   |
| 0.609         | 29.0  | 17110 | 1.0778          | 0.7410   |
| 0.5733        | 30.0  | 17700 | 0.9548          | 0.7422   |
| 0.5742        | 31.0  | 18290 | 1.1436          | 0.7413   |
| 0.5675        | 32.0  | 18880 | 0.8956          | 0.7450   |
| 0.5578        | 33.0  | 19470 | 0.9040          | 0.7382   |
| 0.5339        | 34.0  | 20060 | 0.8730          | 0.7453   |
| 0.5284        | 35.0  | 20650 | 1.0258          | 0.7486   |
| 0.5116        | 36.0  | 21240 | 1.2775          | 0.7382   |
| 0.5215        | 37.0  | 21830 | 0.9275          | 0.7477   |
| 0.5038        | 38.0  | 22420 | 0.8780          | 0.7394   |
| 0.5073        | 39.0  | 23010 | 0.9095          | 0.7468   |
| 0.4897        | 40.0  | 23600 | 0.8864          | 0.7410   |
| 0.4927        | 41.0  | 24190 | 1.1312          | 0.7391   |
| 0.4941        | 42.0  | 24780 | 0.8809          | 0.7339   |
| 0.4629        | 43.0  | 25370 | 1.1564          | 0.7419   |
| 0.4754        | 44.0  | 25960 | 0.9223          | 0.7413   |
| 0.457         | 45.0  | 26550 | 0.8677          | 0.7422   |
| 0.4398        | 46.0  | 27140 | 1.0571          | 0.7471   |
| 0.4612        | 47.0  | 27730 | 0.8773          | 0.7401   |
| 0.4464        | 48.0  | 28320 | 0.9260          | 0.7477   |
| 0.4779        | 49.0  | 28910 | 0.8712          | 0.7425   |
| 0.443         | 50.0  | 29500 | 0.8886          | 0.7413   |
| 0.4445        | 51.0  | 30090 | 0.8968          | 0.7431   |
| 0.4274        | 52.0  | 30680 | 0.9516          | 0.7495   |
| 0.4239        | 53.0  | 31270 | 0.8773          | 0.7443   |
| 0.4143        | 54.0  | 31860 | 1.0295          | 0.7401   |
| 0.4359        | 55.0  | 32450 | 0.8879          | 0.7453   |
| 0.4197        | 56.0  | 33040 | 0.8712          | 0.7489   |
| 0.397         | 57.0  | 33630 | 1.0037          | 0.7544   |
| 0.402         | 58.0  | 34220 | 0.8789          | 0.7554   |
| 0.4015        | 59.0  | 34810 | 0.8532          | 0.7523   |
| 0.4008        | 60.0  | 35400 | 0.8840          | 0.7523   |
| 0.3943        | 61.0  | 35990 | 0.9475          | 0.7462   |
| 0.3968        | 62.0  | 36580 | 0.9413          | 0.7465   |
| 0.394         | 63.0  | 37170 | 0.8878          | 0.7480   |
| 0.3914        | 64.0  | 37760 | 0.8737          | 0.7511   |
| 0.3959        | 65.0  | 38350 | 0.8553          | 0.7486   |
| 0.3881        | 66.0  | 38940 | 0.8905          | 0.7495   |
| 0.379         | 67.0  | 39530 | 0.8956          | 0.7489   |
| 0.3821        | 68.0  | 40120 | 0.8711          | 0.7514   |
| 0.3764        | 69.0  | 40710 | 0.9552          | 0.7557   |
| 0.3841        | 70.0  | 41300 | 0.9638          | 0.7523   |
| 0.3758        | 71.0  | 41890 | 0.8728          | 0.7453   |
| 0.376         | 72.0  | 42480 | 0.9654          | 0.7450   |
| 0.364         | 73.0  | 43070 | 1.0121          | 0.7477   |
| 0.3567        | 74.0  | 43660 | 1.0070          | 0.7508   |
| 0.3723        | 75.0  | 44250 | 0.9271          | 0.7508   |
| 0.3673        | 76.0  | 44840 | 0.8824          | 0.7450   |
| 0.3656        | 77.0  | 45430 | 0.8812          | 0.7477   |
| 0.3722        | 78.0  | 46020 | 0.8728          | 0.7502   |
| 0.3719        | 79.0  | 46610 | 0.8551          | 0.7465   |
| 0.3502        | 80.0  | 47200 | 0.8913          | 0.7523   |
| 0.3467        | 81.0  | 47790 | 0.8476          | 0.7489   |
| 0.348         | 82.0  | 48380 | 0.8885          | 0.7517   |
| 0.3498        | 83.0  | 48970 | 0.8690          | 0.7443   |
| 0.3457        | 84.0  | 49560 | 0.8824          | 0.7480   |
| 0.3463        | 85.0  | 50150 | 0.8450          | 0.7453   |
| 0.3465        | 86.0  | 50740 | 0.8760          | 0.7459   |
| 0.3418        | 87.0  | 51330 | 0.8702          | 0.7437   |
| 0.3394        | 88.0  | 51920 | 0.8782          | 0.7434   |
| 0.3371        | 89.0  | 52510 | 0.8950          | 0.7474   |
| 0.3309        | 90.0  | 53100 | 0.8568          | 0.7398   |
| 0.3321        | 91.0  | 53690 | 0.8973          | 0.7495   |
| 0.3385        | 92.0  | 54280 | 0.8401          | 0.7431   |
| 0.3264        | 93.0  | 54870 | 0.8658          | 0.7462   |
| 0.3382        | 94.0  | 55460 | 0.8652          | 0.7483   |
| 0.3279        | 95.0  | 56050 | 0.8785          | 0.7465   |
| 0.3274        | 96.0  | 56640 | 0.8666          | 0.7477   |
| 0.3272        | 97.0  | 57230 | 0.8666          | 0.7489   |
| 0.3147        | 98.0  | 57820 | 0.8641          | 0.7498   |
| 0.3172        | 99.0  | 58410 | 0.8616          | 0.7486   |
| 0.3256        | 100.0 | 59000 | 0.8603          | 0.7489   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
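
When reproducing results, it can help to confirm that the installed packages match the pinned versions above. A small convenience snippet (not part of the original card) that prints the installed versions alongside the ones this card was produced with:

```python
import transformers, torch, datasets, tokenizers

# Versions this card was produced with (see the list above).
expected = {
    "transformers": "4.30.0",
    "torch": "2.0.1+cu117",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}
for module in (transformers, torch, datasets, tokenizers):
    name = module.__name__
    print(f"{name}: installed {module.__version__}, card used {expected[name]}")
```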