1_5e-3_5_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9516
  • Accuracy: 0.7450

Model description

More information needed

Intended uses & limitations

More information needed
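
Pending more detail from the author, the snippet below is a minimal inference sketch rather than an official usage example. It assumes the checkpoint exposes a standard sequence-classification head under the Hub id Onutoa/1_5e-3_5_0.5; the card does not say which super_glue task was used, so the (question, passage) input pair is purely illustrative.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Onutoa/1_5e-3_5_0.5"  # Hub repository id (assumed from this card)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Illustrative BoolQ-style input; the actual task and input format are not documented.
inputs = tokenizer(
    "is the sky blue on a clear day",
    "On a clear day the sky appears blue to an observer on the ground.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # per-class probabilities
```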

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list for how they map onto TrainingArguments):

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
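
The original training script is not part of this card; the following is only a sketch of how the listed values map onto transformers TrainingArguments (for the Transformers 4.30.0 release listed below). The output directory and the per-epoch evaluation strategy are assumptions, the latter inferred from the once-per-epoch rows in the results table.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_5e-3_5_0.5",    # assumed; matches the model name
    learning_rate=5e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    num_train_epochs=100.0,
    lr_scheduler_type="linear",   # linear decay of the learning rate
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08, as listed above
    # (these are also the Trainer defaults):
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",  # assumed from the per-epoch results below
)
```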

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 2.4372 | 1.0 | 590 | 1.8593 | 0.6177 |
| 2.3953 | 2.0 | 1180 | 3.6910 | 0.3786 |
| 2.3694 | 3.0 | 1770 | 2.1033 | 0.4694 |
| 2.0494 | 4.0 | 2360 | 1.7694 | 0.6006 |
| 2.034 | 5.0 | 2950 | 1.7949 | 0.6355 |
| 1.8146 | 6.0 | 3540 | 1.7374 | 0.6159 |
| 1.896 | 7.0 | 4130 | 1.8850 | 0.5624 |
| 1.7794 | 8.0 | 4720 | 2.8405 | 0.6245 |
| 1.8298 | 9.0 | 5310 | 2.6985 | 0.4349 |
| 1.7892 | 10.0 | 5900 | 2.2049 | 0.6352 |
| 1.6916 | 11.0 | 6490 | 1.6606 | 0.6272 |
| 1.6384 | 12.0 | 7080 | 1.5955 | 0.6394 |
| 1.6382 | 13.0 | 7670 | 1.6722 | 0.6596 |
| 1.6078 | 14.0 | 8260 | 1.4874 | 0.6587 |
| 1.5373 | 15.0 | 8850 | 1.4382 | 0.6642 |
| 1.4655 | 16.0 | 9440 | 1.4120 | 0.6700 |
| 1.4354 | 17.0 | 10030 | 2.0067 | 0.6532 |
| 1.4021 | 18.0 | 10620 | 1.7860 | 0.5875 |
| 1.3537 | 19.0 | 11210 | 1.4043 | 0.6853 |
| 1.3638 | 20.0 | 11800 | 1.3726 | 0.6875 |
| 1.3061 | 21.0 | 12390 | 1.3332 | 0.6740 |
| 1.3052 | 22.0 | 12980 | 1.2831 | 0.6939 |
| 1.4056 | 23.0 | 13570 | 1.4235 | 0.6835 |
| 1.3389 | 24.0 | 14160 | 1.5395 | 0.6817 |
| 1.2294 | 25.0 | 14750 | 1.2364 | 0.6994 |
| 1.2213 | 26.0 | 15340 | 1.1806 | 0.7012 |
| 1.203 | 27.0 | 15930 | 1.3771 | 0.6538 |
| 1.1667 | 28.0 | 16520 | 1.3193 | 0.6820 |
| 1.1516 | 29.0 | 17110 | 1.3490 | 0.6621 |
| 1.1657 | 30.0 | 17700 | 1.1866 | 0.7015 |
| 1.1212 | 31.0 | 18290 | 1.2403 | 0.6991 |
| 1.0632 | 32.0 | 18880 | 1.1608 | 0.7138 |
| 1.0702 | 33.0 | 19470 | 1.3606 | 0.6642 |
| 1.0609 | 34.0 | 20060 | 1.1448 | 0.6972 |
| 1.0407 | 35.0 | 20650 | 1.2761 | 0.6838 |
| 1.0151 | 36.0 | 21240 | 2.0245 | 0.6862 |
| 1.0246 | 37.0 | 21830 | 1.0999 | 0.7012 |
| 0.9971 | 38.0 | 22420 | 1.1661 | 0.6997 |
| 0.9732 | 39.0 | 23010 | 1.1978 | 0.7187 |
| 0.9642 | 40.0 | 23600 | 1.0760 | 0.7245 |
| 0.9628 | 41.0 | 24190 | 1.2119 | 0.7223 |
| 0.9605 | 42.0 | 24780 | 1.0589 | 0.7245 |
| 0.9297 | 43.0 | 25370 | 1.0496 | 0.7297 |
| 0.9282 | 44.0 | 25960 | 1.0384 | 0.7324 |
| 0.8927 | 45.0 | 26550 | 1.0954 | 0.7284 |
| 0.8753 | 46.0 | 27140 | 1.0344 | 0.7343 |
| 0.8787 | 47.0 | 27730 | 1.0238 | 0.7162 |
| 0.8397 | 48.0 | 28320 | 1.0650 | 0.7162 |
| 0.9109 | 49.0 | 28910 | 1.0901 | 0.7297 |
| 0.8609 | 50.0 | 29500 | 1.0152 | 0.7300 |
| 0.823 | 51.0 | 30090 | 1.1109 | 0.7128 |
| 0.8029 | 52.0 | 30680 | 1.0899 | 0.7113 |
| 0.8142 | 53.0 | 31270 | 1.0185 | 0.7339 |
| 0.7967 | 54.0 | 31860 | 0.9917 | 0.7336 |
| 0.7919 | 55.0 | 32450 | 1.0096 | 0.7352 |
| 0.7883 | 56.0 | 33040 | 1.0033 | 0.7355 |
| 0.7794 | 57.0 | 33630 | 1.0478 | 0.7336 |
| 0.7444 | 58.0 | 34220 | 1.0485 | 0.7284 |
| 0.7646 | 59.0 | 34810 | 1.0046 | 0.7242 |
| 0.7493 | 60.0 | 35400 | 0.9997 | 0.7300 |
| 0.7126 | 61.0 | 35990 | 0.9838 | 0.7398 |
| 0.7303 | 62.0 | 36580 | 0.9983 | 0.7300 |
| 0.7184 | 63.0 | 37170 | 1.1151 | 0.7156 |
| 0.711 | 64.0 | 37760 | 1.0758 | 0.7220 |
| 0.6963 | 65.0 | 38350 | 0.9884 | 0.7281 |
| 0.6972 | 66.0 | 38940 | 0.9688 | 0.7336 |
| 0.6927 | 67.0 | 39530 | 0.9794 | 0.7339 |
| 0.6923 | 68.0 | 40120 | 0.9681 | 0.7379 |
| 0.6829 | 69.0 | 40710 | 1.0167 | 0.7440 |
| 0.6705 | 70.0 | 41300 | 0.9709 | 0.7358 |
| 0.6717 | 71.0 | 41890 | 1.0276 | 0.7226 |
| 0.6683 | 72.0 | 42480 | 0.9858 | 0.7324 |
| 0.6405 | 73.0 | 43070 | 0.9954 | 0.7336 |
| 0.6423 | 74.0 | 43660 | 0.9730 | 0.7339 |
| 0.6628 | 75.0 | 44250 | 1.0100 | 0.7388 |
| 0.6528 | 76.0 | 44840 | 0.9663 | 0.7398 |
| 0.6327 | 77.0 | 45430 | 0.9619 | 0.7358 |
| 0.6434 | 78.0 | 46020 | 0.9671 | 0.7361 |
| 0.6261 | 79.0 | 46610 | 0.9778 | 0.7248 |
| 0.6312 | 80.0 | 47200 | 0.9802 | 0.7343 |
| 0.6098 | 81.0 | 47790 | 0.9736 | 0.7431 |
| 0.6221 | 82.0 | 48380 | 0.9820 | 0.7330 |
| 0.6166 | 83.0 | 48970 | 0.9587 | 0.7431 |
| 0.6072 | 84.0 | 49560 | 0.9671 | 0.7370 |
| 0.5986 | 85.0 | 50150 | 0.9629 | 0.7385 |
| 0.5959 | 86.0 | 50740 | 0.9576 | 0.7407 |
| 0.5858 | 87.0 | 51330 | 0.9793 | 0.7428 |
| 0.5846 | 88.0 | 51920 | 0.9722 | 0.7404 |
| 0.5879 | 89.0 | 52510 | 0.9822 | 0.7394 |
| 0.582 | 90.0 | 53100 | 0.9625 | 0.7422 |
| 0.5805 | 91.0 | 53690 | 0.9856 | 0.7443 |
| 0.5767 | 92.0 | 54280 | 0.9560 | 0.7404 |
| 0.5711 | 93.0 | 54870 | 0.9629 | 0.7440 |
| 0.5769 | 94.0 | 55460 | 0.9560 | 0.7431 |
| 0.557 | 95.0 | 56050 | 0.9562 | 0.7434 |
| 0.5706 | 96.0 | 56640 | 0.9565 | 0.7440 |
| 0.5691 | 97.0 | 57230 | 0.9515 | 0.7425 |
| 0.5496 | 98.0 | 57820 | 0.9570 | 0.7410 |
| 0.5643 | 99.0 | 58410 | 0.9512 | 0.7434 |
| 0.5539 | 100.0 | 59000 | 0.9516 | 0.7450 |
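
The run evaluates every 590 optimizer steps, i.e. once per epoch, for 59,000 steps in total. With the linear scheduler and no warmup (the Trainer default when none is listed, and an assumption here), the learning rate decays in a straight line from 5e-3 to 0 over those steps, as the sketch below illustrates.

```python
# Straight-line decay over the full run: 100 epochs x 590 steps = 59,000 steps.
def linear_lr(step: int, total_steps: int = 59_000, base_lr: float = 5e-3) -> float:
    """Learning rate under a linear schedule with zero warmup (assumed)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))       # 0.005 at the first step
print(linear_lr(29_500))  # 0.0025 halfway through (end of epoch 50)
print(linear_lr(59_000))  # 0.0 at the final step
```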

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3