# 1_6e-3_10_0.5
This model is a fine-tuned version of [bert-large-uncased](https://huggingface.co/bert-large-uncased) on the super_glue dataset. It achieves the following results on the evaluation set (a loading sketch follows the results):
- Loss: 0.9536
- Accuracy: 0.7596
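For reference, a checkpoint like this is typically loaded through the Transformers sequence-classification API. A minimal sketch, assuming a placeholder repository id (the card does not state the actual model path, nor which SuperGLUE task the classification head was trained for):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder repo id -- substitute the actual model path.
model_id = "path/to/1_6e-3_10_0.5"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# SuperGLUE tasks are (mostly sentence-pair) classification tasks; the exact
# input format depends on which task this checkpoint was fine-tuned on.
inputs = tokenizer("First sentence.", "Second sentence.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```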
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the sketch after this list):
- learning_rate: 0.006
- train_batch_size: 16
- eval_batch_size: 8
- seed: 11
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100.0
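These values follow the format the Hugging Face `Trainer` writes into auto-generated cards. A minimal sketch of equivalent `TrainingArguments`, assuming the standard `Trainer` setup (`output_dir` and the per-epoch evaluation/logging strategies are assumptions, inferred from the per-epoch validation rows below):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_6e-3_10_0.5",   # assumption: illustrative name only
    learning_rate=6e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    num_train_epochs=100.0,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",  # assumption: matches the per-epoch eval rows
    logging_strategy="epoch",     # assumption: matches the per-epoch loss rows
)
```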
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
2.948 | 1.0 | 590 | 2.2396 | 0.6214 |
2.5635 | 2.0 | 1180 | 2.2693 | 0.6275 |
2.5246 | 3.0 | 1770 | 1.9556 | 0.6141 |
2.329 | 4.0 | 2360 | 2.3951 | 0.4801 |
2.1726 | 5.0 | 2950 | 1.7234 | 0.6618 |
2.0265 | 6.0 | 3540 | 1.5347 | 0.6679 |
2.0227 | 7.0 | 4130 | 1.8508 | 0.6064 |
1.8725 | 8.0 | 4720 | 2.0863 | 0.6584 |
1.8575 | 9.0 | 5310 | 4.0052 | 0.4639 |
1.8071 | 10.0 | 5900 | 3.1552 | 0.6468 |
1.6655 | 11.0 | 6490 | 1.3147 | 0.7104 |
1.501 | 12.0 | 7080 | 1.3005 | 0.6844 |
1.538 | 13.0 | 7670 | 1.7051 | 0.6948 |
1.4114 | 14.0 | 8260 | 1.4922 | 0.7028 |
1.3916 | 15.0 | 8850 | 1.6514 | 0.7034 |
1.3373 | 16.0 | 9440 | 1.9420 | 0.5896 |
1.271 | 17.0 | 10030 | 2.9731 | 0.6624 |
1.3123 | 18.0 | 10620 | 1.4756 | 0.6609 |
1.2775 | 19.0 | 11210 | 1.4888 | 0.6612 |
1.2341 | 20.0 | 11800 | 1.4493 | 0.7159 |
1.1907 | 21.0 | 12390 | 1.7638 | 0.7110 |
1.2035 | 22.0 | 12980 | 1.0716 | 0.7291 |
1.0365 | 23.0 | 13570 | 1.2975 | 0.6853 |
1.1041 | 24.0 | 14160 | 1.0275 | 0.7220 |
1.1326 | 25.0 | 14750 | 1.0228 | 0.7385 |
1.0261 | 26.0 | 15340 | 1.1473 | 0.7076 |
1.0168 | 27.0 | 15930 | 1.0435 | 0.7205 |
1.0653 | 28.0 | 16520 | 1.0105 | 0.7358 |
0.9418 | 29.0 | 17110 | 1.0397 | 0.7232 |
1.0591 | 30.0 | 17700 | 1.3640 | 0.6917 |
0.9186 | 31.0 | 18290 | 0.9679 | 0.7459 |
0.8665 | 32.0 | 18880 | 1.0310 | 0.7303 |
0.9005 | 33.0 | 19470 | 1.0498 | 0.7235 |
0.8494 | 34.0 | 20060 | 0.9766 | 0.7358 |
0.8474 | 35.0 | 20650 | 1.0077 | 0.7465 |
0.7973 | 36.0 | 21240 | 1.0674 | 0.7428 |
0.8049 | 37.0 | 21830 | 1.0074 | 0.7398 |
0.8241 | 38.0 | 22420 | 0.9613 | 0.7453 |
0.7793 | 39.0 | 23010 | 0.9864 | 0.7398 |
0.7781 | 40.0 | 23600 | 1.0741 | 0.7456 |
0.7539 | 41.0 | 24190 | 0.9809 | 0.7550 |
0.7403 | 42.0 | 24780 | 0.9993 | 0.7339 |
0.7494 | 43.0 | 25370 | 0.9887 | 0.7477 |
0.7091 | 44.0 | 25960 | 1.1792 | 0.7125 |
0.7236 | 45.0 | 26550 | 0.9549 | 0.7443 |
0.6947 | 46.0 | 27140 | 1.3568 | 0.7440 |
0.6928 | 47.0 | 27730 | 1.0682 | 0.7517 |
0.6578 | 48.0 | 28320 | 1.0993 | 0.7486 |
0.7723 | 49.0 | 28910 | 1.0381 | 0.7260 |
0.7169 | 50.0 | 29500 | 0.9510 | 0.7486 |
0.6424 | 51.0 | 30090 | 1.0781 | 0.7281 |
0.6652 | 52.0 | 30680 | 0.9623 | 0.7541 |
0.6274 | 53.0 | 31270 | 0.9476 | 0.7498 |
0.6295 | 54.0 | 31860 | 0.9461 | 0.7474 |
0.6252 | 55.0 | 32450 | 1.0873 | 0.7278 |
0.632 | 56.0 | 33040 | 0.9470 | 0.7492 |
0.5865 | 57.0 | 33630 | 1.4737 | 0.7355 |
0.6029 | 58.0 | 34220 | 1.0871 | 0.7477 |
0.5935 | 59.0 | 34810 | 1.0781 | 0.7514 |
0.6023 | 60.0 | 35400 | 0.9968 | 0.7581 |
0.5849 | 61.0 | 35990 | 1.0700 | 0.7547 |
0.5813 | 62.0 | 36580 | 1.2525 | 0.7425 |
0.5557 | 63.0 | 37170 | 0.9643 | 0.7541 |
0.541 | 64.0 | 37760 | 1.0179 | 0.7547 |
0.5693 | 65.0 | 38350 | 1.0064 | 0.7401 |
0.5562 | 66.0 | 38940 | 1.2333 | 0.7367 |
0.5677 | 67.0 | 39530 | 0.9976 | 0.7388 |
0.5357 | 68.0 | 40120 | 0.9795 | 0.7413 |
0.5372 | 69.0 | 40710 | 1.1113 | 0.7462 |
0.5563 | 70.0 | 41300 | 1.1366 | 0.7492 |
0.5377 | 71.0 | 41890 | 0.9343 | 0.7502 |
0.5442 | 72.0 | 42480 | 1.1735 | 0.7465 |
0.5124 | 73.0 | 43070 | 0.9499 | 0.7514 |
0.5007 | 74.0 | 43660 | 1.2104 | 0.7456 |
0.5094 | 75.0 | 44250 | 0.9865 | 0.7474 |
0.5118 | 76.0 | 44840 | 1.0542 | 0.7474 |
0.5166 | 77.0 | 45430 | 0.9762 | 0.7615 |
0.5071 | 78.0 | 46020 | 0.9333 | 0.7581 |
0.4961 | 79.0 | 46610 | 1.0310 | 0.7535 |
0.4863 | 80.0 | 47200 | 1.0242 | 0.7492 |
0.4801 | 81.0 | 47790 | 1.0528 | 0.7535 |
0.4975 | 82.0 | 48380 | 1.0188 | 0.7554 |
0.4868 | 83.0 | 48970 | 0.9455 | 0.7596 |
0.4661 | 84.0 | 49560 | 0.9841 | 0.7557 |
0.4765 | 85.0 | 50150 | 0.9570 | 0.7538 |
0.4732 | 86.0 | 50740 | 1.0383 | 0.7535 |
0.4846 | 87.0 | 51330 | 0.9560 | 0.7587 |
0.4641 | 88.0 | 51920 | 0.9716 | 0.7578 |
0.477 | 89.0 | 52510 | 0.9581 | 0.7606 |
0.4567 | 90.0 | 53100 | 0.9674 | 0.7569 |
0.4567 | 91.0 | 53690 | 0.9718 | 0.7587 |
0.4676 | 92.0 | 54280 | 0.9535 | 0.7520 |
0.4532 | 93.0 | 54870 | 0.9593 | 0.7563 |
0.4727 | 94.0 | 55460 | 0.9611 | 0.7584 |
0.4535 | 95.0 | 56050 | 0.9539 | 0.7602 |
0.4569 | 96.0 | 56640 | 0.9506 | 0.7587 |
0.4417 | 97.0 | 57230 | 0.9616 | 0.7584 |
0.4314 | 98.0 | 57820 | 0.9488 | 0.7593 |
0.4318 | 99.0 | 58410 | 0.9439 | 0.7587 |
0.4415 | 100.0 | 59000 | 0.9536 | 0.7596 |
### Framework versions
- Transformers 4.30.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.4
- Tokenizers 0.13.3
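A quick sanity check that a local environment matches these versions (a minimal sketch; version mismatches may still work but are untested here):

```python
import datasets
import tokenizers
import torch
import transformers

# Versions listed in this card.
expected = {
    "transformers": (transformers.__version__, "4.30.0"),
    "torch": (torch.__version__, "2.0.1+cu117"),
    "datasets": (datasets.__version__, "2.14.4"),
    "tokenizers": (tokenizers.__version__, "0.13.3"),
}
for name, (found, wanted) in expected.items():
    status = "OK" if found == wanted else f"mismatch (found {found})"
    print(f"{name}=={wanted}: {status}")
```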