# 1_8e-3_10_0.5

This model is a fine-tuned version of [bert-large-uncased](https://huggingface.co/bert-large-uncased) on the super_glue dataset. It achieves the following results on the evaluation set:
- Loss: 0.9754
- Accuracy: 0.7459
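
The card does not state which SuperGLUE task this checkpoint was fine-tuned on, so the snippet below is only a minimal usage sketch under that caveat: it assumes a standard sequence-classification head, and `MODEL_ID` is a hypothetical placeholder for this model's repository id.

```python
# Minimal inference sketch. MODEL_ID is a hypothetical placeholder, and the
# sentence pair is an arbitrary example, since the card does not name the
# SuperGLUE task this checkpoint was trained on.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "your-username/1_8e-3_10_0.5"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

inputs = tokenizer(
    "Is the sky blue?",                   # first text (e.g. a question)
    "The sky appears blue in daylight.",  # second text (e.g. a passage)
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```
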
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows this list):
- learning_rate: 0.008
- train_batch_size: 16
- eval_batch_size: 8
- seed: 11
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100.0
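
Below is a reproduction sketch of this configuration using the `Trainer` API, not the original training script. The SuperGLUE subset is an assumption: at 590 steps per epoch with a train batch size of 16 (≈9,440 examples), the numbers are consistent with `boolq` (9,427 training examples), but the card does not name the task.

```python
# Reproduction sketch, not the original training script. The "boolq" subset
# and the tokenization settings (max_length, padding) are assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

raw = load_dataset("super_glue", "boolq")  # assumed subset (see note above)
tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased")

def tokenize(batch):
    return tokenizer(
        batch["question"], batch["passage"],
        truncation=True, padding="max_length", max_length=256,
    )

encoded = raw.map(tokenize, batched=True)

# Hyperparameters copied from the list above; Adam betas/epsilon are the
# Trainer defaults, matching the values stated in this card.
args = TrainingArguments(
    output_dir="out",
    learning_rate=8e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",
)

model = AutoModelForSequenceClassification.from_pretrained("bert-large-uncased")
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
)
# trainer.train()  # uncomment to launch training
```
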
### Training results

Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
3.0295 | 1.0 | 590 | 5.2308 | 0.6217 |
3.1648 | 2.0 | 1180 | 2.6673 | 0.3908 |
2.5921 | 3.0 | 1770 | 5.0497 | 0.3761 |
2.9042 | 4.0 | 2360 | 2.2586 | 0.6291 |
2.4411 | 5.0 | 2950 | 6.5105 | 0.6217 |
2.3131 | 6.0 | 3540 | 2.7244 | 0.5183 |
2.0563 | 7.0 | 4130 | 4.6938 | 0.3783 |
1.9468 | 8.0 | 4720 | 1.5045 | 0.6862 |
1.9269 | 9.0 | 5310 | 1.7666 | 0.6734 |
1.9701 | 10.0 | 5900 | 1.8173 | 0.6780 |
1.8231 | 11.0 | 6490 | 1.6929 | 0.6752 |
1.7563 | 12.0 | 7080 | 1.3455 | 0.6862 |
1.726 | 13.0 | 7670 | 1.2870 | 0.6786 |
1.6706 | 14.0 | 8260 | 1.3862 | 0.6951 |
1.5876 | 15.0 | 8850 | 1.4384 | 0.6587 |
1.5067 | 16.0 | 9440 | 1.5336 | 0.6985 |
1.5777 | 17.0 | 10030 | 1.9860 | 0.5972 |
1.4323 | 18.0 | 10620 | 1.2068 | 0.7076 |
1.4228 | 19.0 | 11210 | 1.8071 | 0.6780 |
1.4335 | 20.0 | 11800 | 4.1127 | 0.6346 |
1.4549 | 21.0 | 12390 | 1.2302 | 0.7131 |
1.277 | 22.0 | 12980 | 1.2829 | 0.6771 |
1.2962 | 23.0 | 13570 | 1.2152 | 0.7070 |
1.4076 | 24.0 | 14160 | 1.5758 | 0.6529 |
1.3427 | 25.0 | 14750 | 1.1333 | 0.6997 |
1.1936 | 26.0 | 15340 | 1.1974 | 0.6917 |
1.1937 | 27.0 | 15930 | 1.2653 | 0.6948 |
1.2784 | 28.0 | 16520 | 1.0620 | 0.7242 |
1.1605 | 29.0 | 17110 | 2.7859 | 0.6734 |
1.1438 | 30.0 | 17700 | 1.8633 | 0.6428 |
1.1406 | 31.0 | 18290 | 1.6275 | 0.7098 |
1.0993 | 32.0 | 18880 | 1.2765 | 0.6969 |
1.158 | 33.0 | 19470 | 1.1218 | 0.7058 |
1.0432 | 34.0 | 20060 | 1.0562 | 0.7245 |
1.0295 | 35.0 | 20650 | 1.3146 | 0.7251 |
1.0041 | 36.0 | 21240 | 1.0308 | 0.7150 |
1.0104 | 37.0 | 21830 | 1.0149 | 0.7242 |
1.0096 | 38.0 | 22420 | 1.1232 | 0.7083 |
0.9661 | 39.0 | 23010 | 1.0316 | 0.7251 |
0.9183 | 40.0 | 23600 | 1.2166 | 0.7055 |
0.9298 | 41.0 | 24190 | 1.9118 | 0.7040 |
0.8799 | 42.0 | 24780 | 1.0190 | 0.7306 |
0.954 | 43.0 | 25370 | 1.0761 | 0.7263 |
0.853 | 44.0 | 25960 | 1.2006 | 0.7080 |
1.0647 | 45.0 | 26550 | 1.1605 | 0.7379 |
0.8562 | 46.0 | 27140 | 1.2208 | 0.7122 |
0.8421 | 47.0 | 27730 | 0.9974 | 0.7388 |
0.7865 | 48.0 | 28320 | 1.1207 | 0.7376 |
0.8998 | 49.0 | 28910 | 1.1221 | 0.7080 |
0.8044 | 50.0 | 29500 | 1.0191 | 0.7205 |
0.7771 | 51.0 | 30090 | 0.9921 | 0.7364 |
0.7886 | 52.0 | 30680 | 1.1379 | 0.7419 |
0.7756 | 53.0 | 31270 | 1.3039 | 0.7315 |
0.7232 | 54.0 | 31860 | 1.1143 | 0.7385 |
0.69 | 55.0 | 32450 | 1.1024 | 0.7239 |
0.7313 | 56.0 | 33040 | 1.3560 | 0.7370 |
0.7266 | 57.0 | 33630 | 0.9763 | 0.7431 |
0.7084 | 58.0 | 34220 | 1.4480 | 0.7291 |
0.7072 | 59.0 | 34810 | 1.4463 | 0.7336 |
0.6889 | 60.0 | 35400 | 1.2983 | 0.7330 |
0.6745 | 61.0 | 35990 | 0.9898 | 0.7413 |
0.6739 | 62.0 | 36580 | 0.9817 | 0.7373 |
0.6513 | 63.0 | 37170 | 0.9999 | 0.7391 |
0.6665 | 64.0 | 37760 | 0.9840 | 0.7367 |
0.6428 | 65.0 | 38350 | 1.0120 | 0.7284 |
0.6418 | 66.0 | 38940 | 1.0021 | 0.7401 |
0.6185 | 67.0 | 39530 | 1.0063 | 0.7327 |
0.6259 | 68.0 | 40120 | 1.0108 | 0.7339 |
0.6165 | 69.0 | 40710 | 1.0279 | 0.7440 |
0.6393 | 70.0 | 41300 | 1.1899 | 0.7183 |
0.5869 | 71.0 | 41890 | 0.9767 | 0.7333 |
0.605 | 72.0 | 42480 | 1.4097 | 0.7367 |
0.5906 | 73.0 | 43070 | 1.0036 | 0.7358 |
0.5704 | 74.0 | 43660 | 1.3105 | 0.7443 |
0.5872 | 75.0 | 44250 | 1.0241 | 0.7242 |
0.5755 | 76.0 | 44840 | 1.1519 | 0.7410 |
0.5967 | 77.0 | 45430 | 1.1481 | 0.7431 |
0.57 | 78.0 | 46020 | 1.0164 | 0.7398 |
0.5599 | 79.0 | 46610 | 1.1657 | 0.7391 |
0.5458 | 80.0 | 47200 | 1.1020 | 0.7422 |
0.5299 | 81.0 | 47790 | 1.0836 | 0.7437 |
0.5285 | 82.0 | 48380 | 0.9682 | 0.7391 |
0.538 | 83.0 | 48970 | 1.1895 | 0.7193 |
0.5277 | 84.0 | 49560 | 0.9778 | 0.7459 |
0.525 | 85.0 | 50150 | 0.9893 | 0.7364 |
0.5268 | 86.0 | 50740 | 0.9745 | 0.7434 |
0.518 | 87.0 | 51330 | 0.9654 | 0.7450 |
0.5212 | 88.0 | 51920 | 0.9665 | 0.7382 |
0.5132 | 89.0 | 52510 | 1.0605 | 0.7474 |
0.5155 | 90.0 | 53100 | 0.9605 | 0.7440 |
0.4986 | 91.0 | 53690 | 1.0163 | 0.7480 |
0.5004 | 92.0 | 54280 | 1.0187 | 0.7312 |
0.4846 | 93.0 | 54870 | 0.9721 | 0.7440 |
0.4963 | 94.0 | 55460 | 1.0295 | 0.7468 |
0.4759 | 95.0 | 56050 | 1.0004 | 0.7468 |
0.4905 | 96.0 | 56640 | 1.0361 | 0.7474 |
0.4994 | 97.0 | 57230 | 0.9591 | 0.7446 |
0.4673 | 98.0 | 57820 | 0.9604 | 0.7431 |
0.4734 | 99.0 | 58410 | 0.9771 | 0.7462 |
0.4588 | 100.0 | 59000 | 0.9754 | 0.7459 |
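
The accuracy column above is presumably plain argmax accuracy over the validation set. A minimal sketch of an equivalent `compute_metrics` hook using the `evaluate` library follows; this is an assumed equivalent, not the card's original metric code.

```python
# Sketch of an accuracy metric hook for the Trainer; an assumed equivalent
# of the metric behind the table above, not the card's original code.
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)
```
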
### Framework versions
- Transformers 4.30.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.4
- Tokenizers 0.13.3