1_5e-3_5_0.1
This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. After the final training epoch (epoch 100), it achieves the following results on the evaluation set:
- Loss: 0.9136
- Accuracy: 0.7355
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.005
- train_batch_size: 16
- eval_batch_size: 8
- seed: 11
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100.0
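The `linear` scheduler setting can be illustrated with a short sketch. It assumes no warmup steps (the card does not list any) and uses the 59,000 total optimizer steps (590 per epoch over 100 epochs) shown in the training results. The function name `linear_lr` is hypothetical and not part of any library.

```python
def linear_lr(step: int, base_lr: float = 0.005, total_steps: int = 59_000,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step`: optional linear warmup, then linear decay to zero.

    Assumption: warmup_steps=0, since the card does not report a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the scheduled learning rate at the end of a few epochs (590 steps each).
STEPS_PER_EPOCH = 590
for epoch in (0, 25, 50, 75, 100):
    print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the rate starts at 0.005 and falls in a straight line to zero at the last step, so later epochs train with progressively smaller updates.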
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
1.373 | 1.0 | 590 | 1.2296 | 0.3786 |
1.3539 | 2.0 | 1180 | 1.5833 | 0.6217 |
1.2392 | 3.0 | 1770 | 1.2689 | 0.3933 |
1.1294 | 4.0 | 2360 | 0.9251 | 0.6272 |
1.1134 | 5.0 | 2950 | 1.1215 | 0.6257 |
1.0908 | 6.0 | 3540 | 0.9055 | 0.6235 |
1.0395 | 7.0 | 4130 | 1.6555 | 0.3881 |
1.0251 | 8.0 | 4720 | 1.1363 | 0.6226 |
0.9902 | 9.0 | 5310 | 1.0739 | 0.4887 |
0.9501 | 10.0 | 5900 | 0.8761 | 0.6208 |
0.9437 | 11.0 | 6490 | 0.9379 | 0.6385 |
0.8883 | 12.0 | 7080 | 0.9268 | 0.5755 |
0.9089 | 13.0 | 7670 | 0.8405 | 0.6343 |
0.8623 | 14.0 | 8260 | 0.8633 | 0.6578 |
0.8454 | 15.0 | 8850 | 0.8616 | 0.6315 |
0.8373 | 16.0 | 9440 | 0.7891 | 0.6709 |
0.8356 | 17.0 | 10030 | 0.7889 | 0.6722 |
0.8187 | 18.0 | 10620 | 0.9561 | 0.6049 |
0.7986 | 19.0 | 11210 | 0.7897 | 0.6786 |
0.7958 | 20.0 | 11800 | 0.8889 | 0.6593 |
0.7742 | 21.0 | 12390 | 0.8762 | 0.6330 |
0.7514 | 22.0 | 12980 | 0.7717 | 0.6933 |
0.7505 | 23.0 | 13570 | 0.7587 | 0.6982 |
0.7105 | 24.0 | 14160 | 0.9016 | 0.6749 |
0.7027 | 25.0 | 14750 | 0.8744 | 0.6483 |
0.7159 | 26.0 | 15340 | 0.9018 | 0.6266 |
0.6908 | 27.0 | 15930 | 0.7527 | 0.7015 |
0.6603 | 28.0 | 16520 | 0.7971 | 0.6997 |
0.6599 | 29.0 | 17110 | 0.7492 | 0.7021 |
0.6621 | 30.0 | 17700 | 0.7845 | 0.7031 |
0.6357 | 31.0 | 18290 | 0.7578 | 0.7119 |
0.6159 | 32.0 | 18880 | 0.7800 | 0.7067 |
0.6181 | 33.0 | 19470 | 0.9263 | 0.6566 |
0.5866 | 34.0 | 20060 | 0.8543 | 0.6771 |
0.5708 | 35.0 | 20650 | 0.7777 | 0.7110 |
0.5784 | 36.0 | 21240 | 0.7719 | 0.7125 |
0.5395 | 37.0 | 21830 | 0.7532 | 0.7116 |
0.5579 | 38.0 | 22420 | 0.7451 | 0.7098 |
0.5113 | 39.0 | 23010 | 0.7618 | 0.7242 |
0.5329 | 40.0 | 23600 | 0.9580 | 0.7135 |
0.4996 | 41.0 | 24190 | 1.0449 | 0.6899 |
0.4889 | 42.0 | 24780 | 0.8325 | 0.7193 |
0.4905 | 43.0 | 25370 | 0.9896 | 0.7089 |
0.4866 | 44.0 | 25960 | 0.8897 | 0.6991 |
0.4652 | 45.0 | 26550 | 0.8080 | 0.7349 |
0.441 | 46.0 | 27140 | 0.7911 | 0.7309 |
0.45 | 47.0 | 27730 | 0.8294 | 0.7263 |
0.4149 | 48.0 | 28320 | 0.8578 | 0.7162 |
0.441 | 49.0 | 28910 | 0.8451 | 0.7284 |
0.4105 | 50.0 | 29500 | 0.9310 | 0.7245 |
0.403 | 51.0 | 30090 | 0.8326 | 0.7190 |
0.3872 | 52.0 | 30680 | 0.8510 | 0.7220 |
0.3717 | 53.0 | 31270 | 0.8455 | 0.7321 |
0.3856 | 54.0 | 31860 | 0.8331 | 0.7260 |
0.3808 | 55.0 | 32450 | 0.8245 | 0.7266 |
0.3805 | 56.0 | 33040 | 0.8482 | 0.7303 |
0.3481 | 57.0 | 33630 | 0.9800 | 0.6982 |
0.3549 | 58.0 | 34220 | 0.8415 | 0.7235 |
0.3497 | 59.0 | 34810 | 0.8914 | 0.7223 |
0.3447 | 60.0 | 35400 | 0.8756 | 0.7239 |
0.3398 | 61.0 | 35990 | 0.9337 | 0.7327 |
0.3266 | 62.0 | 36580 | 0.9014 | 0.7333 |
0.3186 | 63.0 | 37170 | 0.9030 | 0.7217 |
0.318 | 64.0 | 37760 | 0.8929 | 0.7220 |
0.2978 | 65.0 | 38350 | 0.9019 | 0.7324 |
0.3063 | 66.0 | 38940 | 0.8663 | 0.7232 |
0.2943 | 67.0 | 39530 | 0.9055 | 0.7199 |
0.3013 | 68.0 | 40120 | 0.8958 | 0.7269 |
0.2862 | 69.0 | 40710 | 0.9173 | 0.7287 |
0.3004 | 70.0 | 41300 | 0.8699 | 0.7254 |
0.2917 | 71.0 | 41890 | 0.8956 | 0.7284 |
0.2807 | 72.0 | 42480 | 0.9030 | 0.7321 |
0.2687 | 73.0 | 43070 | 0.9436 | 0.7199 |
0.2771 | 74.0 | 43660 | 0.9673 | 0.7165 |
0.2703 | 75.0 | 44250 | 1.0024 | 0.7373 |
0.2743 | 76.0 | 44840 | 0.8980 | 0.7349 |
0.2587 | 77.0 | 45430 | 0.8994 | 0.7312 |
0.2631 | 78.0 | 46020 | 0.9195 | 0.7339 |
0.255 | 79.0 | 46610 | 0.8869 | 0.7361 |
0.2546 | 80.0 | 47200 | 0.9206 | 0.7266 |
0.2412 | 81.0 | 47790 | 0.9025 | 0.7373 |
0.2516 | 82.0 | 48380 | 0.9041 | 0.7358 |
0.2472 | 83.0 | 48970 | 0.9345 | 0.7358 |
0.2455 | 84.0 | 49560 | 0.9110 | 0.7376 |
0.2447 | 85.0 | 50150 | 0.9245 | 0.7306 |
0.238 | 86.0 | 50740 | 0.9204 | 0.7391 |
0.238 | 87.0 | 51330 | 0.9557 | 0.7364 |
0.2337 | 88.0 | 51920 | 0.9187 | 0.7349 |
0.2313 | 89.0 | 52510 | 0.9249 | 0.7361 |
0.2249 | 90.0 | 53100 | 0.9316 | 0.7422 |
0.2279 | 91.0 | 53690 | 0.9483 | 0.7370 |
0.221 | 92.0 | 54280 | 0.9150 | 0.7388 |
0.2291 | 93.0 | 54870 | 0.9243 | 0.7376 |
0.2234 | 94.0 | 55460 | 0.9347 | 0.7398 |
0.2239 | 95.0 | 56050 | 0.9169 | 0.7358 |
0.2191 | 96.0 | 56640 | 0.9255 | 0.7367 |
0.2213 | 97.0 | 57230 | 0.9130 | 0.7321 |
0.218 | 98.0 | 57820 | 0.9197 | 0.7388 |
0.2148 | 99.0 | 58410 | 0.9183 | 0.7391 |
0.2197 | 100.0 | 59000 | 0.9136 | 0.7355 |
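To pick a checkpoint from a log like this, one option is to scan for the row with the highest accuracy or the lowest validation loss. The sketch below uses a three-row excerpt copied from the table. The `rows` structure and the `best_by` helper are hypothetical names used only for illustration.

```python
# (epoch, validation_loss, accuracy) rows copied from the results table.
rows = [
    (38, 0.7451, 0.7098),
    (90, 0.9316, 0.7422),
    (100, 0.9136, 0.7355),
]


def best_by(rows, column, highest=True):
    """Return the row with the highest (or lowest) value in the given column."""
    pick = max if highest else min
    return pick(rows, key=lambda row: row[column])


best_acc = best_by(rows, column=2, highest=True)
best_loss = best_by(rows, column=1, highest=False)
print("best accuracy at epoch", best_acc[0])
print("lowest validation loss at epoch", best_loss[0])
```

Across the full table, the highest accuracy (0.7422) occurs at epoch 90 and the lowest validation loss (0.7451) at epoch 38, while the final epoch gives 0.7355 and 0.9136.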
Framework versions
- Transformers 4.30.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.4
- Tokenizers 0.13.3