
1_5e-3_5_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9136
  • Accuracy: 0.7355
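
The checkpoint can be loaded like any sequence-classification model. Below is a minimal inference sketch; the question/passage text-pair input (typical of SuperGLUE tasks such as BoolQ) is an assumption, since this card does not say which subtask the model was trained on.

```python
# Minimal inference sketch for Onutoa/1_5e-3_5_0.1. The question/passage
# pair below is illustrative; the card does not identify the SuperGLUE
# subtask, so the input format is an assumption.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "Onutoa/1_5e-3_5_0.1"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

inputs = tokenizer(
    "Is the sky blue?",
    "The sky appears blue because of Rayleigh scattering of sunlight.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label id
```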

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
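
The card does not identify which SuperGLUE subtask was used. The step counts in the results table below (590 optimizer steps per epoch at batch size 16, i.e. roughly 9,400 training examples) would fit a subtask of about that size, but this is not confirmed. A hedged loading sketch, with `boolq` as a placeholder subtask name:

```python
# Hedged data-loading sketch. "boolq" is a placeholder: the card does not
# say which SuperGLUE subtask this model was fine-tuned on.
from datasets import load_dataset

dataset = load_dataset("super_glue", "boolq")
print(dataset)               # DatasetDict with train/validation/test splits
print(dataset["train"][0])   # inspect a single example
```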

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
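
These settings map directly onto `transformers.TrainingArguments`; the sketch below reproduces them under that assumption (the output directory is illustrative, and the model/dataset wiring is omitted because the card does not include the training script):

```python
# Sketch of TrainingArguments matching the hyperparameters listed above.
# The output_dir is illustrative; evaluation_strategy="epoch" is inferred
# from the per-epoch validation rows in the results table.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_5e-3_5_0.1",
    learning_rate=5e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",
)
```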

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 1.373 | 1.0 | 590 | 1.2296 | 0.3786 |
| 1.3539 | 2.0 | 1180 | 1.5833 | 0.6217 |
| 1.2392 | 3.0 | 1770 | 1.2689 | 0.3933 |
| 1.1294 | 4.0 | 2360 | 0.9251 | 0.6272 |
| 1.1134 | 5.0 | 2950 | 1.1215 | 0.6257 |
| 1.0908 | 6.0 | 3540 | 0.9055 | 0.6235 |
| 1.0395 | 7.0 | 4130 | 1.6555 | 0.3881 |
| 1.0251 | 8.0 | 4720 | 1.1363 | 0.6226 |
| 0.9902 | 9.0 | 5310 | 1.0739 | 0.4887 |
| 0.9501 | 10.0 | 5900 | 0.8761 | 0.6208 |
| 0.9437 | 11.0 | 6490 | 0.9379 | 0.6385 |
| 0.8883 | 12.0 | 7080 | 0.9268 | 0.5755 |
| 0.9089 | 13.0 | 7670 | 0.8405 | 0.6343 |
| 0.8623 | 14.0 | 8260 | 0.8633 | 0.6578 |
| 0.8454 | 15.0 | 8850 | 0.8616 | 0.6315 |
| 0.8373 | 16.0 | 9440 | 0.7891 | 0.6709 |
| 0.8356 | 17.0 | 10030 | 0.7889 | 0.6722 |
| 0.8187 | 18.0 | 10620 | 0.9561 | 0.6049 |
| 0.7986 | 19.0 | 11210 | 0.7897 | 0.6786 |
| 0.7958 | 20.0 | 11800 | 0.8889 | 0.6593 |
| 0.7742 | 21.0 | 12390 | 0.8762 | 0.6330 |
| 0.7514 | 22.0 | 12980 | 0.7717 | 0.6933 |
| 0.7505 | 23.0 | 13570 | 0.7587 | 0.6982 |
| 0.7105 | 24.0 | 14160 | 0.9016 | 0.6749 |
| 0.7027 | 25.0 | 14750 | 0.8744 | 0.6483 |
| 0.7159 | 26.0 | 15340 | 0.9018 | 0.6266 |
| 0.6908 | 27.0 | 15930 | 0.7527 | 0.7015 |
| 0.6603 | 28.0 | 16520 | 0.7971 | 0.6997 |
| 0.6599 | 29.0 | 17110 | 0.7492 | 0.7021 |
| 0.6621 | 30.0 | 17700 | 0.7845 | 0.7031 |
| 0.6357 | 31.0 | 18290 | 0.7578 | 0.7119 |
| 0.6159 | 32.0 | 18880 | 0.7800 | 0.7067 |
| 0.6181 | 33.0 | 19470 | 0.9263 | 0.6566 |
| 0.5866 | 34.0 | 20060 | 0.8543 | 0.6771 |
| 0.5708 | 35.0 | 20650 | 0.7777 | 0.7110 |
| 0.5784 | 36.0 | 21240 | 0.7719 | 0.7125 |
| 0.5395 | 37.0 | 21830 | 0.7532 | 0.7116 |
| 0.5579 | 38.0 | 22420 | 0.7451 | 0.7098 |
| 0.5113 | 39.0 | 23010 | 0.7618 | 0.7242 |
| 0.5329 | 40.0 | 23600 | 0.9580 | 0.7135 |
| 0.4996 | 41.0 | 24190 | 1.0449 | 0.6899 |
| 0.4889 | 42.0 | 24780 | 0.8325 | 0.7193 |
| 0.4905 | 43.0 | 25370 | 0.9896 | 0.7089 |
| 0.4866 | 44.0 | 25960 | 0.8897 | 0.6991 |
| 0.4652 | 45.0 | 26550 | 0.8080 | 0.7349 |
| 0.441 | 46.0 | 27140 | 0.7911 | 0.7309 |
| 0.45 | 47.0 | 27730 | 0.8294 | 0.7263 |
| 0.4149 | 48.0 | 28320 | 0.8578 | 0.7162 |
| 0.441 | 49.0 | 28910 | 0.8451 | 0.7284 |
| 0.4105 | 50.0 | 29500 | 0.9310 | 0.7245 |
| 0.403 | 51.0 | 30090 | 0.8326 | 0.7190 |
| 0.3872 | 52.0 | 30680 | 0.8510 | 0.7220 |
| 0.3717 | 53.0 | 31270 | 0.8455 | 0.7321 |
| 0.3856 | 54.0 | 31860 | 0.8331 | 0.7260 |
| 0.3808 | 55.0 | 32450 | 0.8245 | 0.7266 |
| 0.3805 | 56.0 | 33040 | 0.8482 | 0.7303 |
| 0.3481 | 57.0 | 33630 | 0.9800 | 0.6982 |
| 0.3549 | 58.0 | 34220 | 0.8415 | 0.7235 |
| 0.3497 | 59.0 | 34810 | 0.8914 | 0.7223 |
| 0.3447 | 60.0 | 35400 | 0.8756 | 0.7239 |
| 0.3398 | 61.0 | 35990 | 0.9337 | 0.7327 |
| 0.3266 | 62.0 | 36580 | 0.9014 | 0.7333 |
| 0.3186 | 63.0 | 37170 | 0.9030 | 0.7217 |
| 0.318 | 64.0 | 37760 | 0.8929 | 0.7220 |
| 0.2978 | 65.0 | 38350 | 0.9019 | 0.7324 |
| 0.3063 | 66.0 | 38940 | 0.8663 | 0.7232 |
| 0.2943 | 67.0 | 39530 | 0.9055 | 0.7199 |
| 0.3013 | 68.0 | 40120 | 0.8958 | 0.7269 |
| 0.2862 | 69.0 | 40710 | 0.9173 | 0.7287 |
| 0.3004 | 70.0 | 41300 | 0.8699 | 0.7254 |
| 0.2917 | 71.0 | 41890 | 0.8956 | 0.7284 |
| 0.2807 | 72.0 | 42480 | 0.9030 | 0.7321 |
| 0.2687 | 73.0 | 43070 | 0.9436 | 0.7199 |
| 0.2771 | 74.0 | 43660 | 0.9673 | 0.7165 |
| 0.2703 | 75.0 | 44250 | 1.0024 | 0.7373 |
| 0.2743 | 76.0 | 44840 | 0.8980 | 0.7349 |
| 0.2587 | 77.0 | 45430 | 0.8994 | 0.7312 |
| 0.2631 | 78.0 | 46020 | 0.9195 | 0.7339 |
| 0.255 | 79.0 | 46610 | 0.8869 | 0.7361 |
| 0.2546 | 80.0 | 47200 | 0.9206 | 0.7266 |
| 0.2412 | 81.0 | 47790 | 0.9025 | 0.7373 |
| 0.2516 | 82.0 | 48380 | 0.9041 | 0.7358 |
| 0.2472 | 83.0 | 48970 | 0.9345 | 0.7358 |
| 0.2455 | 84.0 | 49560 | 0.9110 | 0.7376 |
| 0.2447 | 85.0 | 50150 | 0.9245 | 0.7306 |
| 0.238 | 86.0 | 50740 | 0.9204 | 0.7391 |
| 0.238 | 87.0 | 51330 | 0.9557 | 0.7364 |
| 0.2337 | 88.0 | 51920 | 0.9187 | 0.7349 |
| 0.2313 | 89.0 | 52510 | 0.9249 | 0.7361 |
| 0.2249 | 90.0 | 53100 | 0.9316 | 0.7422 |
| 0.2279 | 91.0 | 53690 | 0.9483 | 0.7370 |
| 0.221 | 92.0 | 54280 | 0.9150 | 0.7388 |
| 0.2291 | 93.0 | 54870 | 0.9243 | 0.7376 |
| 0.2234 | 94.0 | 55460 | 0.9347 | 0.7398 |
| 0.2239 | 95.0 | 56050 | 0.9169 | 0.7358 |
| 0.2191 | 96.0 | 56640 | 0.9255 | 0.7367 |
| 0.2213 | 97.0 | 57230 | 0.9130 | 0.7321 |
| 0.218 | 98.0 | 57820 | 0.9197 | 0.7388 |
| 0.2148 | 99.0 | 58410 | 0.9183 | 0.7391 |
| 0.2197 | 100.0 | 59000 | 0.9136 | 0.7355 |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
