1_5e-3_1_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. At the end of training (epoch 100, step 59000), it achieves the following results on the evaluation set:

  • Loss: 0.5068
  • Accuracy: 0.7388

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
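
The effect of the linear scheduler on the learning rate can be sketched in a few lines. This is a minimal illustration rather than the actual training code: it assumes zero warmup steps (none are listed above) and takes the step counts from the Training results table (590 steps per epoch for 100 epochs). The helper name `linear_lr` is hypothetical.

```python
BASE_LR = 0.005          # learning_rate from the hyperparameter list
STEPS_PER_EPOCH = 590    # step count logged after epoch 1
NUM_EPOCHS = 100
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 59000, the last logged step


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate after `step` optimizer steps.

    Linear warmup from 0 to base_lr (skipped when warmup_steps is 0, which is
    an assumption here), then linear decay from base_lr down to 0.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the scheduled learning rate at a few points during training.
    for epoch in (0, 25, 50, 75, 100):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:3d}  step {step:5d}  lr {linear_lr(step):.6f}")
```

Under these assumptions the learning rate starts at 0.005, is halved by epoch 50, and reaches zero at step 59000.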

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---------------|-------|------|-----------------|----------|
| 0.8736 | 1.0 | 590 | 1.0074 | 0.6217 |
| 0.8968 | 2.0 | 1180 | 1.0334 | 0.6217 |
| 0.8293 | 3.0 | 1770 | 0.6363 | 0.4920 |
| 0.7568 | 4.0 | 2360 | 0.6064 | 0.6232 |
| 0.66 | 5.0 | 2950 | 0.6124 | 0.6223 |
| 0.6953 | 6.0 | 3540 | 0.5216 | 0.6550 |
| 0.6411 | 7.0 | 4130 | 0.5622 | 0.6012 |
| 0.5966 | 8.0 | 4720 | 0.4958 | 0.6584 |
| 0.5765 | 9.0 | 5310 | 0.8209 | 0.6300 |
| 0.6133 | 10.0 | 5900 | 0.4712 | 0.6826 |
| 0.605 | 11.0 | 6490 | 0.4679 | 0.7034 |
| 0.5325 | 12.0 | 7080 | 0.7704 | 0.6443 |
| 0.5728 | 13.0 | 7670 | 0.5719 | 0.6024 |
| 0.5194 | 14.0 | 8260 | 0.8197 | 0.6535 |
| 0.501 | 15.0 | 8850 | 0.4650 | 0.6758 |
| 0.5197 | 16.0 | 9440 | 0.4482 | 0.6908 |
| 0.4824 | 17.0 | 10030 | 0.5545 | 0.6208 |
| 0.4937 | 18.0 | 10620 | 0.8156 | 0.5514 |
| 0.4855 | 19.0 | 11210 | 0.4380 | 0.7061 |
| 0.4705 | 20.0 | 11800 | 0.4712 | 0.7055 |
| 0.4481 | 21.0 | 12390 | 0.4595 | 0.7098 |
| 0.4624 | 22.0 | 12980 | 0.5374 | 0.6532 |
| 0.4222 | 23.0 | 13570 | 0.4828 | 0.6731 |
| 0.4293 | 24.0 | 14160 | 0.4509 | 0.7147 |
| 0.4082 | 25.0 | 14750 | 0.4616 | 0.7018 |
| 0.392 | 26.0 | 15340 | 0.4615 | 0.7061 |
| 0.4079 | 27.0 | 15930 | 0.4404 | 0.7278 |
| 0.3798 | 28.0 | 16520 | 0.5590 | 0.6691 |
| 0.4075 | 29.0 | 17110 | 0.5303 | 0.7122 |
| 0.3755 | 30.0 | 17700 | 0.4535 | 0.7312 |
| 0.3686 | 31.0 | 18290 | 0.5050 | 0.6771 |
| 0.3553 | 32.0 | 18880 | 0.4831 | 0.7269 |
| 0.3576 | 33.0 | 19470 | 0.4556 | 0.7177 |
| 0.343 | 34.0 | 20060 | 0.4762 | 0.7269 |
| 0.3275 | 35.0 | 20650 | 0.4346 | 0.7275 |
| 0.327 | 36.0 | 21240 | 0.4859 | 0.7269 |
| 0.3328 | 37.0 | 21830 | 0.4580 | 0.7080 |
| 0.3228 | 38.0 | 22420 | 0.4488 | 0.7266 |
| 0.3103 | 39.0 | 23010 | 0.4543 | 0.7379 |
| 0.2946 | 40.0 | 23600 | 0.4612 | 0.7379 |
| 0.3044 | 41.0 | 24190 | 0.5015 | 0.7352 |
| 0.3008 | 42.0 | 24780 | 0.4525 | 0.7281 |
| 0.2823 | 43.0 | 25370 | 0.5095 | 0.7278 |
| 0.2779 | 44.0 | 25960 | 0.4926 | 0.7095 |
| 0.2763 | 45.0 | 26550 | 0.4621 | 0.7343 |
| 0.2726 | 46.0 | 27140 | 0.4941 | 0.7343 |
| 0.2714 | 47.0 | 27730 | 0.4843 | 0.7187 |
| 0.2637 | 48.0 | 28320 | 0.5355 | 0.7336 |
| 0.2699 | 49.0 | 28910 | 0.4733 | 0.7355 |
| 0.2579 | 50.0 | 29500 | 0.4887 | 0.7187 |
| 0.2416 | 51.0 | 30090 | 0.4815 | 0.7211 |
| 0.248 | 52.0 | 30680 | 0.4938 | 0.7287 |
| 0.2424 | 53.0 | 31270 | 0.5618 | 0.6960 |
| 0.2333 | 54.0 | 31860 | 0.4903 | 0.7333 |
| 0.2392 | 55.0 | 32450 | 0.5097 | 0.7343 |
| 0.2481 | 56.0 | 33040 | 0.5276 | 0.7352 |
| 0.2291 | 57.0 | 33630 | 0.4934 | 0.7327 |
| 0.2181 | 58.0 | 34220 | 0.5084 | 0.7294 |
| 0.227 | 59.0 | 34810 | 0.5020 | 0.7266 |
| 0.2242 | 60.0 | 35400 | 0.5140 | 0.7315 |
| 0.2243 | 61.0 | 35990 | 0.5246 | 0.7297 |
| 0.2218 | 62.0 | 36580 | 0.4869 | 0.7275 |
| 0.2078 | 63.0 | 37170 | 0.4971 | 0.7187 |
| 0.2194 | 64.0 | 37760 | 0.5192 | 0.7251 |
| 0.2078 | 65.0 | 38350 | 0.5858 | 0.7410 |
| 0.2079 | 66.0 | 38940 | 0.5299 | 0.7361 |
| 0.2019 | 67.0 | 39530 | 0.4952 | 0.7306 |
| 0.2076 | 68.0 | 40120 | 0.5006 | 0.7324 |
| 0.2013 | 69.0 | 40710 | 0.5055 | 0.7343 |
| 0.2047 | 70.0 | 41300 | 0.5223 | 0.7336 |
| 0.2049 | 71.0 | 41890 | 0.5265 | 0.7162 |
| 0.1916 | 72.0 | 42480 | 0.5238 | 0.7407 |
| 0.1896 | 73.0 | 43070 | 0.4899 | 0.7361 |
| 0.19 | 74.0 | 43660 | 0.5060 | 0.7315 |
| 0.1918 | 75.0 | 44250 | 0.5260 | 0.7346 |
| 0.1877 | 76.0 | 44840 | 0.5053 | 0.7336 |
| 0.1952 | 77.0 | 45430 | 0.5019 | 0.7382 |
| 0.1851 | 78.0 | 46020 | 0.4942 | 0.7336 |
| 0.1862 | 79.0 | 46610 | 0.5213 | 0.7398 |
| 0.1833 | 80.0 | 47200 | 0.5167 | 0.7343 |
| 0.181 | 81.0 | 47790 | 0.5394 | 0.7358 |
| 0.186 | 82.0 | 48380 | 0.5684 | 0.7336 |
| 0.1825 | 83.0 | 48970 | 0.5106 | 0.7373 |
| 0.1713 | 84.0 | 49560 | 0.5482 | 0.7410 |
| 0.174 | 85.0 | 50150 | 0.5182 | 0.7385 |
| 0.1712 | 86.0 | 50740 | 0.5350 | 0.7376 |
| 0.1687 | 87.0 | 51330 | 0.5074 | 0.7391 |
| 0.172 | 88.0 | 51920 | 0.5126 | 0.7382 |
| 0.1702 | 89.0 | 52510 | 0.4916 | 0.7275 |
| 0.1695 | 90.0 | 53100 | 0.5229 | 0.7370 |
| 0.1705 | 91.0 | 53690 | 0.4987 | 0.7401 |
| 0.1703 | 92.0 | 54280 | 0.4968 | 0.7254 |
| 0.1696 | 93.0 | 54870 | 0.5109 | 0.7382 |
| 0.1651 | 94.0 | 55460 | 0.5180 | 0.7413 |
| 0.1623 | 95.0 | 56050 | 0.5017 | 0.7385 |
| 0.1659 | 96.0 | 56640 | 0.5077 | 0.7407 |
| 0.1592 | 97.0 | 57230 | 0.5173 | 0.7394 |
| 0.1608 | 98.0 | 57820 | 0.5034 | 0.7413 |
| 0.1599 | 99.0 | 58410 | 0.5079 | 0.7407 |
| 0.1638 | 100.0 | 59000 | 0.5068 | 0.7388 |
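
Since the metrics reported at the top of this card match the final row, it can be useful to check which epochs scored best on the evaluation set. The sketch below parses rows in the table's column order (training loss, epoch, step, validation loss, accuracy) and picks the row with the lowest validation loss and the row with the highest accuracy. `Row`, `parse_rows` and `EXCERPT` are illustrative names, and the excerpt holds only five rows copied from the table; paste in all 100 rows to run the comparison on the full log.

```python
from typing import NamedTuple


class Row(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float


# Five rows copied from the table (epochs 1, 35, 94, 98 and 100).
EXCERPT = """
0.8736 1.0 590 1.0074 0.6217
0.3275 35.0 20650 0.4346 0.7275
0.1651 94.0 55460 0.5180 0.7413
0.1608 98.0 57820 0.5034 0.7413
0.1638 100.0 59000 0.5068 0.7388
"""


def parse_rows(text):
    """Parse five-column numeric rows; header and separator lines are skipped."""
    rows = []
    for line in text.splitlines():
        # Treat pipe characters as whitespace so pipe-delimited rows also parse.
        fields = line.replace("|", " ").split()
        if len(fields) != 5:
            continue
        try:
            values = [float(field) for field in fields]
        except ValueError:
            continue  # not a data row
        rows.append(Row(values[0], values[1], int(values[2]), values[3], values[4]))
    return rows


rows = parse_rows(EXCERPT)
lowest_val_loss = min(rows, key=lambda r: r.val_loss)
highest_accuracy = max(rows, key=lambda r: r.accuracy)  # the earliest row wins ties

if __name__ == "__main__":
    print("lowest validation loss:", lowest_val_loss)
    print("highest accuracy:", highest_accuracy)
```

Across the full table, validation loss is lowest at epoch 35 (0.4346) and accuracy peaks at 0.7413 in epochs 94 and 98, compared with 0.5068 and 0.7388 at epoch 100.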

Framework versions

  • Transformers 4.30.0
  • PyTorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
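
To compare a local environment against the versions listed above, a small standard-library check along these lines may help. It is a sketch: `compare_versions` and `CARD_VERSIONS` are hypothetical names, and it assumes the packages are installed under their usual PyPI distribution names (PyTorch is distributed as `torch`).

```python
from importlib import metadata

# Versions listed in this card, keyed by PyPI distribution name (assumed).
CARD_VERSIONS = {
    "transformers": "4.30.0",
    "torch": "2.0.1+cu117",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def compare_versions(expected, lookup=metadata.version):
    """Return {package: (expected, installed)} for packages that differ or are missing.

    `installed` is None when the package is not installed.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in compare_versions(CARD_VERSIONS).items():
        print(f"{name}: card lists {wanted}, found {installed or 'not installed'}")
```

An exact string match is deliberately strict; a different CUDA build suffix (for example `2.0.1+cu118`) is reported as a mismatch even though it may behave the same.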