1_5e-3_1_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set after the final training epoch (epoch 100):

  • Loss: 0.9257
  • Accuracy: 0.7330

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
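
As a rough illustration of what these settings imply, the sketch below computes the learning rate under a linear schedule and the training-set size consistent with the logged step counts. It is not the training script. The total of 59000 steps and 590 steps per epoch are read from the training results table; the zero warmup, single-device training, and absence of gradient accumulation are assumptions, since the card lists none of them. The names `linear_lr` and `train_set_size_bounds` are hypothetical helpers.

```python
from typing import Tuple

LEARNING_RATE = 0.005     # learning_rate from the hyperparameter list
TRAIN_BATCH_SIZE = 16     # train_batch_size from the hyperparameter list
TOTAL_STEPS = 59000       # last step in the training results table
STEPS_PER_EPOCH = 590     # step count logged after epoch 1.0
WARMUP_STEPS = 0          # ASSUMPTION: the card does not list warmup steps


def linear_lr(step: int) -> float:
    """Learning rate after `step` updates: linear warmup, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


def train_set_size_bounds() -> Tuple[int, int]:
    """Range of training-set sizes that yield STEPS_PER_EPOCH batches.

    ASSUMPTION: one device, no gradient accumulation, last partial batch kept.
    """
    smallest = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1
    largest = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE
    return smallest, largest


if __name__ == "__main__":
    for epoch in (0, 25, 50, 75, 100):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:3d}  step {step:5d}  lr {linear_lr(step):.6f}")
    print("training examples (bounds):", train_set_size_bounds())
```

Under these assumptions the learning rate starts at 0.005, reaches half that value at step 29500, and decays to zero at step 59000.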

Training results

Training Loss  Epoch  Step  Validation Loss  Accuracy
1.0669 1.0 590 0.6986 0.6217
0.9963 2.0 1180 1.8702 0.3792
1.0427 3.0 1770 2.2910 0.3798
0.8982 4.0 2360 0.7642 0.4159
0.9871 5.0 2950 0.9999 0.6217
0.8853 6.0 3540 0.6842 0.5278
0.8006 7.0 4130 1.4763 0.3878
0.7667 8.0 4720 0.8226 0.6239
0.7472 9.0 5310 0.7288 0.6364
0.7638 10.0 5900 0.5834 0.6636
0.7755 11.0 6490 1.6914 0.4269
0.6952 12.0 7080 0.9552 0.6324
0.7343 13.0 7670 0.5715 0.6835
0.6358 14.0 8260 1.0425 0.6284
0.6214 15.0 8850 0.6728 0.6807
0.7714 16.0 9440 0.5675 0.6991
0.6478 17.0 10030 0.6009 0.6976
0.6253 18.0 10620 0.5959 0.6942
0.5884 19.0 11210 0.6113 0.6896
0.6143 20.0 11800 0.5812 0.7165
0.5621 21.0 12390 0.5986 0.7125
0.561 22.0 12980 0.9897 0.5994
0.5203 23.0 13570 0.8431 0.6606
0.5278 24.0 14160 1.2396 0.5673
0.5013 25.0 14750 0.6779 0.6850
0.5121 26.0 15340 0.8150 0.6459
0.4987 27.0 15930 0.6473 0.7208
0.4915 28.0 16520 0.6165 0.6997
0.4362 29.0 17110 0.7189 0.6587
0.4401 30.0 17700 0.6948 0.7211
0.4488 31.0 18290 0.9311 0.6924
0.4593 32.0 18880 0.6527 0.7297
0.4209 33.0 19470 1.0135 0.6437
0.3953 34.0 20060 0.8262 0.7162
0.3813 35.0 20650 0.8390 0.6911
0.3916 36.0 21240 0.7626 0.7
0.3736 37.0 21830 0.6349 0.7199
0.3558 38.0 22420 0.6932 0.7284
0.378 39.0 23010 0.9384 0.6706
0.3104 40.0 23600 0.8561 0.7269
0.3366 41.0 24190 0.7296 0.7110
0.3089 42.0 24780 0.7695 0.7183
0.3099 43.0 25370 0.9426 0.6933
0.3225 44.0 25960 0.8238 0.7330
0.2853 45.0 26550 0.7910 0.7346
0.3031 46.0 27140 1.0613 0.6713
0.2865 47.0 27730 0.8105 0.7263
0.2736 48.0 28320 0.9241 0.7119
0.2892 49.0 28910 0.8532 0.7281
0.2582 50.0 29500 0.8393 0.7214
0.2631 51.0 30090 1.1566 0.6722
0.2496 52.0 30680 0.9162 0.6911
0.2501 53.0 31270 0.8305 0.7251
0.2362 54.0 31860 1.1556 0.6599
0.2325 55.0 32450 1.0032 0.6685
0.2539 56.0 33040 0.9128 0.7336
0.2231 57.0 33630 0.8328 0.7073
0.2123 58.0 34220 0.9290 0.7171
0.2093 59.0 34810 0.8650 0.7229
0.2151 60.0 35400 0.9212 0.7245
0.2074 61.0 35990 0.8884 0.7257
0.2072 62.0 36580 0.8822 0.7251
0.1898 63.0 37170 0.9609 0.7287
0.1936 64.0 37760 0.9800 0.6979
0.197 65.0 38350 1.0263 0.7125
0.1856 66.0 38940 0.9902 0.7404
0.1751 67.0 39530 0.8972 0.7312
0.1791 68.0 40120 1.0031 0.7248
0.1693 69.0 40710 1.0957 0.7361
0.1783 70.0 41300 1.0342 0.7349
0.1801 71.0 41890 1.0411 0.7067
0.1768 72.0 42480 0.9629 0.7211
0.1595 73.0 43070 0.9862 0.7370
0.154 74.0 43660 0.9240 0.7333
0.1578 75.0 44250 1.1158 0.7336
0.165 76.0 44840 0.9100 0.7358
0.1582 77.0 45430 0.9886 0.7324
0.1573 78.0 46020 1.0058 0.7193
0.1544 79.0 46610 0.9316 0.7199
0.1488 80.0 47200 0.9493 0.7196
0.141 81.0 47790 0.9467 0.7352
0.1479 82.0 48380 0.8841 0.7232
0.1377 83.0 48970 0.9072 0.7309
0.1372 84.0 49560 0.9831 0.7266
0.1389 85.0 50150 0.9714 0.7272
0.136 86.0 50740 0.9617 0.7364
0.1383 87.0 51330 0.9970 0.7257
0.1324 88.0 51920 0.8863 0.7190
0.1262 89.0 52510 0.9828 0.7336
0.132 90.0 53100 0.9576 0.7333
0.129 91.0 53690 0.9326 0.7321
0.1241 92.0 54280 0.9571 0.7278
0.1217 93.0 54870 0.9131 0.7306
0.1253 94.0 55460 0.9053 0.7315
0.1192 95.0 56050 0.9126 0.7349
0.1225 96.0 56640 0.9336 0.7355
0.1229 97.0 57230 0.9702 0.7272
0.1165 98.0 57820 0.9494 0.7339
0.1198 99.0 58410 0.9183 0.7324
0.1172 100.0 59000 0.9257 0.7330
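
The final row is not the best one in the table: accuracy peaks at 0.7404 (epoch 66), and validation loss is lowest at 0.5675 (epoch 16), while training loss keeps falling afterwards. A minimal sketch of how such rows can be scanned for the best epoch is shown below. It parses only a five-row excerpt copied from the table, and `EvalRow` and `parse_rows` are hypothetical names introduced for illustration, not part of any library.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float


# Excerpt of the training results table (five of the 100 rows).
LOG_EXCERPT = """\
1.0669 1.0 590 0.6986 0.6217
0.7714 16.0 9440 0.5675 0.6991
0.2853 45.0 26550 0.7910 0.7346
0.1856 66.0 38940 0.9902 0.7404
0.1172 100.0 59000 0.9257 0.7330
"""


def parse_rows(text: str) -> List[EvalRow]:
    """Parse whitespace-separated rows; skip lines without exactly 5 fields."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue
        rows.append(
            EvalRow(
                train_loss=float(parts[0]),
                epoch=float(parts[1]),
                step=int(parts[2]),
                val_loss=float(parts[3]),
                accuracy=float(parts[4]),
            )
        )
    return rows


rows = parse_rows(LOG_EXCERPT)
best_by_accuracy = max(rows, key=lambda r: r.accuracy)
best_by_val_loss = min(rows, key=lambda r: r.val_loss)

print("highest accuracy:", best_by_accuracy)
print("lowest validation loss:", best_by_val_loss)
```

Which criterion to prefer depends on the use case; the two disagree here, so a checkpoint chosen by validation loss would differ from one chosen by accuracy.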

Framework versions

  • Transformers 4.30.0
  • PyTorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3

Dataset used to train Onutoa/1_5e-3_1_0.1