
1_6e-3_1_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9552
  • Accuracy: 0.7294

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.006
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
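
As a rough illustration of what the linear scheduler implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup (the card lists none) and takes the total step count from the results table (590 steps per epoch over 100 epochs). The function name `linear_lr` and its `warmup_steps` parameter are illustrative and not part of the original training script.

```python
BASE_LR = 0.006          # learning_rate listed on the card
STEPS_PER_EPOCH = 590    # Step column at epoch 1.0
NUM_EPOCHS = 100         # num_epochs listed on the card
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 59000, the final Step value


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if warmup_steps and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the scheduled learning rate at a few epoch boundaries.
for epoch in (0, 25, 50, 75, 100):
    print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the rate starts at 0.006, is halved by epoch 50, and reaches zero at step 59000. With a train batch size of 16, 590 steps per epoch corresponds to at most 9440 examples per epoch, assuming a single device and no gradient accumulation (neither is stated on the card).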

Training results

Training Loss  Epoch  Step   Validation Loss  Accuracy
1.1794         1.0    590    0.8903           0.6217
1.096          2.0    1180   0.6682           0.5771
0.8877         3.0    1770   1.0585           0.3792
0.8825         4.0    2360   0.6340           0.6229
0.891          5.0    2950   0.8424           0.6217
0.7749         6.0    3540   0.6586           0.5752
0.8351         7.0    4130   0.6083           0.6373
0.7693         8.0    4720   0.6969           0.5813
0.869          9.0    5310   0.5918           0.6777
0.7739         10.0   5900   0.6373           0.6416
0.741          11.0   6490   0.7306           0.6306
0.6366         12.0   7080   0.6535           0.6951
0.6503         13.0   7670   0.5655           0.7021
0.7297         14.0   8260   0.8470           0.5847
0.5637         15.0   8850   0.6914           0.6278
0.6233         16.0   9440   0.7041           0.6862
0.5812         17.0   10030  0.6282           0.7049
0.5423         18.0   10620  1.1433           0.5612
0.5366         19.0   11210  0.6643           0.7168
0.5369         20.0   11800  0.9787           0.6832
0.4828         21.0   12390  0.8036           0.7049
0.5085         22.0   12980  0.8132           0.6425
0.4488         23.0   13570  0.7755           0.6651
0.4184         24.0   14160  0.6817           0.7104
0.448          25.0   14750  0.6490           0.7193
0.4123         26.0   15340  0.7854           0.6728
0.4196         27.0   15930  0.7012           0.7138
0.4119         28.0   16520  0.7525           0.7116
0.3811         29.0   17110  0.7333           0.7012
0.3698         30.0   17700  1.1169           0.6480
0.3382         31.0   18290  0.6635           0.7232
0.338          32.0   18880  0.7444           0.7266
0.3359         33.0   19470  1.0398           0.6621
0.3071         34.0   20060  0.8387           0.7291
0.3001         35.0   20650  0.7648           0.7281
0.3221         36.0   21240  0.7485           0.7266
0.2973         37.0   21830  0.7841           0.7260
0.2801         38.0   22420  0.8797           0.7242
0.2666         39.0   23010  0.9504           0.7028
0.2575         40.0   23600  0.8444           0.7217
0.2796         41.0   24190  1.1635           0.7067
0.2596         42.0   24780  0.8979           0.7217
0.2465         43.0   25370  0.8439           0.7177
0.2475         44.0   25960  0.9628           0.7028
0.2394         45.0   26550  0.9549           0.7156
0.2192         46.0   27140  0.8422           0.7251
0.2253         47.0   27730  0.9386           0.7245
0.2063         48.0   28320  0.9686           0.7028
0.2258         49.0   28910  0.8843           0.7165
0.2114         50.0   29500  0.9566           0.7324
0.2039         51.0   30090  1.0167           0.7073
0.182          52.0   30680  0.9182           0.7303
0.1825         53.0   31270  0.9879           0.7147
0.1827         54.0   31860  0.9542           0.7199
0.1727         55.0   32450  0.9540           0.7245
0.1857         56.0   33040  0.9222           0.7294
0.182          57.0   33630  1.1263           0.7021
0.1716         58.0   34220  0.9947           0.7239
0.1659         59.0   34810  0.9969           0.7220
0.1596         60.0   35400  0.9764           0.7193
0.1656         61.0   35990  1.0089           0.7281
0.1545         62.0   36580  0.9712           0.7193
0.1429         63.0   37170  0.9785           0.7245
0.1567         64.0   37760  1.0706           0.7076
0.1493         65.0   38350  0.9546           0.7287
0.1453         66.0   38940  0.9959           0.7245
0.1384         67.0   39530  0.9687           0.7300
0.1409         68.0   40120  0.9739           0.7202
0.1388         69.0   40710  1.1173           0.7232
0.1366         70.0   41300  0.9598           0.7254
0.1429         71.0   41890  1.0048           0.7070
0.1384         72.0   42480  0.9816           0.7205
0.1221         73.0   43070  1.0827           0.7232
0.131          74.0   43660  1.0217           0.7294
0.1282         75.0   44250  0.9694           0.7287
0.1308         76.0   44840  1.0198           0.7208
0.1252         77.0   45430  1.0261           0.7278
0.1252         78.0   46020  0.9709           0.7272
0.117          79.0   46610  1.0140           0.7257
0.1171         80.0   47200  1.0226           0.7321
0.1132         81.0   47790  1.0880           0.7199
0.116          82.0   48380  0.9087           0.7254
0.1156         83.0   48970  0.9973           0.7257
0.103          84.0   49560  1.0078           0.7287
0.1096         85.0   50150  1.0122           0.7263
0.1097         86.0   50740  1.0316           0.7312
0.098          87.0   51330  1.0030           0.7275
0.1035         88.0   51920  0.9551           0.7214
0.0978         89.0   52510  1.0217           0.7287
0.1001         90.0   53100  0.9817           0.7291
0.1011         91.0   53690  0.9693           0.7281
0.0957         92.0   54280  1.0017           0.7199
0.0946         93.0   54870  0.9992           0.7278
0.0976         94.0   55460  0.9660           0.7291
0.0961         95.0   56050  0.9572           0.7278
0.0944         96.0   56640  0.9801           0.7269
0.0944         97.0   57230  0.9527           0.7272
0.0936         98.0   57820  0.9543           0.7266
0.0939         99.0   58410  0.9540           0.7281
0.0915         100.0  59000  0.9552           0.7294
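
The reported evaluation results (loss 0.9552, accuracy 0.7294) match the final row, epoch 100, which is not the best row on either metric. The sketch below shows one way to pick a checkpoint from such a log. It uses a small hand-copied subset of the rows in the table above; the `EvalRow` type and both helper functions are illustrative names, not part of the original training code.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    epoch: int
    train_loss: float
    val_loss: float
    accuracy: float


# Subset of the results table, copied by hand for illustration.
ROWS = [
    EvalRow(1, 1.1794, 0.8903, 0.6217),
    EvalRow(13, 0.6503, 0.5655, 0.7021),
    EvalRow(50, 0.2114, 0.9566, 0.7324),
    EvalRow(80, 0.1171, 1.0226, 0.7321),
    EvalRow(100, 0.0915, 0.9552, 0.7294),
]


def best_by_accuracy(rows):
    """Return the row with the highest evaluation accuracy."""
    return max(rows, key=lambda r: r.accuracy)


def best_by_val_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


print("best accuracy:", best_by_accuracy(ROWS))
print("lowest validation loss:", best_by_val_loss(ROWS))
```

On this subset the helpers select epoch 50 (accuracy 0.7324) and epoch 13 (validation loss 0.5655), which are also the extremes of the full table. Training loss keeps falling to 0.0915 while validation loss climbs from about 0.57 back to about 0.96, a pattern consistent with overfitting. Selecting on accuracy and selecting on validation loss would therefore pick very different checkpoints.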

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
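
To compare a local environment against these versions, a minimal sketch using only the standard library is shown below. The PyPI distribution names in `PINNED` (in particular `torch` for Pytorch) and the `check_versions` helper are assumptions made for illustration. Local build tags such as `+cu117` are ignored in the comparison.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed on the card, keyed by assumed PyPI distribution name.
PINNED = {
    "transformers": "4.30.0",
    "torch": "2.0.1",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def check_versions(pinned=PINNED):
    """Map each package to (installed_version_or_None, matches_pinned)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        # Strip a local build tag such as "+cu117" before comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, matches)
    return report


for package, (installed, ok) in check_versions().items():
    print(f"{package}: installed={installed} matches_card={ok}")
```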

Dataset used to train Onutoa/1_6e-3_1_0.1: super_glue