
1_8e-3_10_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0109
  • Accuracy: 0.7272
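
Below is a minimal inference sketch. It assumes the checkpoint is published on the Hugging Face Hub under the repo id `Onutoa/1_8e-3_10_0.1` referenced on this page and that it carries a sequence-classification head; the input sentence pair is purely illustrative.

```python
# Minimal inference sketch (assumptions: the checkpoint lives on the Hub as
# "Onutoa/1_8e-3_10_0.1" and exposes a sequence-classification head; the
# input pair below is illustrative, not taken from this card).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/1_8e-3_10_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer(
    "Is the sky blue?",                   # illustrative first segment
    "The sky appears blue in daylight.",  # illustrative second segment
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```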

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):

  • learning_rate: 0.008
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
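
As a rough guide, these values map onto `TrainingArguments` as in the sketch below. Only the numeric values come from this card; the output directory is a placeholder, and interpreting the batch sizes as per-device values is an assumption.

```python
# Sketch mapping the reported hyperparameters onto TrainingArguments.
# output_dir is a placeholder; batch sizes are read as per-device (assumption).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_8e-3_10_0.1",     # placeholder path
    learning_rate=8e-3,             # 0.008
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    num_train_epochs=100.0,
    lr_scheduler_type="linear",
    adam_beta1=0.9,                 # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,              # epsilon=1e-08
)
```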

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 1.8619        | 1.0   | 590   | 1.0251          | 0.4685   |
| 1.3275        | 2.0   | 1180  | 1.3329          | 0.3795   |
| 1.2711        | 3.0   | 1770  | 1.3427          | 0.3817   |
| 1.2563        | 4.0   | 2360  | 0.9486          | 0.6352   |
| 1.3677        | 5.0   | 2950  | 1.5968          | 0.4266   |
| 1.2101        | 6.0   | 3540  | 2.8999          | 0.6217   |
| 1.2131        | 7.0   | 4130  | 1.7592          | 0.4410   |
| 1.0951        | 8.0   | 4720  | 1.0889          | 0.6535   |
| 1.1265        | 9.0   | 5310  | 1.6306          | 0.4963   |
| 1.0834        | 10.0  | 5900  | 0.8228          | 0.6789   |
| 0.9934        | 11.0  | 6490  | 0.9519          | 0.6789   |
| 0.9867        | 12.0  | 7080  | 1.2001          | 0.6471   |
| 0.9321        | 13.0  | 7670  | 0.7980          | 0.6850   |
| 0.914         | 14.0  | 8260  | 0.7659          | 0.7092   |
| 0.9005        | 15.0  | 8850  | 0.8234          | 0.7104   |
| 0.8728        | 16.0  | 9440  | 0.9553          | 0.6948   |
| 0.7346        | 17.0  | 10030 | 2.0394          | 0.5012   |
| 0.8001        | 18.0  | 10620 | 1.2116          | 0.6180   |
| 0.8778        | 19.0  | 11210 | 0.8516          | 0.6823   |
| 0.7117        | 20.0  | 11800 | 1.1178          | 0.6251   |
| 0.6709        | 21.0  | 12390 | 0.8929          | 0.7125   |
| 0.7554        | 22.0  | 12980 | 0.9317          | 0.6801   |
| 0.7167        | 23.0  | 13570 | 1.3876          | 0.6061   |
| 0.6239        | 24.0  | 14160 | 0.9124          | 0.6737   |
| 0.6273        | 25.0  | 14750 | 0.8818          | 0.7242   |
| 0.5882        | 26.0  | 15340 | 1.0614          | 0.6728   |
| 0.5567        | 27.0  | 15930 | 1.0177          | 0.7306   |
| 0.5606        | 28.0  | 16520 | 1.3018          | 0.6459   |
| 0.5559        | 29.0  | 17110 | 1.4926          | 0.6914   |
| 0.4879        | 30.0  | 17700 | 0.9648          | 0.6924   |
| 0.4945        | 31.0  | 18290 | 0.9028          | 0.7150   |
| 0.4876        | 32.0  | 18880 | 0.8188          | 0.7257   |
| 0.455         | 33.0  | 19470 | 1.0325          | 0.7312   |
| 0.468         | 34.0  | 20060 | 0.9495          | 0.7330   |
| 0.4324        | 35.0  | 20650 | 0.8765          | 0.7202   |
| 0.4098        | 36.0  | 21240 | 1.5105          | 0.6963   |
| 0.4002        | 37.0  | 21830 | 0.9019          | 0.7309   |
| 0.4077        | 38.0  | 22420 | 0.8470          | 0.7223   |
| 0.378         | 39.0  | 23010 | 0.9477          | 0.7196   |
| 0.3697        | 40.0  | 23600 | 0.9213          | 0.7226   |
| 0.3957        | 41.0  | 24190 | 0.9321          | 0.7260   |
| 0.338         | 42.0  | 24780 | 0.8633          | 0.7284   |
| 0.343         | 43.0  | 25370 | 0.9502          | 0.7355   |
| 0.3454        | 44.0  | 25960 | 1.1264          | 0.6930   |
| 0.3288        | 45.0  | 26550 | 1.5310          | 0.6440   |
| 0.3075        | 46.0  | 27140 | 1.0321          | 0.7067   |
| 0.326         | 47.0  | 27730 | 1.0041          | 0.7257   |
| 0.3035        | 48.0  | 28320 | 0.9984          | 0.7168   |
| 0.3318        | 49.0  | 28910 | 0.9336          | 0.7294   |
| 0.2923        | 50.0  | 29500 | 1.2029          | 0.6758   |
| 0.2813        | 51.0  | 30090 | 0.9525          | 0.7217   |
| 0.2844        | 52.0  | 30680 | 1.0021          | 0.7242   |
| 0.2706        | 53.0  | 31270 | 0.9836          | 0.7187   |
| 0.2748        | 54.0  | 31860 | 0.9966          | 0.7113   |
| 0.2585        | 55.0  | 32450 | 1.0029          | 0.7211   |
| 0.2603        | 56.0  | 33040 | 0.9700          | 0.7235   |
| 0.2442        | 57.0  | 33630 | 0.9675          | 0.7330   |
| 0.2503        | 58.0  | 34220 | 1.0088          | 0.7373   |
| 0.2473        | 59.0  | 34810 | 0.9043          | 0.7306   |
| 0.2503        | 60.0  | 35400 | 1.0069          | 0.7211   |
| 0.233         | 61.0  | 35990 | 1.0046          | 0.7245   |
| 0.2248        | 62.0  | 36580 | 1.0468          | 0.7217   |
| 0.2343        | 63.0  | 37170 | 0.9263          | 0.7202   |
| 0.2312        | 64.0  | 37760 | 1.1075          | 0.7101   |
| 0.2173        | 65.0  | 38350 | 1.0439          | 0.7205   |
| 0.2138        | 66.0  | 38940 | 1.1012          | 0.7364   |
| 0.2037        | 67.0  | 39530 | 1.0094          | 0.7336   |
| 0.2129        | 68.0  | 40120 | 0.9811          | 0.7275   |
| 0.1937        | 69.0  | 40710 | 1.0312          | 0.7419   |
| 0.2102        | 70.0  | 41300 | 1.0208          | 0.7318   |
| 0.2078        | 71.0  | 41890 | 1.0093          | 0.7174   |
| 0.2037        | 72.0  | 42480 | 1.1041          | 0.7404   |
| 0.1903        | 73.0  | 43070 | 0.9927          | 0.7318   |
| 0.1898        | 74.0  | 43660 | 1.0875          | 0.7431   |
| 0.1966        | 75.0  | 44250 | 0.9659          | 0.7257   |
| 0.1967        | 76.0  | 44840 | 1.0025          | 0.7254   |
| 0.191         | 77.0  | 45430 | 0.9488          | 0.7306   |
| 0.1916        | 78.0  | 46020 | 1.0042          | 0.7327   |
| 0.1819        | 79.0  | 46610 | 1.0258          | 0.7355   |
| 0.1794        | 80.0  | 47200 | 1.0124          | 0.7309   |
| 0.1773        | 81.0  | 47790 | 0.9920          | 0.7324   |
| 0.1852        | 82.0  | 48380 | 1.0088          | 0.7367   |
| 0.1809        | 83.0  | 48970 | 1.0702          | 0.7352   |
| 0.1695        | 84.0  | 49560 | 1.0249          | 0.7260   |
| 0.1704        | 85.0  | 50150 | 1.0086          | 0.7294   |
| 0.1698        | 86.0  | 50740 | 1.0465          | 0.7318   |
| 0.1609        | 87.0  | 51330 | 1.0387          | 0.7291   |
| 0.1654        | 88.0  | 51920 | 1.0260          | 0.7297   |
| 0.1589        | 89.0  | 52510 | 1.0342          | 0.7257   |
| 0.1624        | 90.0  | 53100 | 1.0773          | 0.7297   |
| 0.1633        | 91.0  | 53690 | 1.0567          | 0.7309   |
| 0.1593        | 92.0  | 54280 | 1.0176          | 0.7196   |
| 0.1558        | 93.0  | 54870 | 1.0428          | 0.7257   |
| 0.1536        | 94.0  | 55460 | 1.0158          | 0.7294   |
| 0.1559        | 95.0  | 56050 | 1.0159          | 0.7315   |
| 0.1577        | 96.0  | 56640 | 1.0299          | 0.7306   |
| 0.1518        | 97.0  | 57230 | 1.0132          | 0.7281   |
| 0.1477        | 98.0  | 57820 | 0.9931          | 0.7266   |
| 0.1529        | 99.0  | 58410 | 1.0248          | 0.7272   |
| 0.1445        | 100.0 | 59000 | 1.0109          | 0.7272   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
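
A quick sanity check that a local environment matches these versions; the expected strings are copied from the list above.

```python
# Compare installed package versions against those reported in this card.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.30.0",
    "torch": "2.0.1+cu117",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    have = installed[name]
    status = "OK" if have == want else "MISMATCH"
    print(f"{name}: installed {have}, card reports {want} [{status}]")
```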