20230824210912

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a brief usage sketch follows the list):

  • Loss: 1.0575
  • Accuracy: 0.7401
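
The card does not yet document intended usage, so the following is only a minimal loading sketch. It assumes a sequence-classification head; the card does not say which SuperGLUE task the model was fine-tuned on, so the example sentence pair and the meaning of the output classes are hypothetical.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "dkqjrm/20230824210912"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Encode a sentence pair (the task is unspecified in the card, so this
# premise/hypothesis pairing is only an illustration) and predict.
inputs = tokenizer(
    "The cat sat on the mat.",
    "A cat is on a mat.",
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
# Class probabilities; what each class means depends on the (unstated) task.
print(logits.softmax(dim=-1))
```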

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a Trainer sketch follows the list):

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
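
A minimal sketch of how these settings map onto Hugging Face TrainingArguments, assuming a standard Trainer setup; output_dir is a placeholder, and per-epoch evaluation is inferred from the results table below (one validation row per epoch). The Adam betas and epsilon listed above are the optimizer defaults, so they need no explicit arguments.

```python
from transformers import TrainingArguments

# Sketch of TrainingArguments matching the hyperparameters above.
args = TrainingArguments(
    output_dir="./20230824210912",   # placeholder path
    learning_rate=5e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",     # assumed from the per-epoch results table
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the optimizer defaults.
)
```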

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 156   | 0.8273          | 0.5307   |
| No log        | 2.0   | 312   | 1.1309          | 0.4729   |
| No log        | 3.0   | 468   | 0.8140          | 0.4765   |
| 0.9525        | 4.0   | 624   | 0.6978          | 0.5776   |
| 0.9525        | 5.0   | 780   | 0.6845          | 0.5704   |
| 0.9525        | 6.0   | 936   | 0.6365          | 0.6282   |
| 0.8192        | 7.0   | 1092  | 0.8362          | 0.6354   |
| 0.8192        | 8.0   | 1248  | 0.5976          | 0.6859   |
| 0.8192        | 9.0   | 1404  | 0.6788          | 0.6751   |
| 0.7543        | 10.0  | 1560  | 0.6672          | 0.6606   |
| 0.7543        | 11.0  | 1716  | 0.6932          | 0.5776   |
| 0.7543        | 12.0  | 1872  | 0.6756          | 0.6895   |
| 0.6718        | 13.0  | 2028  | 0.6336          | 0.7292   |
| 0.6718        | 14.0  | 2184  | 0.6149          | 0.7256   |
| 0.6718        | 15.0  | 2340  | 0.7579          | 0.6570   |
| 0.6718        | 16.0  | 2496  | 0.8701          | 0.6137   |
| 0.6043        | 17.0  | 2652  | 0.5931          | 0.7256   |
| 0.6043        | 18.0  | 2808  | 0.5982          | 0.7256   |
| 0.6043        | 19.0  | 2964  | 0.6829          | 0.7148   |
| 0.5842        | 20.0  | 3120  | 1.3393          | 0.6354   |
| 0.5842        | 21.0  | 3276  | 0.7701          | 0.6823   |
| 0.5842        | 22.0  | 3432  | 0.7801          | 0.6679   |
| 0.5907        | 23.0  | 3588  | 0.6225          | 0.7401   |
| 0.5907        | 24.0  | 3744  | 0.7348          | 0.7292   |
| 0.5907        | 25.0  | 3900  | 0.7832          | 0.6859   |
| 0.5013        | 26.0  | 4056  | 0.5946          | 0.7329   |
| 0.5013        | 27.0  | 4212  | 0.6441          | 0.7365   |
| 0.5013        | 28.0  | 4368  | 0.6992          | 0.7112   |
| 0.4569        | 29.0  | 4524  | 0.8007          | 0.7329   |
| 0.4569        | 30.0  | 4680  | 1.1460          | 0.6643   |
| 0.4569        | 31.0  | 4836  | 1.1331          | 0.6606   |
| 0.4569        | 32.0  | 4992  | 0.7750          | 0.7220   |
| 0.4256        | 33.0  | 5148  | 0.8709          | 0.7256   |
| 0.4256        | 34.0  | 5304  | 0.8764          | 0.7184   |
| 0.4256        | 35.0  | 5460  | 0.8154          | 0.7256   |
| 0.3773        | 36.0  | 5616  | 0.8308          | 0.7329   |
| 0.3773        | 37.0  | 5772  | 0.8417          | 0.7184   |
| 0.3773        | 38.0  | 5928  | 1.1260          | 0.7401   |
| 0.3676        | 39.0  | 6084  | 0.8739          | 0.7401   |
| 0.3676        | 40.0  | 6240  | 0.7295          | 0.7509   |
| 0.3676        | 41.0  | 6396  | 1.0227          | 0.7220   |
| 0.3122        | 42.0  | 6552  | 1.2354          | 0.7184   |
| 0.3122        | 43.0  | 6708  | 0.9760          | 0.7401   |
| 0.3122        | 44.0  | 6864  | 0.8684          | 0.7329   |
| 0.3011        | 45.0  | 7020  | 0.9423          | 0.7545   |
| 0.3011        | 46.0  | 7176  | 1.0446          | 0.7401   |
| 0.3011        | 47.0  | 7332  | 1.2442          | 0.7256   |
| 0.3011        | 48.0  | 7488  | 0.8938          | 0.7292   |
| 0.2606        | 49.0  | 7644  | 1.0857          | 0.7220   |
| 0.2606        | 50.0  | 7800  | 1.1683          | 0.7148   |
| 0.2606        | 51.0  | 7956  | 0.9944          | 0.7220   |
| 0.2496        | 52.0  | 8112  | 0.9914          | 0.7401   |
| 0.2496        | 53.0  | 8268  | 1.0398          | 0.7365   |
| 0.2496        | 54.0  | 8424  | 1.2414          | 0.7256   |
| 0.2293        | 55.0  | 8580  | 1.0096          | 0.7220   |
| 0.2293        | 56.0  | 8736  | 0.9548          | 0.7365   |
| 0.2293        | 57.0  | 8892  | 1.2170          | 0.7220   |
| 0.2182        | 58.0  | 9048  | 1.1249          | 0.7220   |
| 0.2182        | 59.0  | 9204  | 1.1084          | 0.7292   |
| 0.2182        | 60.0  | 9360  | 1.0558          | 0.7292   |
| 0.2111        | 61.0  | 9516  | 1.1070          | 0.7292   |
| 0.2111        | 62.0  | 9672  | 1.1918          | 0.7473   |
| 0.2111        | 63.0  | 9828  | 1.1819          | 0.7220   |
| 0.2111        | 64.0  | 9984  | 1.1041          | 0.7437   |
| 0.2024        | 65.0  | 10140 | 1.2129          | 0.7184   |
| 0.2024        | 66.0  | 10296 | 1.0185          | 0.7437   |
| 0.2024        | 67.0  | 10452 | 0.9763          | 0.7437   |
| 0.1901        | 68.0  | 10608 | 1.0053          | 0.7292   |
| 0.1901        | 69.0  | 10764 | 1.1605          | 0.7292   |
| 0.1901        | 70.0  | 10920 | 1.3683          | 0.7220   |
| 0.1843        | 71.0  | 11076 | 1.0427          | 0.7365   |
| 0.1843        | 72.0  | 11232 | 1.1283          | 0.7437   |
| 0.1843        | 73.0  | 11388 | 1.0405          | 0.7473   |
| 0.1715        | 74.0  | 11544 | 0.9890          | 0.7509   |
| 0.1715        | 75.0  | 11700 | 1.2353          | 0.7329   |
| 0.1715        | 76.0  | 11856 | 1.0175          | 0.7365   |
| 0.1698        | 77.0  | 12012 | 1.0641          | 0.7365   |
| 0.1698        | 78.0  | 12168 | 1.0655          | 0.7292   |
| 0.1698        | 79.0  | 12324 | 1.0779          | 0.7329   |
| 0.1698        | 80.0  | 12480 | 1.0575          | 0.7401   |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
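
Matching these versions can help when reproducing the results above. A small sketch, assuming the four packages are importable, that compares the installed versions against the ones listed in this card:

```python
# Environment check: compare installed versions to the ones listed above.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.26.1",
    "torch": "2.0.1+cu118",
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    mark = "OK" if installed[name] == want else "differs"
    print(f"{name}: installed {installed[name]}, card lists {want} ({mark})")
```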