20230822202110

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. On the evaluation set it achieves the following results, which match the final epoch (60) in the training results table:

  • Loss: 0.1679
  • Accuracy: 0.7148

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
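As a rough illustration of how these settings interact, the sketch below computes the learning rate a linear schedule would produce at a given optimizer step. It assumes zero warmup steps and a decay to 0 at the final step (the card lists no warmup value), and it takes the 156 steps per epoch from the training results table. The names `HPARAMS`, `STEPS_PER_EPOCH`, and `linear_lr` are hypothetical and exist only for this example.

```python
# Hyperparameters copied from the list above.
HPARAMS = {
    "learning_rate": 0.003,
    "train_batch_size": 16,
    "eval_batch_size": 8,
    "seed": 11,
    "adam_betas": (0.9, 0.999),
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_epochs": 60,
}

# Taken from the training results table: epoch 1.0 ends at step 156.
STEPS_PER_EPOCH = 156
total_steps = STEPS_PER_EPOCH * HPARAMS["num_epochs"]


def linear_lr(step, base_lr=HPARAMS["learning_rate"], total=total_steps, warmup=0):
    """Learning rate under a linear schedule.

    Assumption: linear warmup over `warmup` steps (0 here, since the card
    does not list one), then linear decay to 0 at step `total`.
    """
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    for epoch in (0, 15, 30, 45, 60):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:4d}  lr {linear_lr(step):.6f}")
```

Under these assumptions the rate starts at 0.003, is halved by epoch 30 (step 4680), and reaches 0 at step 9360, the last step in the results table.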

Training results

Training Loss | Epoch | Step | Validation Loss | Accuracy
No log | 1.0 | 156 | 0.4220 | 0.5271
No log | 2.0 | 312 | 0.2767 | 0.4729
No log | 3.0 | 468 | 0.4345 | 0.4729
0.2507 | 4.0 | 624 | 0.2006 | 0.5343
0.2507 | 5.0 | 780 | 0.1797 | 0.4729
0.2507 | 6.0 | 936 | 0.2180 | 0.5271
0.2023 | 7.0 | 1092 | 0.1726 | 0.5054
0.2023 | 8.0 | 1248 | 0.1811 | 0.4729
0.2023 | 9.0 | 1404 | 0.1828 | 0.5451
0.2077 | 10.0 | 1560 | 0.1921 | 0.5343
0.2077 | 11.0 | 1716 | 0.1772 | 0.4838
0.2077 | 12.0 | 1872 | 0.1724 | 0.6462
0.189 | 13.0 | 2028 | 0.1718 | 0.5379
0.189 | 14.0 | 2184 | 0.1728 | 0.5126
0.189 | 15.0 | 2340 | 0.1775 | 0.5126
0.189 | 16.0 | 2496 | 0.1813 | 0.5596
0.1803 | 17.0 | 2652 | 0.1739 | 0.6318
0.1803 | 18.0 | 2808 | 0.1718 | 0.6137
0.1803 | 19.0 | 2964 | 0.1711 | 0.6390
0.1791 | 20.0 | 3120 | 0.1797 | 0.5957
0.1791 | 21.0 | 3276 | 0.1710 | 0.6859
0.1791 | 22.0 | 3432 | 0.1729 | 0.6643
0.1781 | 23.0 | 3588 | 0.1701 | 0.6823
0.1781 | 24.0 | 3744 | 0.1706 | 0.6390
0.1781 | 25.0 | 3900 | 0.1708 | 0.6859
0.1765 | 26.0 | 4056 | 0.1697 | 0.6643
0.1765 | 27.0 | 4212 | 0.1698 | 0.6715
0.1765 | 28.0 | 4368 | 0.1710 | 0.6426
0.176 | 29.0 | 4524 | 0.1710 | 0.6931
0.176 | 30.0 | 4680 | 0.1703 | 0.6968
0.176 | 31.0 | 4836 | 0.1725 | 0.6570
0.176 | 32.0 | 4992 | 0.1699 | 0.6715
0.1749 | 33.0 | 5148 | 0.1710 | 0.6895
0.1749 | 34.0 | 5304 | 0.1694 | 0.7220
0.1749 | 35.0 | 5460 | 0.1700 | 0.6534
0.1739 | 36.0 | 5616 | 0.1690 | 0.7112
0.1739 | 37.0 | 5772 | 0.1685 | 0.7220
0.1739 | 38.0 | 5928 | 0.1696 | 0.7040
0.1738 | 39.0 | 6084 | 0.1688 | 0.7148
0.1738 | 40.0 | 6240 | 0.1692 | 0.7220
0.1738 | 41.0 | 6396 | 0.1683 | 0.7365
0.1726 | 42.0 | 6552 | 0.1690 | 0.6679
0.1726 | 43.0 | 6708 | 0.1679 | 0.7076
0.1726 | 44.0 | 6864 | 0.1691 | 0.7184
0.1719 | 45.0 | 7020 | 0.1692 | 0.7292
0.1719 | 46.0 | 7176 | 0.1685 | 0.7329
0.1719 | 47.0 | 7332 | 0.1684 | 0.7184
0.1719 | 48.0 | 7488 | 0.1690 | 0.7112
0.1712 | 49.0 | 7644 | 0.1690 | 0.7292
0.1712 | 50.0 | 7800 | 0.1685 | 0.6931
0.1712 | 51.0 | 7956 | 0.1680 | 0.7256
0.1705 | 52.0 | 8112 | 0.1687 | 0.7076
0.1705 | 53.0 | 8268 | 0.1685 | 0.7184
0.1705 | 54.0 | 8424 | 0.1689 | 0.7365
0.1705 | 55.0 | 8580 | 0.1677 | 0.7148
0.1705 | 56.0 | 8736 | 0.1694 | 0.7220
0.1705 | 57.0 | 8892 | 0.1682 | 0.7256
0.1692 | 58.0 | 9048 | 0.1684 | 0.7148
0.1692 | 59.0 | 9204 | 0.1679 | 0.7148
0.1692 | 60.0 | 9360 | 0.1679 | 0.7148
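The results reported at the top of the card match the last epoch, not the best one in the table. As a small sketch of how one might compare checkpoints from a log like this, the snippet below uses a handful of rows copied from the table, including the rows with the highest accuracy and the lowest validation loss. The variable names are hypothetical, and whether the intermediate checkpoints were saved is not stated in the card.

```python
# (epoch, validation_loss, accuracy) rows copied from the training results
# table; only a subset is listed here for brevity.
rows = [
    (34, 0.1694, 0.7220),
    (41, 0.1683, 0.7365),
    (54, 0.1689, 0.7365),
    (55, 0.1677, 0.7148),
    (60, 0.1679, 0.7148),
]

# max()/min() return the first matching row when values tie, so epoch 41
# is chosen over epoch 54 for accuracy.
best_by_accuracy = max(rows, key=lambda r: r[2])
best_by_loss = min(rows, key=lambda r: r[1])
final = rows[-1]

# Gap between the best logged accuracy and the final-epoch accuracy.
accuracy_gap = round(best_by_accuracy[2] - final[2], 4)

if __name__ == "__main__":
    print("best accuracy:", best_by_accuracy)
    print("lowest validation loss:", best_by_loss)
    print("final epoch:", final)
    print("accuracy gap vs. final:", accuracy_gap)
```

Accuracy swings by several points between adjacent epochs while validation loss barely moves, so the choice of selection metric changes which epoch looks best.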

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3