20230822202056

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1724
  • Accuracy: 0.7112
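
The checkpoint is published as dkqjrm/20230822202056 and can be loaded with the standard Transformers auto classes. A minimal loading sketch follows; the card does not document which SuperGLUE subtask or head type was used, so AutoModelForSequenceClassification and the paired-sentence input below are assumptions, not the documented usage:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230822202056"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Placeholder sentence pair; the actual input format depends on the
# (undocumented) SuperGLUE subtask this model was fine-tuned on.
inputs = tokenizer("first sentence", "second sentence", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```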

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
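
As a hedged reconstruction, these settings map onto `transformers.TrainingArguments` roughly as follows; `output_dir` and the per-epoch evaluation cadence are assumptions inferred from the results table, not stated in the card:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230822202056",   # assumed, not documented
    learning_rate=3e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    num_train_epochs=60.0,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",   # inferred from the per-epoch results below
)
```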

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.1785          | 0.5307   |
| 0.2552        | 2.0   | 624   | 0.1826          | 0.5054   |
| 0.2552        | 3.0   | 936   | 0.3328          | 0.4729   |
| 0.24          | 4.0   | 1248  | 0.2050          | 0.4729   |
| 0.2369        | 5.0   | 1560  | 0.1750          | 0.6065   |
| 0.2369        | 6.0   | 1872  | 0.1752          | 0.4765   |
| 0.2199        | 7.0   | 2184  | 0.1799          | 0.5921   |
| 0.2199        | 8.0   | 2496  | 0.1896          | 0.4729   |
| 0.1955        | 9.0   | 2808  | 0.1727          | 0.6245   |
| 0.185         | 10.0  | 3120  | 0.1734          | 0.5668   |
| 0.185         | 11.0  | 3432  | 0.1781          | 0.5812   |
| 0.184         | 12.0  | 3744  | 0.1711          | 0.6318   |
| 0.1819        | 13.0  | 4056  | 0.1783          | 0.4910   |
| 0.1819        | 14.0  | 4368  | 0.1703          | 0.6534   |
| 0.1793        | 15.0  | 4680  | 0.1697          | 0.6931   |
| 0.1793        | 16.0  | 4992  | 0.1710          | 0.6643   |
| 0.179         | 17.0  | 5304  | 0.1728          | 0.6534   |
| 0.1784        | 18.0  | 5616  | 0.1712          | 0.6498   |
| 0.1784        | 19.0  | 5928  | 0.1726          | 0.6065   |
| 0.1778        | 20.0  | 6240  | 0.1720          | 0.6679   |
| 0.1761        | 21.0  | 6552  | 0.1724          | 0.6606   |
| 0.1761        | 22.0  | 6864  | 0.1792          | 0.6534   |
| 0.1761        | 23.0  | 7176  | 0.1700          | 0.6715   |
| 0.1761        | 24.0  | 7488  | 0.1698          | 0.6679   |
| 0.1748        | 25.0  | 7800  | 0.1697          | 0.6968   |
| 0.1744        | 26.0  | 8112  | 0.1729          | 0.6859   |
| 0.1744        | 27.0  | 8424  | 0.1702          | 0.6570   |
| 0.1736        | 28.0  | 8736  | 0.1708          | 0.6931   |
| 0.1723        | 29.0  | 9048  | 0.1698          | 0.6787   |
| 0.1723        | 30.0  | 9360  | 0.1799          | 0.6462   |
| 0.1735        | 31.0  | 9672  | 0.1727          | 0.6751   |
| 0.1735        | 32.0  | 9984  | 0.1732          | 0.6498   |
| 0.1722        | 33.0  | 10296 | 0.1702          | 0.6751   |
| 0.1709        | 34.0  | 10608 | 0.1707          | 0.6968   |
| 0.1709        | 35.0  | 10920 | 0.1714          | 0.6968   |
| 0.1697        | 36.0  | 11232 | 0.1712          | 0.6751   |
| 0.1696        | 37.0  | 11544 | 0.1788          | 0.6570   |
| 0.1696        | 38.0  | 11856 | 0.1703          | 0.6787   |
| 0.1697        | 39.0  | 12168 | 0.1735          | 0.6751   |
| 0.1697        | 40.0  | 12480 | 0.1740          | 0.6787   |
| 0.1683        | 41.0  | 12792 | 0.1710          | 0.6895   |
| 0.1688        | 42.0  | 13104 | 0.1724          | 0.7076   |
| 0.1688        | 43.0  | 13416 | 0.1718          | 0.7004   |
| 0.1679        | 44.0  | 13728 | 0.1736          | 0.7040   |
| 0.1681        | 45.0  | 14040 | 0.1720          | 0.7040   |
| 0.1681        | 46.0  | 14352 | 0.1717          | 0.7076   |
| 0.1664        | 47.0  | 14664 | 0.1710          | 0.6895   |
| 0.1664        | 48.0  | 14976 | 0.1766          | 0.6895   |
| 0.1662        | 49.0  | 15288 | 0.1729          | 0.7040   |
| 0.1655        | 50.0  | 15600 | 0.1704          | 0.7076   |
| 0.1655        | 51.0  | 15912 | 0.1711          | 0.7184   |
| 0.1665        | 52.0  | 16224 | 0.1709          | 0.7040   |
| 0.1651        | 53.0  | 16536 | 0.1711          | 0.6931   |
| 0.1651        | 54.0  | 16848 | 0.1736          | 0.7040   |
| 0.1646        | 55.0  | 17160 | 0.1712          | 0.7112   |
| 0.1646        | 56.0  | 17472 | 0.1740          | 0.7076   |
| 0.1647        | 57.0  | 17784 | 0.1723          | 0.7076   |
| 0.1642        | 58.0  | 18096 | 0.1715          | 0.7004   |
| 0.1642        | 59.0  | 18408 | 0.1727          | 0.7076   |
| 0.1643        | 60.0  | 18720 | 0.1724          | 0.7112   |
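
Validation accuracy peaks at 0.7184 at epoch 51; the loss and accuracy reported at the top of the card correspond to the final epoch. Below is a sketch of a `compute_metrics` function that would produce the accuracy column when passed to a `Trainer`; the metric code actually used for this run is not documented, so this is an assumption:

```python
import numpy as np
import evaluate

# Assumed metric function; the run's actual compute_metrics is not documented.
accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)
```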

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3