
20230825093306

This model (dkqjrm/20230825093306) is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1749
  • Accuracy: 0.7545

Model description

More information needed

Intended uses & limitations

More information needed
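
The card does not state which SuperGLUE task the model was tuned on. The step counts in the training results below (156 optimizer steps per epoch at batch size 16, about 2,490 training examples) are consistent with the RTE subset, but that is an inference, not documentation. Assuming a standard sequence-classification head, a minimal usage sketch looks like this:

```python
# A minimal usage sketch, not from the card. Assumptions: the checkpoint has a
# sequence-classification head, and the task is a SuperGLUE sentence-pair task
# (RTE inferred from the step counts in the training results; unverified).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "dkqjrm/20230825093306"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

premise = "The cat sat on the mat."
hypothesis = "An animal is on the mat."
inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label index
```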

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hypothetical TrainingArguments reconstruction follows the list):

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
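
As a reference only, these map onto transformers.TrainingArguments roughly as follows; the training script itself is not published, so everything beyond the listed values is an assumption:

```python
# Hypothetical reconstruction of the listed hyperparameters; the actual
# training script is not part of this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230825093306",   # assumption: output name matches the model name
    learning_rate=5e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",   # assumption: the results table logs one eval per epoch
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the library defaults
    # (adam_beta1, adam_beta2, adam_epsilon), so they need no explicit setting.
)
```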

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:---:|:---:|:---:|:---:|:---:|
| No log | 1.0 | 156 | 0.9098 | 0.5307 |
| No log | 2.0 | 312 | 1.3904 | 0.4765 |
| No log | 3.0 | 468 | 1.0371 | 0.4729 |
| 0.9793 | 4.0 | 624 | 0.6882 | 0.5090 |
| 0.9793 | 5.0 | 780 | 0.5519 | 0.5523 |
| 0.9793 | 6.0 | 936 | 0.6019 | 0.5560 |
| 0.8653 | 7.0 | 1092 | 0.6463 | 0.5596 |
| 0.8653 | 8.0 | 1248 | 0.4313 | 0.6245 |
| 0.8653 | 9.0 | 1404 | 0.3395 | 0.6787 |
| 0.7164 | 10.0 | 1560 | 0.6637 | 0.5921 |
| 0.7164 | 11.0 | 1716 | 0.2853 | 0.6859 |
| 0.7164 | 12.0 | 1872 | 0.3014 | 0.7112 |
| 0.6696 | 13.0 | 2028 | 0.3778 | 0.6895 |
| 0.6696 | 14.0 | 2184 | 0.2711 | 0.7184 |
| 0.6696 | 15.0 | 2340 | 0.2947 | 0.6643 |
| 0.6696 | 16.0 | 2496 | 0.4965 | 0.6282 |
| 0.5962 | 17.0 | 2652 | 0.3037 | 0.7184 |
| 0.5962 | 18.0 | 2808 | 0.4431 | 0.7184 |
| 0.5962 | 19.0 | 2964 | 0.2407 | 0.7184 |
| 0.5972 | 20.0 | 3120 | 0.2475 | 0.7148 |
| 0.5972 | 21.0 | 3276 | 0.2248 | 0.7329 |
| 0.5972 | 22.0 | 3432 | 0.3476 | 0.6643 |
| 0.567 | 23.0 | 3588 | 0.2318 | 0.7112 |
| 0.567 | 24.0 | 3744 | 0.3517 | 0.7292 |
| 0.567 | 25.0 | 3900 | 0.3102 | 0.6643 |
| 0.5253 | 26.0 | 4056 | 0.2331 | 0.7148 |
| 0.5253 | 27.0 | 4212 | 0.3600 | 0.7292 |
| 0.5253 | 28.0 | 4368 | 0.1932 | 0.7292 |
| 0.5076 | 29.0 | 4524 | 0.1979 | 0.7292 |
| 0.5076 | 30.0 | 4680 | 0.2349 | 0.7437 |
| 0.5076 | 31.0 | 4836 | 0.2877 | 0.6715 |
| 0.5076 | 32.0 | 4992 | 0.2023 | 0.7401 |
| 0.4592 | 33.0 | 5148 | 0.2016 | 0.7437 |
| 0.4592 | 34.0 | 5304 | 0.2073 | 0.7076 |
| 0.4592 | 35.0 | 5460 | 0.2725 | 0.7617 |
| 0.434 | 36.0 | 5616 | 0.3714 | 0.6534 |
| 0.434 | 37.0 | 5772 | 0.2117 | 0.7112 |
| 0.434 | 38.0 | 5928 | 0.2338 | 0.6968 |
| 0.4114 | 39.0 | 6084 | 0.2117 | 0.7148 |
| 0.4114 | 40.0 | 6240 | 0.2254 | 0.7148 |
| 0.4114 | 41.0 | 6396 | 0.1978 | 0.7509 |
| 0.3906 | 42.0 | 6552 | 0.1965 | 0.7401 |
| 0.3906 | 43.0 | 6708 | 0.1828 | 0.7329 |
| 0.3906 | 44.0 | 6864 | 0.1891 | 0.7473 |
| 0.3651 | 45.0 | 7020 | 0.1917 | 0.7509 |
| 0.3651 | 46.0 | 7176 | 0.1888 | 0.7329 |
| 0.3651 | 47.0 | 7332 | 0.2906 | 0.7690 |
| 0.3651 | 48.0 | 7488 | 0.1945 | 0.7365 |
| 0.3358 | 49.0 | 7644 | 0.2083 | 0.7401 |
| 0.3358 | 50.0 | 7800 | 0.1822 | 0.7437 |
| 0.3358 | 51.0 | 7956 | 0.1848 | 0.7437 |
| 0.324 | 52.0 | 8112 | 0.1706 | 0.7437 |
| 0.324 | 53.0 | 8268 | 0.2049 | 0.7365 |
| 0.324 | 54.0 | 8424 | 0.1933 | 0.7509 |
| 0.3105 | 55.0 | 8580 | 0.1782 | 0.7365 |
| 0.3105 | 56.0 | 8736 | 0.1809 | 0.7365 |
| 0.3105 | 57.0 | 8892 | 0.1788 | 0.7292 |
| 0.2976 | 58.0 | 9048 | 0.2209 | 0.7617 |
| 0.2976 | 59.0 | 9204 | 0.1784 | 0.7473 |
| 0.2976 | 60.0 | 9360 | 0.1750 | 0.7617 |
| 0.2867 | 61.0 | 9516 | 0.1884 | 0.7401 |
| 0.2867 | 62.0 | 9672 | 0.1805 | 0.7509 |
| 0.2867 | 63.0 | 9828 | 0.1828 | 0.7509 |
| 0.2867 | 64.0 | 9984 | 0.1863 | 0.7545 |
| 0.2852 | 65.0 | 10140 | 0.1818 | 0.7581 |
| 0.2852 | 66.0 | 10296 | 0.1778 | 0.7545 |
| 0.2852 | 67.0 | 10452 | 0.1908 | 0.7581 |
| 0.2663 | 68.0 | 10608 | 0.1799 | 0.7545 |
| 0.2663 | 69.0 | 10764 | 0.1808 | 0.7581 |
| 0.2663 | 70.0 | 10920 | 0.1797 | 0.7437 |
| 0.2681 | 71.0 | 11076 | 0.1835 | 0.7581 |
| 0.2681 | 72.0 | 11232 | 0.1812 | 0.7581 |
| 0.2681 | 73.0 | 11388 | 0.1799 | 0.7617 |
| 0.2564 | 74.0 | 11544 | 0.1874 | 0.7581 |
| 0.2564 | 75.0 | 11700 | 0.1766 | 0.7581 |
| 0.2564 | 76.0 | 11856 | 0.1782 | 0.7545 |
| 0.2633 | 77.0 | 12012 | 0.1772 | 0.7545 |
| 0.2633 | 78.0 | 12168 | 0.1743 | 0.7617 |
| 0.2633 | 79.0 | 12324 | 0.1749 | 0.7545 |
| 0.2633 | 80.0 | 12480 | 0.1749 | 0.7545 |
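
The final row matches the headline numbers above (validation loss 0.1749, accuracy 0.7545); the best accuracy in the run, 0.7690, occurs at epoch 47. A sketch for recomputing validation accuracy with datasets, again assuming the RTE subset:

```python
# Evaluation sketch; "rte" is an assumption (see the usage note above).
import torch
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "dkqjrm/20230825093306"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

dataset = load_dataset("super_glue", "rte", split="validation")
correct = 0
for example in dataset:
    inputs = tokenizer(example["premise"], example["hypothesis"],
                       truncation=True, return_tensors="pt")
    with torch.no_grad():
        pred = model(**inputs).logits.argmax(dim=-1).item()
    correct += int(pred == example["label"])
print(f"validation accuracy: {correct / len(dataset):.4f}")
```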

Framework versions

  • Transformers 4.26.1
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3