
20230825091928

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a usage sketch follows the metrics):

  • Loss: 0.1543
  • Accuracy: 0.7437
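
As a minimal usage sketch, the checkpoint can be loaded with the standard transformers API. This assumes the model head is a sentence-pair classifier, which the SuperGLUE setup suggests; the premise/hypothesis pair below is purely illustrative.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Repo id as shown on this page.
model_id = "dkqjrm/20230825091928"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Encode an illustrative sentence pair and take the argmax over the labels.
inputs = tokenizer("The cat sat on the mat.", "A cat is sitting.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```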

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (expressed as a TrainingArguments sketch after the list):

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
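
The same settings can be written as a TrainingArguments sketch. The output_dir is illustrative, and the model/dataset/Trainer wiring is omitted since the card does not specify it.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230825091928",     # illustrative, not from the card
    learning_rate=5e-3,              # learning_rate: 0.005
    per_device_train_batch_size=16,  # train_batch_size: 16
    per_device_eval_batch_size=8,    # eval_batch_size: 8
    seed=11,                         # seed: 11
    lr_scheduler_type="linear",      # lr_scheduler_type: linear
    num_train_epochs=80.0,           # num_epochs: 80.0
    adam_beta1=0.9,                  # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,               # and epsilon=1e-08
)
```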

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 156 | 0.6113 | 0.5307 |
| No log | 2.0 | 312 | 0.9432 | 0.4693 |
| No log | 3.0 | 468 | 0.9610 | 0.4729 |
| 0.8937 | 4.0 | 624 | 0.5415 | 0.5487 |
| 0.8937 | 5.0 | 780 | 0.4722 | 0.6209 |
| 0.8937 | 6.0 | 936 | 0.4314 | 0.6390 |
| 0.7579 | 7.0 | 1092 | 0.7937 | 0.5704 |
| 0.7579 | 8.0 | 1248 | 0.4160 | 0.6282 |
| 0.7579 | 9.0 | 1404 | 0.3071 | 0.6787 |
| 0.7059 | 10.0 | 1560 | 0.4325 | 0.6498 |
| 0.7059 | 11.0 | 1716 | 0.7958 | 0.5090 |
| 0.7059 | 12.0 | 1872 | 0.3046 | 0.6823 |
| 0.654 | 13.0 | 2028 | 0.3405 | 0.7220 |
| 0.654 | 14.0 | 2184 | 0.2875 | 0.6751 |
| 0.654 | 15.0 | 2340 | 0.4266 | 0.6426 |
| 0.654 | 16.0 | 2496 | 0.5710 | 0.5957 |
| 0.6649 | 17.0 | 2652 | 0.3009 | 0.7256 |
| 0.6649 | 18.0 | 2808 | 0.7588 | 0.6534 |
| 0.6649 | 19.0 | 2964 | 0.2785 | 0.7292 |
| 0.5523 | 20.0 | 3120 | 0.2400 | 0.6895 |
| 0.5523 | 21.0 | 3276 | 0.2582 | 0.6859 |
| 0.5523 | 22.0 | 3432 | 0.3514 | 0.6462 |
| 0.511 | 23.0 | 3588 | 0.2163 | 0.7112 |
| 0.511 | 24.0 | 3744 | 0.2226 | 0.7076 |
| 0.511 | 25.0 | 3900 | 0.2138 | 0.7148 |
| 0.4948 | 26.0 | 4056 | 0.2851 | 0.7437 |
| 0.4948 | 27.0 | 4212 | 0.2584 | 0.7220 |
| 0.4948 | 28.0 | 4368 | 0.2217 | 0.7401 |
| 0.4342 | 29.0 | 4524 | 0.2014 | 0.7076 |
| 0.4342 | 30.0 | 4680 | 0.1907 | 0.7184 |
| 0.4342 | 31.0 | 4836 | 0.2176 | 0.7076 |
| 0.4342 | 32.0 | 4992 | 0.1863 | 0.7184 |
| 0.4098 | 33.0 | 5148 | 0.1862 | 0.7292 |
| 0.4098 | 34.0 | 5304 | 0.2253 | 0.7292 |
| 0.4098 | 35.0 | 5460 | 0.1960 | 0.7256 |
| 0.3743 | 36.0 | 5616 | 0.2416 | 0.7401 |
| 0.3743 | 37.0 | 5772 | 0.1988 | 0.7292 |
| 0.3743 | 38.0 | 5928 | 0.2031 | 0.7076 |
| 0.3477 | 39.0 | 6084 | 0.1847 | 0.7292 |
| 0.3477 | 40.0 | 6240 | 0.2001 | 0.7220 |
| 0.3477 | 41.0 | 6396 | 0.1955 | 0.7401 |
| 0.3221 | 42.0 | 6552 | 0.2075 | 0.7329 |
| 0.3221 | 43.0 | 6708 | 0.1751 | 0.7365 |
| 0.3221 | 44.0 | 6864 | 0.2256 | 0.7148 |
| 0.3034 | 45.0 | 7020 | 0.1913 | 0.7329 |
| 0.3034 | 46.0 | 7176 | 0.1867 | 0.7437 |
| 0.3034 | 47.0 | 7332 | 0.1842 | 0.7292 |
| 0.3034 | 48.0 | 7488 | 0.1719 | 0.7365 |
| 0.2656 | 49.0 | 7644 | 0.1810 | 0.7617 |
| 0.2656 | 50.0 | 7800 | 0.2172 | 0.7256 |
| 0.2656 | 51.0 | 7956 | 0.2065 | 0.7545 |
| 0.2676 | 52.0 | 8112 | 0.1682 | 0.7473 |
| 0.2676 | 53.0 | 8268 | 0.1819 | 0.7329 |
| 0.2676 | 54.0 | 8424 | 0.1703 | 0.7509 |
| 0.2396 | 55.0 | 8580 | 0.1971 | 0.7509 |
| 0.2396 | 56.0 | 8736 | 0.1889 | 0.7365 |
| 0.2396 | 57.0 | 8892 | 0.2933 | 0.6968 |
| 0.2355 | 58.0 | 9048 | 0.1650 | 0.7509 |
| 0.2355 | 59.0 | 9204 | 0.1760 | 0.7473 |
| 0.2355 | 60.0 | 9360 | 0.1553 | 0.7581 |
| 0.2196 | 61.0 | 9516 | 0.1707 | 0.7437 |
| 0.2196 | 62.0 | 9672 | 0.1933 | 0.7401 |
| 0.2196 | 63.0 | 9828 | 0.1726 | 0.7401 |
| 0.2196 | 64.0 | 9984 | 0.1654 | 0.7509 |
| 0.2114 | 65.0 | 10140 | 0.1783 | 0.7401 |
| 0.2114 | 66.0 | 10296 | 0.1724 | 0.7473 |
| 0.2114 | 67.0 | 10452 | 0.1647 | 0.7473 |
| 0.208 | 68.0 | 10608 | 0.1734 | 0.7437 |
| 0.208 | 69.0 | 10764 | 0.1640 | 0.7365 |
| 0.208 | 70.0 | 10920 | 0.1953 | 0.7329 |
| 0.2014 | 71.0 | 11076 | 0.1550 | 0.7509 |
| 0.2014 | 72.0 | 11232 | 0.1781 | 0.7509 |
| 0.2014 | 73.0 | 11388 | 0.1687 | 0.7365 |
| 0.1906 | 74.0 | 11544 | 0.1695 | 0.7473 |
| 0.1906 | 75.0 | 11700 | 0.1560 | 0.7509 |
| 0.1906 | 76.0 | 11856 | 0.1532 | 0.7509 |
| 0.1864 | 77.0 | 12012 | 0.1524 | 0.7401 |
| 0.1864 | 78.0 | 12168 | 0.1537 | 0.7545 |
| 0.1864 | 79.0 | 12324 | 0.1531 | 0.7509 |
| 0.1864 | 80.0 | 12480 | 0.1543 | 0.7437 |

Framework versions

  • Transformers 4.26.1
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
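
A quick way to confirm a matching environment is to compare the installed versions against the list above; this check is illustrative and not part of the original card.

```python
# Print installed versions to compare against the card's pinned versions.
import transformers, torch, datasets, tokenizers

print(transformers.__version__)  # expected: 4.26.1
print(torch.__version__)         # expected: 2.0.1+cu118
print(datasets.__version__)      # expected: 2.12.0
print(tokenizers.__version__)    # expected: 0.13.3
```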