20230824164037

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7104
  • Accuracy: 0.7617
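
As a minimal inference sketch: the snippet below assumes the checkpoint is published under the repository id dkqjrm/20230824164037 and carries a sequence-classification head. Because the SuperGLUE subset is not documented here, the sentence pair is purely illustrative input.

```python
# Hedged example: model id taken from this card; the classification head
# and the sentence-pair input format are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "dkqjrm/20230824164037"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Most SuperGLUE tasks are sentence-pair problems; this pair is illustrative.
inputs = tokenizer(
    "The cat sat on the mat.",
    "A cat is on a mat.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # class probabilities
```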

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
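
As a sketch only: the list above matches what the transformers Trainer records, so the run could plausibly be reproduced with TrainingArguments along these lines. The output path is hypothetical, and evaluation_strategy="epoch" is inferred from the one-eval-row-per-epoch results below.

```python
# Hedged reconstruction of the hyperparameters listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230824164037",      # hypothetical output path
    learning_rate=5e-3,               # card lists 0.005
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",      # inferred: one eval row per epoch
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the Trainer defaults,
    # so no optimizer arguments need to be overridden.
)
```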

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 156 | 0.8540 | 0.5307 |
| No log | 2.0 | 312 | 0.6894 | 0.4838 |
| No log | 3.0 | 468 | 1.2065 | 0.4729 |
| 1.0004 | 4.0 | 624 | 0.6386 | 0.5487 |
| 1.0004 | 5.0 | 780 | 0.6979 | 0.5199 |
| 1.0004 | 6.0 | 936 | 0.6102 | 0.6173 |
| 0.8189 | 7.0 | 1092 | 0.9162 | 0.5848 |
| 0.8189 | 8.0 | 1248 | 0.7055 | 0.6282 |
| 0.8189 | 9.0 | 1404 | 0.5689 | 0.7004 |
| 0.7207 | 10.0 | 1560 | 1.0166 | 0.6282 |
| 0.7207 | 11.0 | 1716 | 0.8185 | 0.4946 |
| 0.7207 | 12.0 | 1872 | 0.5053 | 0.7148 |
| 0.6822 | 13.0 | 2028 | 0.5296 | 0.7184 |
| 0.6822 | 14.0 | 2184 | 0.6259 | 0.7040 |
| 0.6822 | 15.0 | 2340 | 0.9773 | 0.6426 |
| 0.6822 | 16.0 | 2496 | 0.7401 | 0.6462 |
| 0.6238 | 17.0 | 2652 | 0.4929 | 0.7148 |
| 0.6238 | 18.0 | 2808 | 0.5547 | 0.7256 |
| 0.6238 | 19.0 | 2964 | 0.5692 | 0.7220 |
| 0.5327 | 20.0 | 3120 | 0.9119 | 0.6498 |
| 0.5327 | 21.0 | 3276 | 0.6083 | 0.7004 |
| 0.5327 | 22.0 | 3432 | 0.5836 | 0.7112 |
| 0.4818 | 23.0 | 3588 | 0.5820 | 0.7292 |
| 0.4818 | 24.0 | 3744 | 0.5506 | 0.7292 |
| 0.4818 | 25.0 | 3900 | 0.6027 | 0.7256 |
| 0.4199 | 26.0 | 4056 | 0.5265 | 0.7437 |
| 0.4199 | 27.0 | 4212 | 0.6094 | 0.7076 |
| 0.4199 | 28.0 | 4368 | 0.6170 | 0.7220 |
| 0.4001 | 29.0 | 4524 | 0.5932 | 0.7329 |
| 0.4001 | 30.0 | 4680 | 0.6954 | 0.7220 |
| 0.4001 | 31.0 | 4836 | 0.6963 | 0.7437 |
| 0.4001 | 32.0 | 4992 | 0.6431 | 0.7545 |
| 0.3272 | 33.0 | 5148 | 0.9597 | 0.7040 |
| 0.3272 | 34.0 | 5304 | 0.6982 | 0.7365 |
| 0.3272 | 35.0 | 5460 | 0.6270 | 0.7437 |
| 0.2947 | 36.0 | 5616 | 1.0674 | 0.7004 |
| 0.2947 | 37.0 | 5772 | 0.8835 | 0.7256 |
| 0.2947 | 38.0 | 5928 | 0.9769 | 0.6859 |
| 0.266 | 39.0 | 6084 | 0.6855 | 0.7581 |
| 0.266 | 40.0 | 6240 | 0.7246 | 0.7509 |
| 0.266 | 41.0 | 6396 | 0.6901 | 0.7690 |
| 0.2254 | 42.0 | 6552 | 0.7170 | 0.7509 |
| 0.2254 | 43.0 | 6708 | 0.7532 | 0.7473 |
| 0.2254 | 44.0 | 6864 | 0.7347 | 0.7617 |
| 0.2188 | 45.0 | 7020 | 0.6478 | 0.7509 |
| 0.2188 | 46.0 | 7176 | 0.7903 | 0.7545 |
| 0.2188 | 47.0 | 7332 | 0.9367 | 0.7220 |
| 0.2188 | 48.0 | 7488 | 0.8417 | 0.7690 |
| 0.2166 | 49.0 | 7644 | 0.8226 | 0.7617 |
| 0.2166 | 50.0 | 7800 | 0.6278 | 0.7545 |
| 0.2166 | 51.0 | 7956 | 0.7471 | 0.7473 |
| 0.1828 | 52.0 | 8112 | 0.7728 | 0.7617 |
| 0.1828 | 53.0 | 8268 | 0.7733 | 0.7690 |
| 0.1828 | 54.0 | 8424 | 0.7554 | 0.7581 |
| 0.163 | 55.0 | 8580 | 0.8025 | 0.7653 |
| 0.163 | 56.0 | 8736 | 0.8769 | 0.7617 |
| 0.163 | 57.0 | 8892 | 0.6569 | 0.7473 |
| 0.1563 | 58.0 | 9048 | 0.7166 | 0.7653 |
| 0.1563 | 59.0 | 9204 | 0.8688 | 0.7617 |
| 0.1563 | 60.0 | 9360 | 0.7254 | 0.7617 |
| 0.1423 | 61.0 | 9516 | 0.8286 | 0.7545 |
| 0.1423 | 62.0 | 9672 | 0.7656 | 0.7545 |
| 0.1423 | 63.0 | 9828 | 0.8362 | 0.7617 |
| 0.1423 | 64.0 | 9984 | 0.7287 | 0.7617 |
| 0.1355 | 65.0 | 10140 | 0.8451 | 0.7581 |
| 0.1355 | 66.0 | 10296 | 0.6854 | 0.7617 |
| 0.1355 | 67.0 | 10452 | 0.7272 | 0.7581 |
| 0.1321 | 68.0 | 10608 | 0.6530 | 0.7617 |
| 0.1321 | 69.0 | 10764 | 0.8535 | 0.7653 |
| 0.1321 | 70.0 | 10920 | 0.7803 | 0.7653 |
| 0.1217 | 71.0 | 11076 | 0.7409 | 0.7617 |
| 0.1217 | 72.0 | 11232 | 0.7044 | 0.7617 |
| 0.1217 | 73.0 | 11388 | 0.6501 | 0.7653 |
| 0.1224 | 74.0 | 11544 | 0.7102 | 0.7617 |
| 0.1224 | 75.0 | 11700 | 0.7050 | 0.7617 |
| 0.1224 | 76.0 | 11856 | 0.7103 | 0.7617 |
| 0.1173 | 77.0 | 12012 | 0.6821 | 0.7617 |
| 0.1173 | 78.0 | 12168 | 0.7196 | 0.7617 |
| 0.1173 | 79.0 | 12324 | 0.7048 | 0.7617 |
| 0.1173 | 80.0 | 12480 | 0.7104 | 0.7617 |

Framework versions

  • Transformers 4.26.1
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3