
20230824210941

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8686
  • Accuracy: 0.7256
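
A minimal loading sketch for this checkpoint follows. It assumes the repo id dkqjrm/20230824210941 (shown at the bottom of this card) and a standard sequence-classification head; since the card does not name the super_glue subtask, a generic sentence-pair input is used.

```python
# Minimal inference sketch. Assumptions: the checkpoint is published as
# "dkqjrm/20230824210941" and exposes a standard sequence-classification
# head; the super_glue subtask (and hence the exact input format) is not
# stated on this card, so a generic sentence pair is used here.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230824210941"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer("A first sentence.", "A second sentence.",
                   return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # per-class probabilities
```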

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
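
As a reproduction aid, the settings above map onto transformers.TrainingArguments as sketched below; the actual training script is not included in this card, so output_dir and the evaluation strategy are assumptions.

```python
# Sketch mapping the listed hyperparameters onto TrainingArguments.
# output_dir and evaluation_strategy are assumptions; the per-epoch rows
# in the results table below suggest evaluation once per epoch.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="out",                  # assumed
    learning_rate=5e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",       # assumed from the per-epoch results
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-8 matches the Trainer's
# default optimizer settings, so no explicit optimizer argument is needed.
```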

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 156   | 0.8228          | 0.5307   |
| No log        | 2.0   | 312   | 0.7014          | 0.5271   |
| No log        | 3.0   | 468   | 0.9320          | 0.4657   |
| 0.9247        | 4.0   | 624   | 0.8551          | 0.5307   |
| 0.9247        | 5.0   | 780   | 0.8862          | 0.5235   |
| 0.9247        | 6.0   | 936   | 0.6306          | 0.6282   |
| 0.8754        | 7.0   | 1092  | 0.9270          | 0.5957   |
| 0.8754        | 8.0   | 1248  | 0.6627          | 0.6354   |
| 0.8754        | 9.0   | 1404  | 0.7200          | 0.6137   |
| 0.745         | 10.0  | 1560  | 0.5993          | 0.6751   |
| 0.745         | 11.0  | 1716  | 0.7300          | 0.6318   |
| 0.745         | 12.0  | 1872  | 0.7463          | 0.6823   |
| 0.6869        | 13.0  | 2028  | 0.8378          | 0.6029   |
| 0.6869        | 14.0  | 2184  | 0.6182          | 0.7076   |
| 0.6869        | 15.0  | 2340  | 0.9895          | 0.6209   |
| 0.6869        | 16.0  | 2496  | 0.7414          | 0.6859   |
| 0.6526        | 17.0  | 2652  | 0.6260          | 0.6931   |
| 0.6526        | 18.0  | 2808  | 0.5832          | 0.7365   |
| 0.6526        | 19.0  | 2964  | 0.6509          | 0.6968   |
| 0.5884        | 20.0  | 3120  | 0.7808          | 0.6751   |
| 0.5884        | 21.0  | 3276  | 0.6212          | 0.7437   |
| 0.5884        | 22.0  | 3432  | 0.8835          | 0.6354   |
| 0.5748        | 23.0  | 3588  | 0.8832          | 0.6570   |
| 0.5748        | 24.0  | 3744  | 0.8348          | 0.6679   |
| 0.5748        | 25.0  | 3900  | 0.8357          | 0.6859   |
| 0.5519        | 26.0  | 4056  | 0.5958          | 0.7256   |
| 0.5519        | 27.0  | 4212  | 0.5952          | 0.7365   |
| 0.5519        | 28.0  | 4368  | 0.6118          | 0.7256   |
| 0.5239        | 29.0  | 4524  | 0.8448          | 0.6823   |
| 0.5239        | 30.0  | 4680  | 0.6541          | 0.7112   |
| 0.5239        | 31.0  | 4836  | 0.9677          | 0.6390   |
| 0.5239        | 32.0  | 4992  | 0.7328          | 0.7076   |
| 0.4732        | 33.0  | 5148  | 0.8215          | 0.6643   |
| 0.4732        | 34.0  | 5304  | 0.7120          | 0.7112   |
| 0.4732        | 35.0  | 5460  | 0.7292          | 0.7437   |
| 0.4314        | 36.0  | 5616  | 0.7357          | 0.7220   |
| 0.4314        | 37.0  | 5772  | 1.0189          | 0.6606   |
| 0.4314        | 38.0  | 5928  | 0.7766          | 0.6787   |
| 0.4113        | 39.0  | 6084  | 0.9918          | 0.6679   |
| 0.4113        | 40.0  | 6240  | 0.8170          | 0.7329   |
| 0.4113        | 41.0  | 6396  | 0.7732          | 0.7184   |
| 0.3872        | 42.0  | 6552  | 0.7271          | 0.7653   |
| 0.3872        | 43.0  | 6708  | 0.8372          | 0.7365   |
| 0.3872        | 44.0  | 6864  | 0.8637          | 0.7148   |
| 0.3747        | 45.0  | 7020  | 0.8895          | 0.7220   |
| 0.3747        | 46.0  | 7176  | 1.3025          | 0.6931   |
| 0.3747        | 47.0  | 7332  | 0.8508          | 0.7437   |
| 0.3747        | 48.0  | 7488  | 0.9201          | 0.7220   |
| 0.3401        | 49.0  | 7644  | 1.0286          | 0.7184   |
| 0.3401        | 50.0  | 7800  | 0.8711          | 0.7365   |
| 0.3401        | 51.0  | 7956  | 1.0386          | 0.7256   |
| 0.3162        | 52.0  | 8112  | 0.8634          | 0.7401   |
| 0.3162        | 53.0  | 8268  | 0.9121          | 0.7184   |
| 0.3162        | 54.0  | 8424  | 0.8510          | 0.7292   |
| 0.3146        | 55.0  | 8580  | 0.8323          | 0.7329   |
| 0.3146        | 56.0  | 8736  | 1.1691          | 0.6968   |
| 0.3146        | 57.0  | 8892  | 0.9995          | 0.7292   |
| 0.3049        | 58.0  | 9048  | 0.8166          | 0.7184   |
| 0.3049        | 59.0  | 9204  | 1.0304          | 0.7184   |
| 0.3049        | 60.0  | 9360  | 0.8338          | 0.7184   |
| 0.2932        | 61.0  | 9516  | 0.8818          | 0.7220   |
| 0.2932        | 62.0  | 9672  | 1.0405          | 0.7184   |
| 0.2932        | 63.0  | 9828  | 0.9091          | 0.7112   |
| 0.2932        | 64.0  | 9984  | 0.9134          | 0.7256   |
| 0.2786        | 65.0  | 10140 | 0.8553          | 0.7329   |
| 0.2786        | 66.0  | 10296 | 0.9198          | 0.7365   |
| 0.2786        | 67.0  | 10452 | 0.8613          | 0.7329   |
| 0.2616        | 68.0  | 10608 | 0.8299          | 0.7292   |
| 0.2616        | 69.0  | 10764 | 0.9801          | 0.7148   |
| 0.2616        | 70.0  | 10920 | 0.8634          | 0.7256   |
| 0.2573        | 71.0  | 11076 | 0.8447          | 0.7509   |
| 0.2573        | 72.0  | 11232 | 0.8127          | 0.7437   |
| 0.2573        | 73.0  | 11388 | 0.8869          | 0.7256   |
| 0.248         | 74.0  | 11544 | 0.8170          | 0.7256   |
| 0.248         | 75.0  | 11700 | 0.9370          | 0.7220   |
| 0.248         | 76.0  | 11856 | 0.8273          | 0.7220   |
| 0.2513        | 77.0  | 12012 | 0.8745          | 0.7220   |
| 0.2513        | 78.0  | 12168 | 0.8785          | 0.7292   |
| 0.2513        | 79.0  | 12324 | 0.8585          | 0.7256   |
| 0.2513        | 80.0  | 12480 | 0.8686          | 0.7256   |
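
The epoch-80 row reproduces the loss (0.8686) and accuracy (0.7256) quoted at the top of the card. A hedged sketch for re-running that evaluation follows; the subtask name "rte" is purely an assumption (156 steps per epoch at batch size 16 implies roughly 2,496 training examples, close to rte's 2,490, but the card does not say), so adjust the subtask and field names if they differ.

```python
# Evaluation sketch. Assumptions: the subtask is "rte" (the card does not
# name it; 156 steps/epoch × batch size 16 ≈ 2,496 training examples,
# close to rte's 2,490), and the repo id is "dkqjrm/20230824210941".
import numpy as np
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer, Trainer

model_id = "dkqjrm/20230824210941"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

val = load_dataset("super_glue", "rte", split="validation")
val = val.map(lambda ex: tokenizer(ex["premise"], ex["hypothesis"],
                                   truncation=True), batched=True)

preds = Trainer(model=model, tokenizer=tokenizer).predict(val)
accuracy = (preds.predictions.argmax(-1) == np.array(val["label"])).mean()
print(f"validation accuracy: {accuracy:.4f}")
```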

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
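
A quick environment check against the versions listed above (newer releases will usually load the checkpoint too, but exact reproduction of the numbers on this card assumes these pins):

```python
# Print installed versions to compare against the pins listed above.
import datasets, tokenizers, torch, transformers

for name, module in [("Transformers", transformers), ("Pytorch", torch),
                     ("Datasets", datasets), ("Tokenizers", tokenizers)]:
    print(f"{name}: {module.__version__}")
```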

Dataset used to train dkqjrm/20230824210941

  • super_glue