20230825183857

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a minimal usage sketch follows this list):

  • Loss: 0.5542
  • Accuracy: 0.7545
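
Pending a fuller model description, here is a minimal inference sketch. It assumes the checkpoint exposes a sequence-classification head over sentence pairs (consistent with the accuracy metric above); the model id comes from this repository, while the example inputs are placeholders.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "dkqjrm/20230825183857"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Example sentence pair; replace with inputs from the actual task.
inputs = tokenizer("A premise sentence.", "A hypothesis sentence.",
                   return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class id
```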

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows this list):

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
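
For reference, the values above map onto transformers.TrainingArguments roughly as follows. This is a minimal sketch, not the author's actual script: the output directory is a placeholder, and a single-GPU run is assumed (so train_batch_size corresponds to per_device_train_batch_size).

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-large-cased-super-glue",  # hypothetical path
    learning_rate=5e-3,
    per_device_train_batch_size=16,  # assumes a single device
    per_device_eval_batch_size=8,
    seed=11,
    num_train_epochs=80.0,
    lr_scheduler_type="linear",
    evaluation_strategy="epoch",  # the results table logs validation metrics once per epoch
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the Trainer
    # defaults, so no explicit optimizer arguments are needed.
)
```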

Training results

Training Loss  Epoch  Step   Validation Loss  Accuracy
No log         1.0    156    1.3372           0.5307
No log         2.0    312    0.6864           0.5162
No log         3.0    468    0.6919           0.4874
0.9682         4.0    624    0.6674           0.5451
0.9682         5.0    780    0.6774           0.5415
0.9682         6.0    936    0.5435           0.6498
0.8254         7.0    1092   0.7442           0.5235
0.8254         8.0    1248   0.4993           0.6679
0.8254         9.0    1404   0.5592           0.6570
0.741          10.0   1560   0.6748           0.6498
0.741          11.0   1716   0.9543           0.4729
0.741          12.0   1872   0.5518           0.7004
0.6941         13.0   2028   0.4643           0.7040
0.6941         14.0   2184   0.5154           0.7220
0.6941         15.0   2340   0.5493           0.6570
0.6941         16.0   2496   0.5450           0.6570
0.6291         17.0   2652   0.5940           0.7040
0.6291         18.0   2808   0.4530           0.6931
0.6291         19.0   2964   0.5100           0.7581
0.5831         20.0   3120   0.4821           0.6751
0.5831         21.0   3276   0.7629           0.6354
0.5831         22.0   3432   0.4882           0.7437
0.5334         23.0   3588   0.4779           0.7040
0.5334         24.0   3744   0.5483           0.7365
0.5334         25.0   3900   0.4978           0.7112
0.465          26.0   4056   0.4617           0.7220
0.465          27.0   4212   0.4768           0.7545
0.465          28.0   4368   0.5384           0.7545
0.4116         29.0   4524   0.4739           0.7401
0.4116         30.0   4680   0.7430           0.6895
0.4116         31.0   4836   0.7631           0.6426
0.4116         32.0   4992   0.4750           0.7365
0.3972         33.0   5148   0.5293           0.7509
0.3972         34.0   5304   0.5111           0.7545
0.3972         35.0   5460   0.4787           0.7617
0.3632         36.0   5616   0.5954           0.7617
0.3632         37.0   5772   0.6243           0.7509
0.3632         38.0   5928   0.6147           0.7256
0.334          39.0   6084   0.4867           0.7581
0.334          40.0   6240   0.5077           0.7545
0.334          41.0   6396   0.6957           0.7112
0.2964         42.0   6552   0.5827           0.7690
0.2964         43.0   6708   0.4632           0.7617
0.2964         44.0   6864   0.5142           0.7545
0.291          45.0   7020   0.5525           0.7617
0.291          46.0   7176   0.4876           0.7581
0.291          47.0   7332   0.5730           0.7617
0.291          48.0   7488   0.5040           0.7653
0.2478         49.0   7644   0.5468           0.7545
0.2478         50.0   7800   0.5621           0.7653
0.2478         51.0   7956   0.5678           0.7545
0.2549         52.0   8112   0.5960           0.7509
0.2549         53.0   8268   0.5923           0.7437
0.2549         54.0   8424   0.5902           0.7653
0.2303         55.0   8580   0.4664           0.7617
0.2303         56.0   8736   0.5903           0.7617
0.2303         57.0   8892   0.6671           0.7329
0.2122         58.0   9048   0.5309           0.7473
0.2122         59.0   9204   0.6262           0.7581
0.2122         60.0   9360   0.5361           0.7545
0.2039         61.0   9516   0.6225           0.7545
0.2039         62.0   9672   0.6425           0.7509
0.2039         63.0   9828   0.6376           0.7365
0.2039         64.0   9984   0.6124           0.7473
0.1952         65.0   10140  0.5522           0.7401
0.1952         66.0   10296  0.6943           0.7509
0.1952         67.0   10452  0.5358           0.7653
0.1855         68.0   10608  0.5289           0.7581
0.1855         69.0   10764  0.5713           0.7545
0.1855         70.0   10920  0.5293           0.7617
0.1792         71.0   11076  0.6354           0.7617
0.1792         72.0   11232  0.5219           0.7653
0.1792         73.0   11388  0.5897           0.7581
0.1683         74.0   11544  0.5471           0.7653
0.1683         75.0   11700  0.5273           0.7653
0.1683         76.0   11856  0.5517           0.7581
0.1711         77.0   12012  0.5440           0.7653
0.1711         78.0   12168  0.5506           0.7545
0.1711         79.0   12324  0.5671           0.7581
0.1711         80.0   12480  0.5542           0.7545
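
Validation accuracy peaked at 0.7690 at epoch 42; the headline figures at the top of this card come from the final epoch (80). The card does not name the SuperGLUE task, but 156 optimizer steps per epoch at batch size 16 is consistent with RTE's 2,490 training examples, so the evaluation sketch below assumes the rte configuration; treat that task choice as an educated guess rather than documented fact.

```python
import torch
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "dkqjrm/20230825183857"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id).eval()

# super_glue/rte validation split: 277 premise/hypothesis pairs (assumed task).
dataset = load_dataset("super_glue", "rte", split="validation")

correct = 0
for example in dataset:
    inputs = tokenizer(example["premise"], example["hypothesis"],
                       truncation=True, return_tensors="pt")
    with torch.no_grad():
        pred = model(**inputs).logits.argmax(dim=-1).item()
    correct += int(pred == example["label"])

print(f"accuracy: {correct / len(dataset):.4f}")  # ~0.7545 if the task guess is right
```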

Framework versions

  • Transformers 4.26.1
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3