20230824164412

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4601
  • Accuracy: 0.7617

Model description

More information needed

Intended uses & limitations

More information needed
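
Pending more details from the author, here is a minimal inference sketch. It assumes the checkpoint is published under the repo id dkqjrm/20230824164412 (taken from this card) with a standard sequence-classification head; since the specific SuperGLUE task is not documented, the sentence-pair input below is purely illustrative.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumption: the repo id comes from this card; the SuperGLUE subset and the
# meaning of the labels are not documented, so the input pair is illustrative.
model_id = "dkqjrm/20230824164412"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer("The cat sat on the mat.", "A cat is on a mat.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label id
```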

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
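
For reference, a minimal sketch of how these values map onto transformers.TrainingArguments (4.26.1). The actual training script is not included in this card, and output_dir is a placeholder.

```python
from transformers import TrainingArguments

# Mirrors the hyperparameter list above; output_dir is a placeholder.
args = TrainingArguments(
    output_dir="out",
    learning_rate=5e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
)
```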

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 156   | 1.2623          | 0.4729   |
| No log        | 2.0   | 312   | 1.7988          | 0.4729   |
| No log        | 3.0   | 468   | 1.1894          | 0.5596   |
| 1.5423        | 4.0   | 624   | 1.1452          | 0.6029   |
| 1.5423        | 5.0   | 780   | 1.5302          | 0.5704   |
| 1.5423        | 6.0   | 936   | 1.0779          | 0.6643   |
| 1.2833        | 7.0   | 1092  | 1.3023          | 0.6643   |
| 1.2833        | 8.0   | 1248  | 1.0901          | 0.6787   |
| 1.2833        | 9.0   | 1404  | 1.0524          | 0.7040   |
| 1.137         | 10.0  | 1560  | 1.1486          | 0.7040   |
| 1.137         | 11.0  | 1716  | 0.9741          | 0.7220   |
| 1.137         | 12.0  | 1872  | 0.9392          | 0.7401   |
| 1.0902        | 13.0  | 2028  | 0.9919          | 0.7329   |
| 1.0902        | 14.0  | 2184  | 0.9693          | 0.7292   |
| 1.0902        | 15.0  | 2340  | 1.3303          | 0.6570   |
| 1.0902        | 16.0  | 2496  | 1.6827          | 0.6245   |
| 0.9851        | 17.0  | 2652  | 1.0073          | 0.7220   |
| 0.9851        | 18.0  | 2808  | 1.0058          | 0.7220   |
| 0.9851        | 19.0  | 2964  | 1.0158          | 0.7437   |
| 0.8583        | 20.0  | 3120  | 1.9128          | 0.6679   |
| 0.8583        | 21.0  | 3276  | 1.0963          | 0.7148   |
| 0.8583        | 22.0  | 3432  | 1.3230          | 0.7184   |
| 0.7482        | 23.0  | 3588  | 1.3272          | 0.7040   |
| 0.7482        | 24.0  | 3744  | 1.2003          | 0.7401   |
| 0.7482        | 25.0  | 3900  | 1.4140          | 0.7076   |
| 0.6935        | 26.0  | 4056  | 1.1536          | 0.7509   |
| 0.6935        | 27.0  | 4212  | 1.1267          | 0.7401   |
| 0.6935        | 28.0  | 4368  | 1.1595          | 0.7473   |
| 0.6056        | 29.0  | 4524  | 1.4403          | 0.7401   |
| 0.6056        | 30.0  | 4680  | 1.3101          | 0.7617   |
| 0.6056        | 31.0  | 4836  | 1.8018          | 0.7040   |
| 0.6056        | 32.0  | 4992  | 1.1681          | 0.7653   |
| 0.5191        | 33.0  | 5148  | 1.5214          | 0.7690   |
| 0.5191        | 34.0  | 5304  | 1.2349          | 0.7509   |
| 0.5191        | 35.0  | 5460  | 1.3993          | 0.7437   |
| 0.4549        | 36.0  | 5616  | 1.5260          | 0.7040   |
| 0.4549        | 37.0  | 5772  | 1.5437          | 0.7401   |
| 0.4549        | 38.0  | 5928  | 1.4679          | 0.7401   |
| 0.4181        | 39.0  | 6084  | 1.5237          | 0.7437   |
| 0.4181        | 40.0  | 6240  | 1.2788          | 0.7545   |
| 0.4181        | 41.0  | 6396  | 1.2741          | 0.7581   |
| 0.3694        | 42.0  | 6552  | 1.4069          | 0.7653   |
| 0.3694        | 43.0  | 6708  | 1.6243          | 0.7473   |
| 0.3694        | 44.0  | 6864  | 1.5139          | 0.7509   |
| 0.3477        | 45.0  | 7020  | 1.3648          | 0.7617   |
| 0.3477        | 46.0  | 7176  | 1.3082          | 0.7581   |
| 0.3477        | 47.0  | 7332  | 1.3837          | 0.7509   |
| 0.3477        | 48.0  | 7488  | 1.4072          | 0.7726   |
| 0.3048        | 49.0  | 7644  | 1.3494          | 0.7690   |
| 0.3048        | 50.0  | 7800  | 1.5970          | 0.7437   |
| 0.3048        | 51.0  | 7956  | 1.5230          | 0.7509   |
| 0.2879        | 52.0  | 8112  | 1.4555          | 0.7690   |
| 0.2879        | 53.0  | 8268  | 1.6442          | 0.7437   |
| 0.2879        | 54.0  | 8424  | 1.4267          | 0.7473   |
| 0.2545        | 55.0  | 8580  | 1.4977          | 0.7473   |
| 0.2545        | 56.0  | 8736  | 1.5389          | 0.7509   |
| 0.2545        | 57.0  | 8892  | 1.2889          | 0.7581   |
| 0.2434        | 58.0  | 9048  | 1.5166          | 0.7545   |
| 0.2434        | 59.0  | 9204  | 1.5143          | 0.7581   |
| 0.2434        | 60.0  | 9360  | 1.6968          | 0.7437   |
| 0.2309        | 61.0  | 9516  | 1.6144          | 0.7545   |
| 0.2309        | 62.0  | 9672  | 1.5494          | 0.7581   |
| 0.2309        | 63.0  | 9828  | 1.4832          | 0.7545   |
| 0.2309        | 64.0  | 9984  | 1.4073          | 0.7581   |
| 0.2194        | 65.0  | 10140 | 1.4524          | 0.7581   |
| 0.2194        | 66.0  | 10296 | 1.4490          | 0.7509   |
| 0.2194        | 67.0  | 10452 | 1.5948          | 0.7545   |
| 0.2037        | 68.0  | 10608 | 1.5180          | 0.7653   |
| 0.2037        | 69.0  | 10764 | 1.6394          | 0.7581   |
| 0.2037        | 70.0  | 10920 | 1.5999          | 0.7617   |
| 0.2017        | 71.0  | 11076 | 1.3414          | 0.7653   |
| 0.2017        | 72.0  | 11232 | 1.4794          | 0.7617   |
| 0.2017        | 73.0  | 11388 | 1.3894          | 0.7653   |
| 0.1889        | 74.0  | 11544 | 1.3723          | 0.7690   |
| 0.1889        | 75.0  | 11700 | 1.4901          | 0.7581   |
| 0.1889        | 76.0  | 11856 | 1.4329          | 0.7617   |
| 0.1929        | 77.0  | 12012 | 1.4548          | 0.7653   |
| 0.1929        | 78.0  | 12168 | 1.4404          | 0.7617   |
| 0.1929        | 79.0  | 12324 | 1.4248          | 0.7653   |
| 0.1929        | 80.0  | 12480 | 1.4601          | 0.7617   |
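
Note that validation loss bottoms out at epoch 12 (0.9392) and accuracy peaks at epoch 48 (0.7726), while training loss keeps falling through epoch 80, a typical overfitting pattern. The card does not say whether best-checkpoint selection was used; below is a hedged sketch of how it could be added with the standard Trainer API. BoolQ is assumed purely for illustration, since the actual SuperGLUE task is not documented.

```python
import numpy as np
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

# Assumption: the card does not name the SuperGLUE task; BoolQ is illustrative.
raw = load_dataset("super_glue", "boolq")
tokenizer = AutoTokenizer.from_pretrained("bert-large-cased")

def tokenize(batch):
    return tokenizer(batch["question"], batch["passage"], truncation=True)

data = raw.map(tokenize, batched=True)
model = AutoModelForSequenceClassification.from_pretrained("bert-large-cased", num_labels=2)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": (np.argmax(logits, axis=-1) == labels).mean()}

args = TrainingArguments(
    output_dir="out",
    evaluation_strategy="epoch",      # evaluate once per epoch, as in the table above
    save_strategy="epoch",            # checkpoint cadence must match evaluation
    load_best_model_at_end=True,      # reload the checkpoint with the best accuracy
    metric_for_best_model="accuracy",
    num_train_epochs=80.0,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=data["train"],
    eval_dataset=data["validation"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=5)],
)
trainer.train()
```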

Framework versions

  • Transformers 4.26.1
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3