20230825024137

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. After the final training epoch (epoch 80, step 12480), it achieves the following results on the evaluation set:

  • Loss: 0.5934
  • Accuracy: 0.7545

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
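
The hyperparameters above could be written as keyword arguments that the Transformers `TrainingArguments` class accepts. This is a minimal sketch, not the original training script. It assumes single-device training (so the listed batch sizes are per-device values) and no warmup steps (none are listed), and it takes the total of 12480 optimizer steps from the training results. The helper `linear_lr` is a hypothetical name used only to illustrate how a linear schedule without warmup decays the learning rate.

```python
# Hyperparameters from this card, keyed by the argument names that
# transformers.TrainingArguments accepts (assumption: single device, so the
# listed batch sizes are per-device values).
training_kwargs = {
    "learning_rate": 5e-3,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 8,
    "seed": 11,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 80.0,
}

TOTAL_STEPS = 12480  # final step reported in the training results


def linear_lr(step, base_lr=training_kwargs["learning_rate"], total_steps=TOTAL_STEPS):
    """Learning rate at `step` under linear decay to zero (assumes no warmup)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


# Learning rate at the start, midpoint, and end of training.
for step in (0, 6240, 12480):
    print(step, linear_lr(step))
```

With Transformers installed, these values could be passed as `TrainingArguments(output_dir="out", **training_kwargs)`, where `output_dir` is a placeholder path.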

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| No log | 1.0 | 156 | 0.7579 | 0.5307 |
| No log | 2.0 | 312 | 0.8311 | 0.4838 |
| No log | 3.0 | 468 | 0.8071 | 0.4838 |
| 0.9372 | 4.0 | 624 | 0.6483 | 0.5632 |
| 0.9372 | 5.0 | 780 | 0.6240 | 0.5740 |
| 0.9372 | 6.0 | 936 | 0.6779 | 0.5343 |
| 0.9135 | 7.0 | 1092 | 0.8693 | 0.5632 |
| 0.9135 | 8.0 | 1248 | 0.6308 | 0.6245 |
| 0.9135 | 9.0 | 1404 | 0.6566 | 0.6462 |
| 0.7837 | 10.0 | 1560 | 0.5220 | 0.6787 |
| 0.7837 | 11.0 | 1716 | 0.6467 | 0.6390 |
| 0.7837 | 12.0 | 1872 | 0.5238 | 0.7220 |
| 0.6625 | 13.0 | 2028 | 0.5079 | 0.7040 |
| 0.6625 | 14.0 | 2184 | 0.5625 | 0.7148 |
| 0.6625 | 15.0 | 2340 | 0.4786 | 0.7148 |
| 0.6625 | 16.0 | 2496 | 0.7720 | 0.6426 |
| 0.6308 | 17.0 | 2652 | 0.4866 | 0.7004 |
| 0.6308 | 18.0 | 2808 | 0.4569 | 0.7329 |
| 0.6308 | 19.0 | 2964 | 0.4564 | 0.7329 |
| 0.6613 | 20.0 | 3120 | 0.6097 | 0.6823 |
| 0.6613 | 21.0 | 3276 | 0.5519 | 0.7112 |
| 0.6613 | 22.0 | 3432 | 0.6481 | 0.6679 |
| 0.5641 | 23.0 | 3588 | 0.5730 | 0.7040 |
| 0.5641 | 24.0 | 3744 | 0.5306 | 0.7076 |
| 0.5641 | 25.0 | 3900 | 0.9908 | 0.6606 |
| 0.5287 | 26.0 | 4056 | 0.4475 | 0.7545 |
| 0.5287 | 27.0 | 4212 | 0.4697 | 0.7473 |
| 0.5287 | 28.0 | 4368 | 0.5206 | 0.7040 |
| 0.5013 | 29.0 | 4524 | 0.4780 | 0.7401 |
| 0.5013 | 30.0 | 4680 | 0.6273 | 0.6787 |
| 0.5013 | 31.0 | 4836 | 0.6055 | 0.7076 |
| 0.5013 | 32.0 | 4992 | 0.4938 | 0.7401 |
| 0.4549 | 33.0 | 5148 | 0.5795 | 0.6931 |
| 0.4549 | 34.0 | 5304 | 0.5363 | 0.7473 |
| 0.4549 | 35.0 | 5460 | 0.6151 | 0.7473 |
| 0.4277 | 36.0 | 5616 | 0.6209 | 0.7184 |
| 0.4277 | 37.0 | 5772 | 0.6833 | 0.7365 |
| 0.4277 | 38.0 | 5928 | 0.5973 | 0.7220 |
| 0.4108 | 39.0 | 6084 | 0.5932 | 0.7581 |
| 0.4108 | 40.0 | 6240 | 0.4805 | 0.7437 |
| 0.4108 | 41.0 | 6396 | 0.5420 | 0.7401 |
| 0.3987 | 42.0 | 6552 | 0.5820 | 0.7617 |
| 0.3987 | 43.0 | 6708 | 0.5805 | 0.7292 |
| 0.3987 | 44.0 | 6864 | 0.6143 | 0.7473 |
| 0.3785 | 45.0 | 7020 | 0.5329 | 0.7292 |
| 0.3785 | 46.0 | 7176 | 0.7527 | 0.7329 |
| 0.3785 | 47.0 | 7332 | 0.7544 | 0.7256 |
| 0.3785 | 48.0 | 7488 | 0.6422 | 0.7292 |
| 0.3435 | 49.0 | 7644 | 0.7194 | 0.7401 |
| 0.3435 | 50.0 | 7800 | 0.5689 | 0.7401 |
| 0.3435 | 51.0 | 7956 | 0.5635 | 0.7329 |
| 0.3287 | 52.0 | 8112 | 0.6496 | 0.7473 |
| 0.3287 | 53.0 | 8268 | 0.6724 | 0.7220 |
| 0.3287 | 54.0 | 8424 | 0.7439 | 0.7220 |
| 0.3222 | 55.0 | 8580 | 0.5962 | 0.7365 |
| 0.3222 | 56.0 | 8736 | 0.5646 | 0.7437 |
| 0.3222 | 57.0 | 8892 | 0.6753 | 0.7401 |
| 0.2983 | 58.0 | 9048 | 0.5726 | 0.7401 |
| 0.2983 | 59.0 | 9204 | 0.7394 | 0.7256 |
| 0.2983 | 60.0 | 9360 | 0.5553 | 0.7473 |
| 0.2927 | 61.0 | 9516 | 0.6227 | 0.7256 |
| 0.2927 | 62.0 | 9672 | 0.6228 | 0.7365 |
| 0.2927 | 63.0 | 9828 | 0.7299 | 0.7365 |
| 0.2927 | 64.0 | 9984 | 0.6317 | 0.7329 |
| 0.2846 | 65.0 | 10140 | 0.5696 | 0.7401 |
| 0.2846 | 66.0 | 10296 | 0.6101 | 0.7509 |
| 0.2846 | 67.0 | 10452 | 0.5972 | 0.7437 |
| 0.266 | 68.0 | 10608 | 0.5472 | 0.7401 |
| 0.266 | 69.0 | 10764 | 0.6013 | 0.7437 |
| 0.266 | 70.0 | 10920 | 0.6242 | 0.7256 |
| 0.257 | 71.0 | 11076 | 0.5784 | 0.7509 |
| 0.257 | 72.0 | 11232 | 0.6293 | 0.7581 |
| 0.257 | 73.0 | 11388 | 0.6099 | 0.7509 |
| 0.2453 | 74.0 | 11544 | 0.6221 | 0.7401 |
| 0.2453 | 75.0 | 11700 | 0.6113 | 0.7437 |
| 0.2453 | 76.0 | 11856 | 0.5898 | 0.7401 |
| 0.2477 | 77.0 | 12012 | 0.5996 | 0.7545 |
| 0.2477 | 78.0 | 12168 | 0.6357 | 0.7509 |
| 0.2477 | 79.0 | 12324 | 0.5859 | 0.7509 |
| 0.2477 | 80.0 | 12480 | 0.5934 | 0.7545 |
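
The training results report one row per epoch, so the step counts and the choice of checkpoint can be checked with a little arithmetic. The sketch below uses a handful of rows copied from the results (not the full log) to show how one might pick a checkpoint by validation accuracy or by validation loss, and it derives the number of steps per epoch from the final row. The bound on the training-set size assumes a single device, no gradient accumulation, and a possibly partial last batch, none of which the card states.

```python
# (epoch, step, validation_loss, accuracy) for selected rows of the results.
rows = [
    (19, 2964, 0.4564, 0.7329),
    (26, 4056, 0.4475, 0.7545),
    (39, 6084, 0.5932, 0.7581),
    (42, 6552, 0.5820, 0.7617),
    (72, 11232, 0.6293, 0.7581),
    (80, 12480, 0.5934, 0.7545),
]

# Two common checkpoint-selection criteria.
best_by_accuracy = max(rows, key=lambda r: r[3])
best_by_loss = min(rows, key=lambda r: r[2])
print("highest accuracy:", best_by_accuracy)
print("lowest validation loss:", best_by_loss)

# Steps per epoch, derived from the final row (epoch 80, step 12480).
final_epoch, final_step = rows[-1][0], rows[-1][1]
steps_per_epoch = final_step // final_epoch

# With train_batch_size = 16, the number of training examples per epoch lies
# in this range (assumption: one device, no gradient accumulation).
train_batch_size = 16
max_examples = steps_per_epoch * train_batch_size
min_examples = (steps_per_epoch - 1) * train_batch_size + 1
print("steps per epoch:", steps_per_epoch)
print("training examples per epoch:", min_examples, "to", max_examples)
```

In the full results, as in this excerpt, the highest accuracy is 0.7617 at epoch 42 and the lowest validation loss is 0.4475 at epoch 26, while the headline metrics correspond to epoch 80.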

Framework versions

  • Transformers 4.26.1
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3

Dataset used to train dkqjrm/20230825024137