
20230822202124

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4836
  • Accuracy: 0.7437
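The card does not state which SuperGLUE task was used. The step counts in the training log below (156 optimizer steps per epoch at batch size 16, i.e. roughly 2,500 training examples) are consistent with the RTE subset (2,490 train / 277 validation examples), but this is an inference, not a documented fact. Under that assumption, the reported accuracy corresponds to about 206 of 277 correct predictions:

```python
# Assumption (not stated in the card): the task is SuperGLUE RTE,
# whose validation split has 277 examples.
val_size = 277
accuracy = 0.7437                      # reported evaluation accuracy
correct = round(accuracy * val_size)   # implied number of correct predictions
print(correct, correct / val_size)     # 206 correct -> ~0.7437
```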

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
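With 156 optimizer steps per epoch (see the results table below) and 60 epochs, training runs for 9,360 steps in total. A minimal sketch of the linear learning-rate schedule these settings imply, matching the shape of `get_linear_schedule_with_warmup` in Transformers (no warmup is listed in the card, so `warmup_steps=0` is an assumption):

```python
def linear_lr(step, base_lr=3e-3, total_steps=156 * 60, warmup_steps=0):
    """Learning rate at a given optimizer step: linear warmup (if any),
    then linear decay to 0 over the remaining steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

print(linear_lr(0))      # 0.003 at the start
print(linear_lr(4680))   # 0.0015 halfway through
print(linear_lr(9360))   # 0.0 at the final step
```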

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 156  | 0.5548          | 0.4693   |
| No log        | 2.0   | 312  | 0.5565          | 0.4838   |
| No log        | 3.0   | 468  | 0.5531          | 0.4729   |
| 0.6259        | 4.0   | 624  | 0.5810          | 0.4729   |
| 0.6259        | 5.0   | 780  | 0.6010          | 0.5596   |
| 0.6259        | 6.0   | 936  | 0.4969          | 0.6462   |
| 0.5907        | 7.0   | 1092 | 0.7982          | 0.5487   |
| 0.5907        | 8.0   | 1248 | 0.4883          | 0.6318   |
| 0.5907        | 9.0   | 1404 | 0.4714          | 0.6931   |
| 0.5602        | 10.0  | 1560 | 0.9236          | 0.5560   |
| 0.5602        | 11.0  | 1716 | 0.4972          | 0.6968   |
| 0.5602        | 12.0  | 1872 | 0.5116          | 0.6895   |
| 0.5015        | 13.0  | 2028 | 0.4913          | 0.7076   |
| 0.5015        | 14.0  | 2184 | 0.4683          | 0.7112   |
| 0.5015        | 15.0  | 2340 | 0.5265          | 0.6895   |
| 0.5015        | 16.0  | 2496 | 0.4616          | 0.7040   |
| 0.4782        | 17.0  | 2652 | 0.5788          | 0.6679   |
| 0.4782        | 18.0  | 2808 | 0.4471          | 0.7292   |
| 0.4782        | 19.0  | 2964 | 0.4588          | 0.7545   |
| 0.4628        | 20.0  | 3120 | 0.6477          | 0.6426   |
| 0.4628        | 21.0  | 3276 | 0.5305          | 0.6968   |
| 0.4628        | 22.0  | 3432 | 0.4549          | 0.7292   |
| 0.4248        | 23.0  | 3588 | 0.5101          | 0.7256   |
| 0.4248        | 24.0  | 3744 | 0.4763          | 0.7184   |
| 0.4248        | 25.0  | 3900 | 0.5809          | 0.6895   |
| 0.4067        | 26.0  | 4056 | 0.4461          | 0.7473   |
| 0.4067        | 27.0  | 4212 | 0.4460          | 0.7473   |
| 0.4067        | 28.0  | 4368 | 0.4454          | 0.7509   |
| 0.3941        | 29.0  | 4524 | 0.4664          | 0.7365   |
| 0.3941        | 30.0  | 4680 | 0.5039          | 0.7292   |
| 0.3941        | 31.0  | 4836 | 0.4548          | 0.7473   |
| 0.3941        | 32.0  | 4992 | 0.4484          | 0.7437   |
| 0.3749        | 33.0  | 5148 | 0.4924          | 0.7473   |
| 0.3749        | 34.0  | 5304 | 0.4569          | 0.7473   |
| 0.3749        | 35.0  | 5460 | 0.4604          | 0.7617   |
| 0.3586        | 36.0  | 5616 | 0.4448          | 0.7653   |
| 0.3586        | 37.0  | 5772 | 0.4768          | 0.7365   |
| 0.3586        | 38.0  | 5928 | 0.5052          | 0.7473   |
| 0.3521        | 39.0  | 6084 | 0.5167          | 0.7329   |
| 0.3521        | 40.0  | 6240 | 0.4425          | 0.7509   |
| 0.3521        | 41.0  | 6396 | 0.4730          | 0.7545   |
| 0.3407        | 42.0  | 6552 | 0.4624          | 0.7509   |
| 0.3407        | 43.0  | 6708 | 0.4847          | 0.7509   |
| 0.3407        | 44.0  | 6864 | 0.5371          | 0.7329   |
| 0.3329        | 45.0  | 7020 | 0.4841          | 0.7545   |
| 0.3329        | 46.0  | 7176 | 0.4815          | 0.7365   |
| 0.3329        | 47.0  | 7332 | 0.4678          | 0.7509   |
| 0.3329        | 48.0  | 7488 | 0.4918          | 0.7473   |
| 0.3235        | 49.0  | 7644 | 0.4592          | 0.7581   |
| 0.3235        | 50.0  | 7800 | 0.5005          | 0.7437   |
| 0.3235        | 51.0  | 7956 | 0.4777          | 0.7545   |
| 0.3193        | 52.0  | 8112 | 0.4558          | 0.7545   |
| 0.3193        | 53.0  | 8268 | 0.4870          | 0.7437   |
| 0.3193        | 54.0  | 8424 | 0.4792          | 0.7437   |
| 0.3132        | 55.0  | 8580 | 0.4673          | 0.7437   |
| 0.3132        | 56.0  | 8736 | 0.4943          | 0.7437   |
| 0.3132        | 57.0  | 8892 | 0.4970          | 0.7437   |
| 0.311         | 58.0  | 9048 | 0.4914          | 0.7401   |
| 0.311         | 59.0  | 9204 | 0.4887          | 0.7437   |
| 0.311         | 60.0  | 9360 | 0.4836          | 0.7437   |
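Note that the final checkpoint (epoch 60, accuracy 0.7437) is not the strongest one in the log: epoch 36 reaches the highest accuracy (0.7653), and epoch 40 the lowest validation loss (0.4425). A small sketch of selecting the best epoch by accuracy, using a few rows copied from the table (variable names are illustrative, not from the card):

```python
# (epoch, validation_loss, accuracy) triples copied from the results table
log = [
    (19, 0.4588, 0.7545),
    (36, 0.4448, 0.7653),
    (40, 0.4425, 0.7509),
    (60, 0.4836, 0.7437),
]

# Pick the row with the highest accuracy (index 2 of each triple)
best_epoch, best_loss, best_acc = max(log, key=lambda row: row[2])
print(best_epoch, best_acc)  # 36 0.7653
```

In practice this is what `load_best_model_at_end=True` with `metric_for_best_model="accuracy"` would do in the Transformers `Trainer`; the card does not say whether that option was enabled.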

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
