
20230822202040

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. After the final training epoch (epoch 60), it achieves the following results on the evaluation set:

  • Loss: 0.5208
  • Accuracy: 0.7365

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
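
The linear schedule listed above can be sketched in plain Python to see how the learning rate would change over the run. This is a minimal illustration, not the training code used for this model: it assumes zero warmup steps (the card lists no warmup setting) and takes 312 steps per epoch from the first row of the training results. The name `linear_lr` is a hypothetical helper, not a library function.

```python
# Hyperparameters copied from the card.
BASE_LR = 0.003
NUM_EPOCHS = 60
STEPS_PER_EPOCH = 312  # step count logged at the end of epoch 1
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 18720, the final logged step


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not state a warmup value.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in (0, 15, 30, 45, 60):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.6f}")
```

Under these assumptions the rate starts at 0.003, is halved by epoch 30, and reaches zero at step 18720.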

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| No log | 1.0 | 312 | 0.7722 | 0.5271 |
| 0.7133 | 2.0 | 624 | 0.5588 | 0.4982 |
| 0.7133 | 3.0 | 936 | 0.6273 | 0.4729 |
| 0.6364 | 4.0 | 1248 | 0.5976 | 0.4946 |
| 0.6219 | 5.0 | 1560 | 0.7382 | 0.5415 |
| 0.6219 | 6.0 | 1872 | 0.5328 | 0.6282 |
| 0.5974 | 7.0 | 2184 | 0.5253 | 0.6282 |
| 0.5974 | 8.0 | 2496 | 0.8677 | 0.5668 |
| 0.5614 | 9.0 | 2808 | 0.5249 | 0.5884 |
| 0.5732 | 10.0 | 3120 | 0.5113 | 0.6895 |
| 0.5732 | 11.0 | 3432 | 0.5092 | 0.6931 |
| 0.5559 | 12.0 | 3744 | 0.4693 | 0.7148 |
| 0.5301 | 13.0 | 4056 | 0.4781 | 0.7256 |
| 0.5301 | 14.0 | 4368 | 0.5693 | 0.6823 |
| 0.4999 | 15.0 | 4680 | 0.4649 | 0.7256 |
| 0.4999 | 16.0 | 4992 | 0.5702 | 0.6859 |
| 0.4712 | 17.0 | 5304 | 0.4598 | 0.7401 |
| 0.4431 | 18.0 | 5616 | 0.4750 | 0.7076 |
| 0.4431 | 19.0 | 5928 | 0.4782 | 0.7184 |
| 0.4348 | 20.0 | 6240 | 0.6236 | 0.6570 |
| 0.4113 | 21.0 | 6552 | 0.5125 | 0.7473 |
| 0.4113 | 22.0 | 6864 | 0.5703 | 0.6787 |
| 0.4035 | 23.0 | 7176 | 0.5080 | 0.7112 |
| 0.4035 | 24.0 | 7488 | 0.4619 | 0.7365 |
| 0.3898 | 25.0 | 7800 | 0.5639 | 0.7076 |
| 0.3736 | 26.0 | 8112 | 0.4968 | 0.7292 |
| 0.3736 | 27.0 | 8424 | 0.4483 | 0.7509 |
| 0.3708 | 28.0 | 8736 | 0.4929 | 0.7220 |
| 0.3656 | 29.0 | 9048 | 0.5168 | 0.7401 |
| 0.3656 | 30.0 | 9360 | 0.5618 | 0.7256 |
| 0.3545 | 31.0 | 9672 | 0.4900 | 0.7365 |
| 0.3545 | 32.0 | 9984 | 0.4676 | 0.7256 |
| 0.3474 | 33.0 | 10296 | 0.5222 | 0.7220 |
| 0.3326 | 34.0 | 10608 | 0.4861 | 0.7437 |
| 0.3326 | 35.0 | 10920 | 0.4560 | 0.7401 |
| 0.3313 | 36.0 | 11232 | 0.5375 | 0.7256 |
| 0.3209 | 37.0 | 11544 | 0.5606 | 0.7329 |
| 0.3209 | 38.0 | 11856 | 0.5173 | 0.7401 |
| 0.3169 | 39.0 | 12168 | 0.5060 | 0.7329 |
| 0.3169 | 40.0 | 12480 | 0.5250 | 0.7365 |
| 0.3096 | 41.0 | 12792 | 0.5133 | 0.7256 |
| 0.3097 | 42.0 | 13104 | 0.5012 | 0.7437 |
| 0.3097 | 43.0 | 13416 | 0.5274 | 0.7401 |
| 0.3049 | 44.0 | 13728 | 0.5086 | 0.7329 |
| 0.2929 | 45.0 | 14040 | 0.4934 | 0.7329 |
| 0.2929 | 46.0 | 14352 | 0.5667 | 0.7401 |
| 0.293 | 47.0 | 14664 | 0.5047 | 0.7437 |
| 0.293 | 48.0 | 14976 | 0.5353 | 0.7292 |
| 0.291 | 49.0 | 15288 | 0.5280 | 0.7401 |
| 0.2817 | 50.0 | 15600 | 0.5142 | 0.7365 |
| 0.2817 | 51.0 | 15912 | 0.5141 | 0.7329 |
| 0.2822 | 52.0 | 16224 | 0.4990 | 0.7329 |
| 0.2758 | 53.0 | 16536 | 0.5074 | 0.7292 |
| 0.2758 | 54.0 | 16848 | 0.5147 | 0.7329 |
| 0.2763 | 55.0 | 17160 | 0.5138 | 0.7365 |
| 0.2763 | 56.0 | 17472 | 0.5291 | 0.7365 |
| 0.2782 | 57.0 | 17784 | 0.5204 | 0.7329 |
| 0.272 | 58.0 | 18096 | 0.5093 | 0.7365 |
| 0.272 | 59.0 | 18408 | 0.5217 | 0.7365 |
| 0.2758 | 60.0 | 18720 | 0.5208 | 0.7365 |
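
The headline loss and accuracy match the last row of this log (epoch 60), which is not the row with the highest accuracy. As a rough sketch of how a best epoch could be picked from such a log, the snippet below scans a handful of rows copied from the results above. The `LogRow` type and `best_by_accuracy` function are hypothetical names used only for illustration, and only a subset of rows is included to keep the example short.

```python
from typing import List, NamedTuple


class LogRow(NamedTuple):
    epoch: int
    step: int
    val_loss: float
    accuracy: float


# A subset of rows copied from the training results (not the full log).
ROWS: List[LogRow] = [
    LogRow(17, 5304, 0.4598, 0.7401),
    LogRow(21, 6552, 0.5125, 0.7473),
    LogRow(27, 8424, 0.4483, 0.7509),
    LogRow(34, 10608, 0.4861, 0.7437),
    LogRow(60, 18720, 0.5208, 0.7365),
]


def best_by_accuracy(rows: List[LogRow]) -> LogRow:
    """Return the row with the highest accuracy; ties go to the lower validation loss."""
    return max(rows, key=lambda r: (r.accuracy, -r.val_loss))


if __name__ == "__main__":
    best = best_by_accuracy(ROWS)
    final = ROWS[-1]
    print(f"best:  epoch {best.epoch}, accuracy {best.accuracy}, val loss {best.val_loss}")
    print(f"final: epoch {final.epoch}, accuracy {final.accuracy}, val loss {final.val_loss}")
```

On these rows the selection returns epoch 27 (accuracy 0.7509, validation loss 0.4483). Whether a checkpoint from that epoch was saved is not stated in the card.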

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
