20230822235943

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9555
  • Accuracy: 0.7437

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
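
To make the linear schedule concrete, here is a minimal sketch of how the learning rate would decay over the run. It assumes zero warmup steps (the card lists no warmup setting) and 156 optimizer steps per epoch, which is the step count the training results report for epoch 1. The function name `linear_lr` and its parameters are illustrative, not part of any library.

```python
def linear_lr(step, base_lr=0.003, total_steps=60 * 156, warmup_steps=0):
    """Learning rate after `step` optimizer updates under linear warmup and decay.

    Assumption: warmup_steps=0, since the card does not list a warmup setting.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in (0, 15, 30, 45, 60):
        step = epoch * 156
        print(f"epoch {epoch:2d}  step {step:4d}  lr {linear_lr(step):.6f}")
```

Under these assumptions the rate starts at 0.003, is halved by the midpoint of training, and reaches zero at step 9360.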

Training results

Training Loss  Epoch  Step  Validation Loss  Accuracy
No log         1.0    156   0.8690           0.4729
No log         2.0    312   0.7262           0.5271
No log         3.0    468   0.7646           0.4693
0.8294         4.0    624   0.7044           0.5884
0.8294         5.0    780   0.7099           0.5884
0.8294         6.0    936   0.6449           0.6245
0.785          7.0    1092  0.7755           0.6245
0.785          8.0    1248  0.6443           0.6606
0.785          9.0    1404  0.6349           0.6859
0.6665         10.0   1560  0.9544           0.6462
0.6665         11.0   1716  0.6008           0.7184
0.6665         12.0   1872  0.6503           0.7076
0.6276         13.0   2028  0.6269           0.7076
0.6276         14.0   2184  0.5788           0.7148
0.6276         15.0   2340  0.6645           0.7076
0.6276         16.0   2496  0.9684           0.6426
0.587          17.0   2652  0.6227           0.7184
0.587          18.0   2808  0.6449           0.7076
0.587          19.0   2964  0.6651           0.7365
0.5287         20.0   3120  1.1324           0.6498
0.5287         21.0   3276  0.7391           0.6895
0.5287         22.0   3432  1.0194           0.6643
0.5035         23.0   3588  0.7838           0.7040
0.5035         24.0   3744  0.8647           0.7184
0.5035         25.0   3900  1.0974           0.6715
0.4533         26.0   4056  0.5861           0.7292
0.4533         27.0   4212  0.6685           0.7437
0.4533         28.0   4368  0.6998           0.7256
0.4398         29.0   4524  0.7596           0.7329
0.4398         30.0   4680  0.6967           0.7437
0.4398         31.0   4836  0.7041           0.7473
0.4398         32.0   4992  0.7617           0.7329
0.3837         33.0   5148  0.7991           0.7329
0.3837         34.0   5304  0.8229           0.7473
0.3837         35.0   5460  0.7745           0.7401
0.3471         36.0   5616  0.7787           0.7437
0.3471         37.0   5772  0.7991           0.7365
0.3471         38.0   5928  1.0206           0.7256
0.3303         39.0   6084  0.8977           0.7292
0.3303         40.0   6240  0.7327           0.7220
0.3303         41.0   6396  0.8102           0.7292
0.2991         42.0   6552  0.7347           0.7473
0.2991         43.0   6708  0.8677           0.7473
0.2991         44.0   6864  0.9774           0.7365
0.275          45.0   7020  0.8557           0.7581
0.275          46.0   7176  0.9789           0.7437
0.275          47.0   7332  1.0015           0.7437
0.275          48.0   7488  0.8450           0.7401
0.2596         49.0   7644  0.8222           0.7581
0.2596         50.0   7800  0.8968           0.7401
0.2596         51.0   7956  0.8584           0.7437
0.2469         52.0   8112  0.9157           0.7401
0.2469         53.0   8268  0.9732           0.7365
0.2469         54.0   8424  1.0671           0.7401
0.2303         55.0   8580  0.9512           0.7473
0.2303         56.0   8736  0.8708           0.7473
0.2303         57.0   8892  0.9290           0.7437
0.2275         58.0   9048  0.8866           0.7401
0.2275         59.0   9204  0.9366           0.7365
0.2275         60.0   9360  0.9555           0.7437
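
The loss and accuracy reported at the top of the card match the final epoch (60), which is not the strongest row in the table. The sketch below shows one way to pick out the best epochs from the logged results. The `rows` list is a hand-copied subset of the table that includes its extremes, and the helper names are illustrative only.

```python
# (epoch, validation_loss, accuracy) rows copied from the training results.
# This is a subset of the table, chosen to include its best and final rows.
rows = [
    (11, 0.6008, 0.7184),
    (14, 0.5788, 0.7148),
    (26, 0.5861, 0.7292),
    (45, 0.8557, 0.7581),
    (49, 0.8222, 0.7581),
    (60, 0.9555, 0.7437),
]


def best_by_accuracy(rows):
    """Highest accuracy; ties are broken by the lower validation loss."""
    return max(rows, key=lambda r: (r[2], -r[1]))


def best_by_loss(rows):
    """Lowest validation loss."""
    return min(rows, key=lambda r: r[1])


if __name__ == "__main__":
    print("best accuracy:", best_by_accuracy(rows))
    print("lowest validation loss:", best_by_loss(rows))
    print("final epoch:", rows[-1])
```

Accuracy peaks at 0.7581 (epochs 45 and 49) while validation loss is lowest at epoch 14 and rises afterwards as training loss keeps falling, so the choice of checkpoint depends on which metric matters more for the intended use.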

Framework versions

  • Transformers 4.26.1
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
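
When trying to reproduce the run, it can help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library. The `EXPECTED` mapping assumes the usual pip distribution names (`torch` for PyTorch), and the helper names are hypothetical.

```python
from importlib import metadata

# Versions listed in this card, keyed by assumed pip distribution name.
EXPECTED = {
    "transformers": "4.26.1",
    "torch": "2.0.1+cu118",
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}


def installed_versions(packages):
    """Return {name: version or None} for each distribution name."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None  # not installed in this environment
    return found


def mismatches(expected, installed):
    """Return {name: (expected, installed)} for every version that differs."""
    return {
        name: (want, installed.get(name))
        for name, want in expected.items()
        if installed.get(name) != want
    }


if __name__ == "__main__":
    diff = mismatches(EXPECTED, installed_versions(EXPECTED))
    for name, (want, have) in diff.items():
        print(f"{name}: card lists {want}, found {have}")
```

An exact string match is deliberately strict; a different CUDA build tag on PyTorch, for example, would be flagged even if the release number agrees.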