20230822125408

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3480
  • Accuracy: 0.5271
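
The card does not state which SuperGLUE subtask the model was evaluated on. As a minimal inference sketch, assuming the published checkpoint at dkqjrm/20230822125408 carries a binary sequence-classification head (the two-class accuracy above suggests this), it can be loaded and queried as follows; the input strings are purely illustrative:

```python
# Minimal inference sketch. Assumes the checkpoint at dkqjrm/20230822125408
# exposes a sequence-classification head; the SuperGLUE subtask (and hence
# the exact input format) is not stated in this card.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230822125408"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Most SuperGLUE tasks take a text pair; the strings here are placeholders.
inputs = tokenizer("first segment", "second segment", return_tensors="pt")
predicted_class = model(**inputs).logits.argmax(dim=-1).item()
print(predicted_class)
```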

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 0.005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
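
A minimal reproduction sketch with the Transformers Trainer, mapping the hyperparameters above onto TrainingArguments. The SuperGLUE subtask is not stated in this card, so the "boolq" config and its question/passage fields are assumptions:

```python
# Reproduction sketch for the hyperparameters listed above.
# Assumption: the SuperGLUE subtask is "boolq"; the card does not say.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-large-cased")
model = AutoModelForSequenceClassification.from_pretrained("bert-large-cased")

dataset = load_dataset("super_glue", "boolq")  # subtask is an assumption
encoded = dataset.map(
    lambda batch: tokenizer(batch["question"], batch["passage"], truncation=True),
    batched=True,
)

args = TrainingArguments(
    output_dir="out",
    learning_rate=5e-3,  # 0.005
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
    tokenizer=tokenizer,  # default collator pads batches dynamically
)
trainer.train()
```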

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.3715          | 0.4729   |
| 0.6635        | 2.0   | 624   | 0.4551          | 0.5271   |
| 0.6635        | 3.0   | 936   | 0.4100          | 0.4729   |
| 0.6659        | 4.0   | 1248  | 0.5179          | 0.4729   |
| 0.5379        | 5.0   | 1560  | 0.4588          | 0.5271   |
| 0.5379        | 6.0   | 1872  | 0.3934          | 0.5271   |
| 0.4954        | 7.0   | 2184  | 0.4644          | 0.5271   |
| 0.4954        | 8.0   | 2496  | 0.6469          | 0.5271   |
| 0.4707        | 9.0   | 2808  | 0.3908          | 0.5271   |
| 0.4825        | 10.0  | 3120  | 0.4247          | 0.4729   |
| 0.4825        | 11.0  | 3432  | 0.3479          | 0.5271   |
| 0.4683        | 12.0  | 3744  | 0.3917          | 0.4729   |
| 0.4456        | 13.0  | 4056  | 0.3580          | 0.5271   |
| 0.4456        | 14.0  | 4368  | 0.3641          | 0.4729   |
| 0.4571        | 15.0  | 4680  | 0.3922          | 0.4729   |
| 0.4571        | 16.0  | 4992  | 0.3587          | 0.5271   |
| 0.434         | 17.0  | 5304  | 0.3769          | 0.4729   |
| 0.4707        | 18.0  | 5616  | 0.3520          | 0.5271   |
| 0.4707        | 19.0  | 5928  | 0.3489          | 0.5271   |
| 0.4863        | 20.0  | 6240  | 0.3593          | 0.5271   |
| 0.4673        | 21.0  | 6552  | 0.8486          | 0.5271   |
| 0.4673        | 22.0  | 6864  | 0.3714          | 0.5271   |
| 0.4746        | 23.0  | 7176  | 0.3496          | 0.5271   |
| 0.4746        | 24.0  | 7488  | 0.3694          | 0.4729   |
| 0.4365        | 25.0  | 7800  | 0.3542          | 0.5271   |
| 0.4254        | 26.0  | 8112  | 0.4693          | 0.5271   |
| 0.4254        | 27.0  | 8424  | 0.3827          | 0.4729   |
| 0.4293        | 28.0  | 8736  | 0.3866          | 0.4729   |
| 0.4221        | 29.0  | 9048  | 0.3484          | 0.5271   |
| 0.4221        | 30.0  | 9360  | 0.4155          | 0.5271   |
| 0.4128        | 31.0  | 9672  | 0.3497          | 0.5271   |
| 0.4128        | 32.0  | 9984  | 0.3560          | 0.4729   |
| 0.4064        | 33.0  | 10296 | 0.4237          | 0.5271   |
| 0.4039        | 34.0  | 10608 | 0.3890          | 0.4729   |
| 0.4039        | 35.0  | 10920 | 0.3478          | 0.5271   |
| 0.4026        | 36.0  | 11232 | 0.3497          | 0.5271   |
| 0.4037        | 37.0  | 11544 | 0.3748          | 0.5271   |
| 0.4037        | 38.0  | 11856 | 0.3533          | 0.5271   |
| 0.3933        | 39.0  | 12168 | 0.3547          | 0.4729   |
| 0.3933        | 40.0  | 12480 | 0.3565          | 0.4729   |
| 0.3935        | 41.0  | 12792 | 0.3601          | 0.4729   |
| 0.3896        | 42.0  | 13104 | 0.3571          | 0.4729   |
| 0.3896        | 43.0  | 13416 | 0.3490          | 0.5271   |
| 0.3841        | 44.0  | 13728 | 0.3499          | 0.5271   |
| 0.3836        | 45.0  | 14040 | 0.3624          | 0.5271   |
| 0.3836        | 46.0  | 14352 | 0.3484          | 0.5271   |
| 0.3785        | 47.0  | 14664 | 0.3582          | 0.4729   |
| 0.3785        | 48.0  | 14976 | 0.3541          | 0.4729   |
| 0.3775        | 49.0  | 15288 | 0.3500          | 0.5271   |
| 0.3727        | 50.0  | 15600 | 0.3544          | 0.4729   |
| 0.3727        | 51.0  | 15912 | 0.3481          | 0.5271   |
| 0.3713        | 52.0  | 16224 | 0.3600          | 0.4729   |
| 0.3694        | 53.0  | 16536 | 0.3494          | 0.5271   |
| 0.3694        | 54.0  | 16848 | 0.3502          | 0.5271   |
| 0.3664        | 55.0  | 17160 | 0.3482          | 0.5271   |
| 0.3664        | 56.0  | 17472 | 0.3482          | 0.5271   |
| 0.3636        | 57.0  | 17784 | 0.3480          | 0.5271   |
| 0.3612        | 58.0  | 18096 | 0.3478          | 0.5271   |
| 0.3612        | 59.0  | 18408 | 0.3480          | 0.5271   |
| 0.3589        | 60.0  | 18720 | 0.3480          | 0.5271   |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
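
To match this environment, the pinned versions above can be verified at runtime; a minimal check, assuming the packages are importable under these names:

```python
# Minimal environment check against the versions pinned above.
import datasets
import tokenizers
import torch
import transformers

assert transformers.__version__ == "4.26.1"
assert torch.__version__.startswith("2.0.1")  # "+cu118" suffix on CUDA builds
assert datasets.__version__ == "2.12.0"
assert tokenizers.__version__ == "0.13.3"
```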