
20230823213602

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6021
  • Accuracy: 0.7076
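The reported accuracy is the fraction of evaluation examples whose predicted class (the argmax over the model's classification logits) matches the gold label. A minimal sketch of that computation, using hypothetical logits and labels rather than outputs from this model:

```python
import numpy as np

def accuracy(logits, labels):
    """Fraction of examples where the argmax prediction matches the gold label."""
    preds = np.argmax(logits, axis=-1)
    return float((preds == labels).mean())

# Toy example: 4 examples, 2 classes (hypothetical values).
logits = np.array([[0.2, 0.8], [1.5, 0.3], [0.1, 0.9], [2.0, 1.0]])
labels = np.array([1, 0, 0, 0])
print(accuracy(logits, labels))  # 3 of 4 predictions correct -> 0.75
```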

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
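The optimizer and linear schedule above can be sketched in plain PyTorch (a hypothetical stand-in model; this assumes the Trainer's default linear schedule with no warmup, which decays the learning rate to zero over all training steps):

```python
import torch

model = torch.nn.Linear(4, 2)  # hypothetical stand-in for the fine-tuned BERT classifier

# Optimizer as listed above: Adam with lr=0.001, betas=(0.9, 0.999), eps=1e-08.
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.999), eps=1e-08)

# lr_scheduler_type "linear": decay the learning rate linearly to 0 over all steps.
# 312 optimizer steps per epoch matches the Step column in the training results.
num_epochs, steps_per_epoch = 60, 312
total_steps = num_epochs * steps_per_epoch  # 18720, the final step in the results
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lambda step: max(0.0, 1.0 - step / total_steps)
)
```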

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.8868          | 0.4693   |
| 0.7276        | 2.0   | 624   | 0.8623          | 0.4729   |
| 0.7276        | 3.0   | 936   | 0.6658          | 0.5126   |
| 0.7221        | 4.0   | 1248  | 0.7170          | 0.4838   |
| 0.6968        | 5.0   | 1560  | 0.6691          | 0.5668   |
| 0.6968        | 6.0   | 1872  | 0.6427          | 0.5740   |
| 0.6561        | 7.0   | 2184  | 0.8715          | 0.5379   |
| 0.6561        | 8.0   | 2496  | 0.6609          | 0.5668   |
| 0.6562        | 9.0   | 2808  | 0.6147          | 0.5993   |
| 0.6578        | 10.0  | 3120  | 0.6103          | 0.6065   |
| 0.6578        | 11.0  | 3432  | 0.7649          | 0.4838   |
| 0.6252        | 12.0  | 3744  | 0.5990          | 0.6426   |
| 0.6084        | 13.0  | 4056  | 0.5962          | 0.6462   |
| 0.6084        | 14.0  | 4368  | 0.5738          | 0.6679   |
| 0.5841        | 15.0  | 4680  | 0.6292          | 0.6534   |
| 0.5841        | 16.0  | 4992  | 0.7218          | 0.6354   |
| 0.5715        | 17.0  | 5304  | 0.5832          | 0.6643   |
| 0.5619        | 18.0  | 5616  | 0.5680          | 0.6787   |
| 0.5619        | 19.0  | 5928  | 0.7152          | 0.5957   |
| 0.5641        | 20.0  | 6240  | 0.7627          | 0.6462   |
| 0.5432        | 21.0  | 6552  | 0.5672          | 0.6895   |
| 0.5432        | 22.0  | 6864  | 0.6023          | 0.6787   |
| 0.5586        | 23.0  | 7176  | 0.6581          | 0.6859   |
| 0.5586        | 24.0  | 7488  | 0.5614          | 0.6895   |
| 0.5254        | 25.0  | 7800  | 0.7315          | 0.6679   |
| 0.5267        | 26.0  | 8112  | 0.5316          | 0.7076   |
| 0.5267        | 27.0  | 8424  | 0.5391          | 0.7004   |
| 0.5189        | 28.0  | 8736  | 0.5935          | 0.7040   |
| 0.5172        | 29.0  | 9048  | 0.5977          | 0.7076   |
| 0.5172        | 30.0  | 9360  | 0.5918          | 0.7148   |
| 0.5069        | 31.0  | 9672  | 0.7130          | 0.6751   |
| 0.5069        | 32.0  | 9984  | 0.6718          | 0.6931   |
| 0.4976        | 33.0  | 10296 | 0.5982          | 0.7112   |
| 0.4895        | 34.0  | 10608 | 0.5927          | 0.7076   |
| 0.4895        | 35.0  | 10920 | 0.5583          | 0.7148   |
| 0.4916        | 36.0  | 11232 | 0.5706          | 0.7076   |
| 0.4867        | 37.0  | 11544 | 0.6064          | 0.7112   |
| 0.4867        | 38.0  | 11856 | 0.5939          | 0.7040   |
| 0.4914        | 39.0  | 12168 | 0.6528          | 0.7112   |
| 0.4914        | 40.0  | 12480 | 0.5773          | 0.7148   |
| 0.4733        | 41.0  | 12792 | 0.5853          | 0.7148   |
| 0.4796        | 42.0  | 13104 | 0.5876          | 0.7329   |
| 0.4796        | 43.0  | 13416 | 0.6521          | 0.7112   |
| 0.4706        | 44.0  | 13728 | 0.6386          | 0.7004   |
| 0.4655        | 45.0  | 14040 | 0.5846          | 0.7401   |
| 0.4655        | 46.0  | 14352 | 0.6645          | 0.7004   |
| 0.4654        | 47.0  | 14664 | 0.5831          | 0.7292   |
| 0.4654        | 48.0  | 14976 | 0.6665          | 0.7040   |
| 0.4567        | 49.0  | 15288 | 0.5760          | 0.7220   |
| 0.4563        | 50.0  | 15600 | 0.5796          | 0.7292   |
| 0.4563        | 51.0  | 15912 | 0.5656          | 0.7256   |
| 0.4471        | 52.0  | 16224 | 0.5585          | 0.7329   |
| 0.4484        | 53.0  | 16536 | 0.6286          | 0.7076   |
| 0.4484        | 54.0  | 16848 | 0.6116          | 0.7040   |
| 0.4424        | 55.0  | 17160 | 0.5852          | 0.7220   |
| 0.4424        | 56.0  | 17472 | 0.6008          | 0.7040   |
| 0.4439        | 57.0  | 17784 | 0.5777          | 0.7292   |
| 0.442         | 58.0  | 18096 | 0.5915          | 0.7184   |
| 0.442         | 59.0  | 18408 | 0.5930          | 0.7148   |
| 0.4411        | 60.0  | 18720 | 0.6021          | 0.7076   |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
