20230824064723

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. After the final training epoch (epoch 60), it achieves the following results on the evaluation set:

  • Loss: 0.6742
  • Accuracy: 0.7076

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
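
The linear schedule listed above can be sketched in plain Python. This is a minimal sketch, not the training code itself: it assumes no warmup steps (the card lists none) and a decay to zero at the last optimizer step, and it takes the 312 steps per epoch from the training results table. `linear_lr` is a hypothetical helper name, not a Transformers function.

```python
# Hyperparameters copied from the list above.
LEARNING_RATE = 0.003
NUM_EPOCHS = 60
# Taken from the training results table (step 312 at epoch 1.0).
STEPS_PER_EPOCH = 312
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 18720, the final step in the table


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate after `step` optimizer updates under linear warmup and decay.

    Assumption: warmup_steps defaults to 0 because the card lists no warmup setting.
    """
    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / warmup_steps
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the learning rate at the start of a few epochs.
    for epoch in (0, 15, 30, 45, 60):
        print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the learning rate is halved by the end of epoch 30 and reaches zero at step 18720.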

Training results

Training Loss  Epoch  Step   Validation Loss  Accuracy
No log         1.0    312    1.0968           0.5307
0.8903         2.0    624    0.9977           0.4729
0.8903         3.0    936    0.6500           0.5415
0.813          4.0    1248   0.8148           0.4729
0.7606         5.0    1560   0.6263           0.5993
0.7606         6.0    1872   0.7920           0.6245
0.7342         7.0    2184   1.2811           0.5884
0.7342         8.0    2496   0.5840           0.6462
0.6906         9.0    2808   0.5715           0.6751
0.6551         10.0   3120   0.5806           0.6859
0.6551         11.0   3432   0.5498           0.6823
0.6197         12.0   3744   0.6886           0.6968
0.5972         13.0   4056   1.1724           0.4477
0.5972         14.0   4368   0.6682           0.6101
0.7875         15.0   4680   0.6779           0.5560
0.7875         16.0   4992   0.9667           0.6354
0.6467         17.0   5304   0.9092           0.6606
0.5892         18.0   5616   0.6701           0.4621
0.5892         19.0   5928   0.6021           0.6643
0.6056         20.0   6240   0.8808           0.6787
0.5409         21.0   6552   0.5458           0.6751
0.5409         22.0   6864   0.5723           0.6859
0.5387         23.0   7176   0.9638           0.6679
0.5387         24.0   7488   0.7176           0.6968
0.511          25.0   7800   0.6557           0.6895
0.4744         26.0   8112   0.5338           0.7148
0.4744         27.0   8424   0.5646           0.7076
0.4743         28.0   8736   0.5423           0.7040
0.4598         29.0   9048   0.6324           0.7076
0.4598         30.0   9360   0.7069           0.7004
0.4485         31.0   9672   0.6809           0.6859
0.4485         32.0   9984   0.5675           0.7076
0.442          33.0   10296  0.8006           0.6895
0.4141         34.0   10608  0.5902           0.7112
0.4141         35.0   10920  0.6252           0.7148
0.4054         36.0   11232  0.8398           0.7112
0.3819         37.0   11544  0.7482           0.7004
0.3819         38.0   11856  0.6538           0.7112
0.3825         39.0   12168  0.7720           0.6968
0.3825         40.0   12480  0.6094           0.6931
0.379          41.0   12792  0.5863           0.7040
0.3701         42.0   13104  0.6197           0.7040
0.3701         43.0   13416  0.5795           0.7112
0.3576         44.0   13728  0.6484           0.7076
0.3454         45.0   14040  0.6623           0.6968
0.3454         46.0   14352  0.6562           0.7220
0.3455         47.0   14664  0.5921           0.7184
0.3455         48.0   14976  0.6980           0.7112
0.3344         49.0   15288  0.6210           0.7004
0.3285         50.0   15600  0.5674           0.7184
0.3285         51.0   15912  0.6134           0.7040
0.3295         52.0   16224  0.7118           0.7148
0.3181         53.0   16536  0.6978           0.7040
0.3181         54.0   16848  0.6851           0.7112
0.3021         55.0   17160  0.7702           0.7040
0.3021         56.0   17472  0.7319           0.7040
0.3044         57.0   17784  0.6459           0.7076
0.2938         58.0   18096  0.6386           0.7076
0.2938         59.0   18408  0.6550           0.7004
0.2991         60.0   18720  0.6742           0.7076
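
The headline metrics (loss 0.6742, accuracy 0.7076) match the epoch-60 row, which is not the best row in the table. The following sketch scans the per-epoch validation values copied from the table to locate the best epochs. `best_epoch` is a hypothetical helper written for this illustration. Whether checkpoints from those epochs were saved is not stated in the card.

```python
# Validation accuracy per epoch (epochs 1..60), copied from the table above.
ACCURACY = [
    0.5307, 0.4729, 0.5415, 0.4729, 0.5993, 0.6245, 0.5884, 0.6462, 0.6751, 0.6859,
    0.6823, 0.6968, 0.4477, 0.6101, 0.5560, 0.6354, 0.6606, 0.4621, 0.6643, 0.6787,
    0.6751, 0.6859, 0.6679, 0.6968, 0.6895, 0.7148, 0.7076, 0.7040, 0.7076, 0.7004,
    0.6859, 0.7076, 0.6895, 0.7112, 0.7148, 0.7112, 0.7004, 0.7112, 0.6968, 0.6931,
    0.7040, 0.7040, 0.7112, 0.7076, 0.6968, 0.7220, 0.7184, 0.7112, 0.7004, 0.7184,
    0.7040, 0.7148, 0.7040, 0.7112, 0.7040, 0.7040, 0.7076, 0.7076, 0.7004, 0.7076,
]

# Validation loss per epoch (epochs 1..60), copied from the table above.
VAL_LOSS = [
    1.0968, 0.9977, 0.6500, 0.8148, 0.6263, 0.7920, 1.2811, 0.5840, 0.5715, 0.5806,
    0.5498, 0.6886, 1.1724, 0.6682, 0.6779, 0.9667, 0.9092, 0.6701, 0.6021, 0.8808,
    0.5458, 0.5723, 0.9638, 0.7176, 0.6557, 0.5338, 0.5646, 0.5423, 0.6324, 0.7069,
    0.6809, 0.5675, 0.8006, 0.5902, 0.6252, 0.8398, 0.7482, 0.6538, 0.7720, 0.6094,
    0.5863, 0.6197, 0.5795, 0.6484, 0.6623, 0.6562, 0.5921, 0.6980, 0.6210, 0.5674,
    0.6134, 0.7118, 0.6978, 0.6851, 0.7702, 0.7319, 0.6459, 0.6386, 0.6550, 0.6742,
]


def best_epoch(values, higher_is_better):
    """Return (epoch, value) for the best entry; epochs are 1-indexed.

    On ties, the earliest epoch wins.
    """
    pick = max if higher_is_better else min
    best = pick(values)
    return values.index(best) + 1, best


if __name__ == "__main__":
    print("best accuracy:", best_epoch(ACCURACY, higher_is_better=True))
    print("lowest validation loss:", best_epoch(VAL_LOSS, higher_is_better=False))
    print("final epoch:", (len(ACCURACY), ACCURACY[-1], VAL_LOSS[-1]))
```

On these numbers, accuracy peaks at epoch 46 (0.7220) and validation loss is lowest at epoch 26 (0.5338), while training loss keeps falling through epoch 60.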

Framework versions

  • Transformers 4.26.1
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3

Dataset used to train dkqjrm/20230824064723: super_glue