
20230822105337

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3531
  • Accuracy: 0.5271

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
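
With a linear scheduler, no warmup reported, and 312 optimizer steps per epoch (step 312 at epoch 1.0 in the results table), the learning rate decays from 0.05 to 0 over 60 × 312 = 18720 steps. A minimal sketch of that schedule in pure Python; the per-epoch step count and zero-warmup assumption are inferred from the table, not stated in the card:

```python
PEAK_LR = 0.05         # learning_rate from the hyperparameter list
STEPS_PER_EPOCH = 312  # inferred: step 312 logged at epoch 1.0
TOTAL_STEPS = 60 * STEPS_PER_EPOCH  # 18720, matching the final logged step

def linear_lr(step: int) -> float:
    """Linear decay from PEAK_LR at step 0 to 0 at TOTAL_STEPS (no warmup assumed)."""
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / TOTAL_STEPS
```

Under this reading, the rate is half the peak (0.025) at the training midpoint, step 9360.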

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|---------------|-------|-------|-----------------|----------|
| No log        | 1.0   | 312   | 0.3571          | 0.4729   |
| 6.8951        | 2.0   | 624   | 1.4465          | 0.5271   |
| 6.8951        | 3.0   | 936   | 1.4737          | 0.4729   |
| 4.4457        | 4.0   | 1248  | 0.4591          | 0.5271   |
| 3.2957        | 5.0   | 1560  | 0.4022          | 0.4729   |
| 3.2957        | 6.0   | 1872  | 1.2355          | 0.4729   |
| 3.7646        | 7.0   | 2184  | 9.3766          | 0.4729   |
| 3.7646        | 8.0   | 2496  | 0.3764          | 0.5271   |
| 3.3825        | 9.0   | 2808  | 4.6165          | 0.5271   |
| 2.7848        | 10.0  | 3120  | 3.2620          | 0.5271   |
| 2.7848        | 11.0  | 3432  | 2.3010          | 0.5271   |
| 2.3837        | 12.0  | 3744  | 0.3484          | 0.5271   |
| 2.1666        | 13.0  | 4056  | 0.4398          | 0.5271   |
| 2.1666        | 14.0  | 4368  | 1.4703          | 0.4729   |
| 2.107         | 15.0  | 4680  | 1.0550          | 0.5271   |
| 2.107         | 16.0  | 4992  | 1.0008          | 0.4729   |
| 2.161         | 17.0  | 5304  | 0.7810          | 0.4729   |
| 1.927         | 18.0  | 5616  | 0.8418          | 0.4729   |
| 1.927         | 19.0  | 5928  | 0.5166          | 0.4729   |
| 1.8072        | 20.0  | 6240  | 0.3493          | 0.5271   |
| 1.7187        | 21.0  | 6552  | 1.4221          | 0.5271   |
| 1.7187        | 22.0  | 6864  | 2.9356          | 0.5271   |
| 2.1333        | 23.0  | 7176  | 0.8474          | 0.4729   |
| 2.1333        | 24.0  | 7488  | 5.1220          | 0.4729   |
| 2.0017        | 25.0  | 7800  | 0.3589          | 0.4729   |
| 1.6518        | 26.0  | 8112  | 0.3996          | 0.4729   |
| 1.6518        | 27.0  | 8424  | 0.5351          | 0.5271   |
| 1.5012        | 28.0  | 8736  | 0.3479          | 0.5271   |
| 1.4194        | 29.0  | 9048  | 0.3492          | 0.5271   |
| 1.4194        | 30.0  | 9360  | 0.6942          | 0.5271   |
| 1.3048        | 31.0  | 9672  | 0.5089          | 0.5271   |
| 1.3048        | 32.0  | 9984  | 1.1509          | 0.5271   |
| 1.2972        | 33.0  | 10296 | 1.1207          | 0.4729   |
| 1.1774        | 34.0  | 10608 | 1.4443          | 0.4729   |
| 1.1774        | 35.0  | 10920 | 2.3753          | 0.4729   |
| 1.492         | 36.0  | 11232 | 0.3622          | 0.4729   |
| 1.3617        | 37.0  | 11544 | 1.3564          | 0.5271   |
| 1.3617        | 38.0  | 11856 | 0.6944          | 0.5271   |
| 1.4582        | 39.0  | 12168 | 0.5510          | 0.4729   |
| 1.4582        | 40.0  | 12480 | 0.3660          | 0.5271   |
| 1.0904        | 41.0  | 12792 | 0.3480          | 0.5271   |
| 0.9409        | 42.0  | 13104 | 0.4835          | 0.5271   |
| 0.9409        | 43.0  | 13416 | 0.6226          | 0.4729   |
| 0.9404        | 44.0  | 13728 | 0.4021          | 0.4729   |
| 0.8008        | 45.0  | 14040 | 0.5381          | 0.5271   |
| 0.8008        | 46.0  | 14352 | 0.3887          | 0.4729   |
| 0.841         | 47.0  | 14664 | 0.3763          | 0.5271   |
| 0.841         | 48.0  | 14976 | 0.3667          | 0.5271   |
| 0.6912        | 49.0  | 15288 | 0.4490          | 0.4729   |
| 0.6381        | 50.0  | 15600 | 0.7097          | 0.5271   |
| 0.6381        | 51.0  | 15912 | 0.3639          | 0.4729   |
| 0.5792        | 52.0  | 16224 | 0.3798          | 0.5271   |
| 0.53          | 53.0  | 16536 | 0.3854          | 0.4729   |
| 0.53          | 54.0  | 16848 | 0.3884          | 0.4729   |
| 0.4977        | 55.0  | 17160 | 0.3898          | 0.4729   |
| 0.4977        | 56.0  | 17472 | 0.3480          | 0.5271   |
| 0.4596        | 57.0  | 17784 | 0.3542          | 0.4729   |
| 0.4228        | 58.0  | 18096 | 0.3539          | 0.5271   |
| 0.4228        | 59.0  | 18408 | 0.3499          | 0.5271   |
| 0.3933        | 60.0  | 18720 | 0.3531          | 0.5271   |
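
The two accuracy values in the table are exact complements (0.5271 + 0.4729 = 1.0). On a binary SuperGLUE task this is consistent with the model predicting a single class and flipping which class it predicts between epochs, rather than learning the task; the unusually high learning rate of 0.05 for BERT fine-tuning would be one plausible cause. This reading is an inference from the numbers, not something the card states. A quick arithmetic check:

```python
acc_high, acc_low = 0.5271, 0.4729

# If every prediction falls in one class of a binary task, accuracy equals
# that class's frequency, so the two observed values must sum to 1.
complementary = abs((acc_high + acc_low) - 1.0) < 1e-9
```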

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
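
To reproduce this environment, the listed versions can be pinned directly. One possible setup, assuming the CUDA 11.8 PyTorch build is pulled from the standard PyTorch wheel index (the index URL is not stated in the card):

```shell
pip install transformers==4.26.1 datasets==2.12.0 tokenizers==0.13.3
pip install torch==2.0.1 --index-url https://download.pytorch.org/whl/cu118
```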
