20230823213639

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set after the final training epoch (epoch 60, step 18720):

  • Loss: 1.3551
  • Accuracy: 0.7545

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
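As a rough illustration of what these settings imply, the learning-rate schedule can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the training script itself: the card reports no warmup, so `WARMUP_STEPS = 0` is assumed; the 312 steps per epoch are read from the training results table; and the decay is assumed to run linearly to zero over the whole run, as the linear scheduler in Transformers does. The names `linear_lr`, `BASE_LR`, and `TOTAL_STEPS` are illustrative.

```python
# Values taken from the model card, except WARMUP_STEPS, which is an
# assumption (the card does not report any warmup).
BASE_LR = 0.003
NUM_EPOCHS = 60
STEPS_PER_EPOCH = 312  # step count logged at epoch 1.0 in the results table
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 18720, the last logged step
WARMUP_STEPS = 0  # assumed


def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates (linear warmup, linear decay)."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Print the learning rate at a few epoch boundaries.
for epoch in (0, 15, 30, 45, 60):
    print(f"epoch {epoch:>2}: lr = {linear_lr(epoch * STEPS_PER_EPOCH):.6f}")
```

With a train batch size of 8, 312 steps per epoch also suggests at most 312 × 8 = 2496 training examples per epoch, assuming a single device and no gradient accumulation (neither is stated in the card).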

Training results

Training Loss  Epoch  Step   Validation Loss  Accuracy
No log         1.0    312    1.1031           0.5307
0.9187         2.0    624    0.7935           0.4874
0.9187         3.0    936    0.7082           0.5704
0.8508         4.0    1248   0.6713           0.6065
0.8272         5.0    1560   0.6997           0.6390
0.8272         6.0    1872   0.8815           0.6426
0.722          7.0    2184   1.0092           0.6318
0.722          8.0    2496   0.7370           0.6751
0.7377         9.0    2808   0.6362           0.7076
0.6952         10.0   3120   0.9842           0.6570
0.6952         11.0   3432   0.7133           0.7040
0.672          12.0   3744   0.7288           0.6823
0.6344         13.0   4056   0.7260           0.7220
0.6344         14.0   4368   0.6437           0.7112
0.6039         15.0   4680   0.7529           0.7184
0.6039         16.0   4992   1.0284           0.6787
0.5952         17.0   5304   0.8757           0.7256
0.5371         18.0   5616   0.6932           0.7329
0.5371         19.0   5928   0.7127           0.7148
0.5411         20.0   6240   1.0835           0.6823
0.4985         21.0   6552   0.9109           0.7292
0.4985         22.0   6864   1.4054           0.6643
0.4897         23.0   7176   1.0748           0.7112
0.4897         24.0   7488   1.1041           0.7256
0.4498         25.0   7800   1.0205           0.7040
0.4208         26.0   8112   1.0637           0.7148
0.4208         27.0   8424   0.8231           0.7329
0.4024         28.0   8736   0.7506           0.7401
0.4083         29.0   9048   1.1923           0.7184
0.4083         30.0   9360   1.2166           0.7184
0.3497         31.0   9672   1.2273           0.7220
0.3497         32.0   9984   0.9219           0.7437
0.3188         33.0   10296  1.1009           0.7401
0.2923         34.0   10608  0.8986           0.7545
0.2923         35.0   10920  1.2732           0.7509
0.2876         36.0   11232  1.0246           0.7437
0.2751         37.0   11544  1.0842           0.7545
0.2751         38.0   11856  1.3797           0.7401
0.2807         39.0   12168  1.2845           0.7401
0.2807         40.0   12480  1.0588           0.7473
0.2524         41.0   12792  1.3290           0.7365
0.2353         42.0   13104  1.1838           0.7509
0.2353         43.0   13416  1.6934           0.7292
0.2221         44.0   13728  1.4884           0.7437
0.222          45.0   14040  1.4472           0.7292
0.222          46.0   14352  1.6685           0.7365
0.2124         47.0   14664  1.2194           0.7545
0.2124         48.0   14976  1.4803           0.7437
0.1923         49.0   15288  1.3954           0.7509
0.1717         50.0   15600  1.4008           0.7401
0.1717         51.0   15912  1.2478           0.7545
0.1775         52.0   16224  1.2562           0.7545
0.1599         53.0   16536  1.4865           0.7545
0.1599         54.0   16848  1.3985           0.7473
0.1518         55.0   17160  1.3492           0.7437
0.1518         56.0   17472  1.3659           0.7437
0.1481         57.0   17784  1.2743           0.7545
0.1461         58.0   18096  1.3666           0.7509
0.1461         59.0   18408  1.3473           0.7509
0.1449         60.0   18720  1.3551           0.7545
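To make the table easier to reason about, the per-epoch validation numbers can be scanned for the best checkpoints. The sketch below is only an illustration: the tuples are copied from the table above, and the helper names `best_by_loss` and `best_by_accuracy` are hypothetical. It does not say which checkpoint was actually saved, since the card does not state that.

```python
# (epoch, validation loss, accuracy), copied from the training results table.
RESULTS = [
    (1, 1.1031, 0.5307), (2, 0.7935, 0.4874), (3, 0.7082, 0.5704), (4, 0.6713, 0.6065),
    (5, 0.6997, 0.6390), (6, 0.8815, 0.6426), (7, 1.0092, 0.6318), (8, 0.7370, 0.6751),
    (9, 0.6362, 0.7076), (10, 0.9842, 0.6570), (11, 0.7133, 0.7040), (12, 0.7288, 0.6823),
    (13, 0.7260, 0.7220), (14, 0.6437, 0.7112), (15, 0.7529, 0.7184), (16, 1.0284, 0.6787),
    (17, 0.8757, 0.7256), (18, 0.6932, 0.7329), (19, 0.7127, 0.7148), (20, 1.0835, 0.6823),
    (21, 0.9109, 0.7292), (22, 1.4054, 0.6643), (23, 1.0748, 0.7112), (24, 1.1041, 0.7256),
    (25, 1.0205, 0.7040), (26, 1.0637, 0.7148), (27, 0.8231, 0.7329), (28, 0.7506, 0.7401),
    (29, 1.1923, 0.7184), (30, 1.2166, 0.7184), (31, 1.2273, 0.7220), (32, 0.9219, 0.7437),
    (33, 1.1009, 0.7401), (34, 0.8986, 0.7545), (35, 1.2732, 0.7509), (36, 1.0246, 0.7437),
    (37, 1.0842, 0.7545), (38, 1.3797, 0.7401), (39, 1.2845, 0.7401), (40, 1.0588, 0.7473),
    (41, 1.3290, 0.7365), (42, 1.1838, 0.7509), (43, 1.6934, 0.7292), (44, 1.4884, 0.7437),
    (45, 1.4472, 0.7292), (46, 1.6685, 0.7365), (47, 1.2194, 0.7545), (48, 1.4803, 0.7437),
    (49, 1.3954, 0.7509), (50, 1.4008, 0.7401), (51, 1.2478, 0.7545), (52, 1.2562, 0.7545),
    (53, 1.4865, 0.7545), (54, 1.3985, 0.7473), (55, 1.3492, 0.7437), (56, 1.3659, 0.7437),
    (57, 1.2743, 0.7545), (58, 1.3666, 0.7509), (59, 1.3473, 0.7509), (60, 1.3551, 0.7545),
]


def best_by_loss(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


def best_by_accuracy(rows):
    """First row that reaches the highest accuracy (max keeps the earliest tie)."""
    return max(rows, key=lambda row: row[2])


print("lowest validation loss:", best_by_loss(RESULTS))
print("first peak accuracy:   ", best_by_accuracy(RESULTS))
print("final epoch:           ", RESULTS[-1])
```

On these numbers, validation loss is lowest at epoch 9 (0.6362), while accuracy first reaches its peak of 0.7545 at epoch 34 and ends at the same value. The validation loss rising while training loss keeps falling is consistent with overfitting in the later epochs, though the card itself draws no such conclusion.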

Framework versions

  • Transformers 4.26.1
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
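
When trying to reproduce the run, it can help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library; the distribution names in `EXPECTED` are assumptions (in particular, the PyTorch entry is assumed to correspond to the `torch` distribution), and `check_versions` is a hypothetical helper.

```python
from importlib import metadata

# Versions listed in the card. The distribution names are assumptions;
# the PyTorch entry is assumed to be installed as "torch".
EXPECTED = {
    "transformers": "4.26.1",
    "torch": "2.0.1+cu118",
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}


def check_versions(expected):
    """Return {name: (expected, installed or None, matches)} for each package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


for name, (wanted, installed, ok) in check_versions(EXPECTED).items():
    status = "ok" if ok else "differs"
    print(f"{name}: expected {wanted}, found {installed} ({status})")
```

A mismatch does not necessarily prevent loading the model; it only flags where the environment differs from the one recorded in the card.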

Dataset used to train dkqjrm/20230823213639