
20230824062849

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2256
  • Accuracy: 0.7473

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
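With the linear scheduler and no warmup configured, the learning rate decays from 0.003 to 0 over the run's 18720 optimizer steps (312 steps/epoch × 60 epochs, per the results table). A minimal sketch of that schedule, assuming the standard linear-with-warmup formula (the warmup parameter is shown for completeness; this run uses the zero-warmup default):

```python
def linear_lr(step, base_lr=0.003, total_steps=18720, warmup_steps=0):
    """Learning rate at a given optimizer step under linear decay,
    optionally preceded by a linear warmup phase."""
    if step < warmup_steps:
        # ramp up from 0 to base_lr during warmup
        return base_lr * step / max(1, warmup_steps)
    # then decay linearly from base_lr down to 0 at total_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

print(linear_lr(0))      # base rate at the start
print(linear_lr(9360))   # half the base rate at the midpoint
print(linear_lr(18720))  # zero at the end of training
```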

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|---------------|-------|-------|-----------------|----------|
| No log        | 1.0   | 312   | 1.2170          | 0.5307   |
| 0.9844        | 2.0   | 624   | 0.7365          | 0.5090   |
| 0.9844        | 3.0   | 936   | 0.6978          | 0.5632   |
| 0.8956        | 4.0   | 1248  | 0.8855          | 0.4765   |
| 0.8957        | 5.0   | 1560  | 1.0223          | 0.5379   |
| 0.8957        | 6.0   | 1872  | 0.6873          | 0.6137   |
| 0.7665        | 7.0   | 2184  | 0.8629          | 0.6173   |
| 0.7665        | 8.0   | 2496  | 0.6861          | 0.6570   |
| 0.734         | 9.0   | 2808  | 0.6714          | 0.7076   |
| 0.7238        | 10.0  | 3120  | 0.6298          | 0.7184   |
| 0.7238        | 11.0  | 3432  | 0.5975          | 0.7184   |
| 0.6786        | 12.0  | 3744  | 0.8311          | 0.6968   |
| 0.6396        | 13.0  | 4056  | 0.7136          | 0.6751   |
| 0.6396        | 14.0  | 4368  | 0.7183          | 0.6859   |
| 0.6481        | 15.0  | 4680  | 0.6652          | 0.7076   |
| 0.6481        | 16.0  | 4992  | 1.0367          | 0.6823   |
| 0.6106        | 17.0  | 5304  | 0.7197          | 0.6895   |
| 0.6011        | 18.0  | 5616  | 0.6058          | 0.7292   |
| 0.6011        | 19.0  | 5928  | 0.7227          | 0.7112   |
| 0.5978        | 20.0  | 6240  | 1.1472          | 0.6570   |
| 0.5309        | 21.0  | 6552  | 0.6741          | 0.7256   |
| 0.5309        | 22.0  | 6864  | 0.9335          | 0.6787   |
| 0.5392        | 23.0  | 7176  | 0.8296          | 0.7365   |
| 0.5392        | 24.0  | 7488  | 0.9097          | 0.7040   |
| 0.5058        | 25.0  | 7800  | 0.8278          | 0.7292   |
| 0.4669        | 26.0  | 8112  | 1.0859          | 0.6498   |
| 0.4669        | 27.0  | 8424  | 0.9387          | 0.7184   |
| 0.462         | 28.0  | 8736  | 1.0893          | 0.7365   |
| 0.4757        | 29.0  | 9048  | 1.3568          | 0.6859   |
| 0.4757        | 30.0  | 9360  | 1.0252          | 0.7040   |
| 0.4237        | 31.0  | 9672  | 1.0489          | 0.7329   |
| 0.4237        | 32.0  | 9984  | 0.8661          | 0.7292   |
| 0.4275        | 33.0  | 10296 | 0.9781          | 0.7437   |
| 0.3722        | 34.0  | 10608 | 0.8879          | 0.7329   |
| 0.3722        | 35.0  | 10920 | 0.9932          | 0.7292   |
| 0.3741        | 36.0  | 11232 | 1.0509          | 0.7365   |
| 0.3358        | 37.0  | 11544 | 1.3875          | 0.7329   |
| 0.3358        | 38.0  | 11856 | 1.2366          | 0.7220   |
| 0.3415        | 39.0  | 12168 | 1.0563          | 0.7329   |
| 0.3415        | 40.0  | 12480 | 0.9688          | 0.7401   |
| 0.3357        | 41.0  | 12792 | 0.8598          | 0.7329   |
| 0.3094        | 42.0  | 13104 | 1.0506          | 0.7329   |
| 0.3094        | 43.0  | 13416 | 1.3257          | 0.7365   |
| 0.2947        | 44.0  | 13728 | 1.1759          | 0.7365   |
| 0.2832        | 45.0  | 14040 | 1.1699          | 0.7329   |
| 0.2832        | 46.0  | 14352 | 1.1070          | 0.7401   |
| 0.2808        | 47.0  | 14664 | 1.1519          | 0.7473   |
| 0.2808        | 48.0  | 14976 | 1.0674          | 0.7401   |
| 0.2715        | 49.0  | 15288 | 1.1491          | 0.7401   |
| 0.252         | 50.0  | 15600 | 1.0819          | 0.7473   |
| 0.252         | 51.0  | 15912 | 0.9650          | 0.7473   |
| 0.2577        | 52.0  | 16224 | 1.0753          | 0.7437   |
| 0.2579        | 53.0  | 16536 | 1.0896          | 0.7473   |
| 0.2579        | 54.0  | 16848 | 1.0579          | 0.7401   |
| 0.2395        | 55.0  | 17160 | 1.1172          | 0.7509   |
| 0.2395        | 56.0  | 17472 | 1.1540          | 0.7509   |
| 0.2392        | 57.0  | 17784 | 1.2162          | 0.7509   |
| 0.22          | 58.0  | 18096 | 1.1978          | 0.7509   |
| 0.22          | 59.0  | 18408 | 1.2381          | 0.7473   |
| 0.2242        | 60.0  | 18720 | 1.2256          | 0.7473   |
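Note that validation loss bottoms out early (0.5975 at epoch 11) while accuracy peaks much later (0.7509 around epochs 55–58), a common overfitting pattern. Which checkpoint counts as "best" therefore depends on the selection metric; a small sketch using a few rows copied from the table above:

```python
# (epoch, validation_loss, accuracy) for selected rows of the table above
rows = [
    (11, 0.5975, 0.7184),
    (18, 0.6058, 0.7292),
    (47, 1.1519, 0.7473),
    (55, 1.1172, 0.7509),
    (60, 1.2256, 0.7473),
]

best_by_loss = min(rows, key=lambda r: r[1])  # lowest validation loss
best_by_acc = max(rows, key=lambda r: r[2])   # highest accuracy

print(best_by_loss[0])  # epoch 11
print(best_by_acc[0])   # epoch 55
```

The Trainer's `load_best_model_at_end`/`metric_for_best_model` options make this choice explicit when a metric other than the final checkpoint is wanted.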

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
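To reproduce this environment, the listed versions can be pinned directly (a sketch; package names are the standard PyPI ones, and the `+cu118` PyTorch build is assumed to come from the CUDA 11.8 wheel index):

```shell
pip install transformers==4.26.1 datasets==2.12.0 tokenizers==0.13.3
pip install torch==2.0.1 --index-url https://download.pytorch.org/whl/cu118
```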