
5e-3_10_0.1

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 0.6700
  • Accuracy: 0.7365
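
The snippet below is a minimal inference sketch, not an official usage example: it assumes the checkpoint exposes a standard sequence-classification head under the Onutoa/5e-3_10_0.1 repository id, and the sentence pair is purely illustrative, since the exact SuperGLUE subset is not documented in this card.

```python
# Minimal inference sketch (assumptions: sequence-classification head,
# sentence-pair input; the SuperGLUE subset is not documented here).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/5e-3_10_0.1"  # repository id from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer(
    "The cat sat on the mat.",     # first sentence (illustrative)
    "A cat is resting on a mat.",  # second sentence (illustrative)
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label id
```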

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
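
Since the specific SuperGLUE subset is not stated in this card, the sketch below only shows how the super_glue dataset is generally loaded with the datasets library; the "rte" configuration name is a placeholder, not a documented fact about this model.

```python
# Dataset-loading sketch (the "rte" subset is a placeholder; the actual
# SuperGLUE configuration used for this model is not documented).
from datasets import load_dataset

dataset = load_dataset("super_glue", "rte")  # placeholder configuration
print(dataset["train"][0])                   # inspect one training example
```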

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
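
As a rough reproduction aid, the sketch below maps the hyperparameters above onto transformers.TrainingArguments (version 4.30.0, per the framework list at the end of this card). The output_dir and evaluation_strategy values are assumptions; the original training script is not included here.

```python
# TrainingArguments sketch matching the hyperparameters listed above.
# output_dir and evaluation_strategy are assumptions, not documented values.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="5e-3_10_0.1",        # assumed; matches the model name
    learning_rate=5e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",     # assumed from the per-epoch log below
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the Transformers
# default optimizer settings, so no extra optimizer arguments are required.
```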

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.9081          | 0.5271   |
| 0.937         | 2.0   | 624   | 0.6140          | 0.5704   |
| 0.937         | 3.0   | 936   | 0.8444          | 0.4729   |
| 0.8284        | 4.0   | 1248  | 0.7307          | 0.6245   |
| 0.8066        | 5.0   | 1560  | 1.2493          | 0.5487   |
| 0.8066        | 6.0   | 1872  | 0.6752          | 0.6643   |
| 0.7461        | 7.0   | 2184  | 0.8410          | 0.6282   |
| 0.7461        | 8.0   | 2496  | 0.7924          | 0.6390   |
| 0.6874        | 9.0   | 2808  | 0.6100          | 0.7184   |
| 0.67          | 10.0  | 3120  | 0.7658          | 0.6895   |
| 0.67          | 11.0  | 3432  | 0.8649          | 0.6426   |
| 0.6374        | 12.0  | 3744  | 0.5784          | 0.7545   |
| 0.5735        | 13.0  | 4056  | 0.5793          | 0.7292   |
| 0.5735        | 14.0  | 4368  | 0.6332          | 0.7437   |
| 0.4712        | 15.0  | 4680  | 0.5207          | 0.7581   |
| 0.4712        | 16.0  | 4992  | 0.5339          | 0.7292   |
| 0.4258        | 17.0  | 5304  | 0.7625          | 0.7220   |
| 0.3712        | 18.0  | 5616  | 0.5492          | 0.7365   |
| 0.3712        | 19.0  | 5928  | 0.5661          | 0.7437   |
| 0.3656        | 20.0  | 6240  | 0.7445          | 0.7184   |
| 0.327         | 21.0  | 6552  | 0.5874          | 0.7437   |
| 0.327         | 22.0  | 6864  | 0.6301          | 0.7365   |
| 0.3015        | 23.0  | 7176  | 0.6740          | 0.7148   |
| 0.3015        | 24.0  | 7488  | 0.6599          | 0.7220   |
| 0.2929        | 25.0  | 7800  | 0.6697          | 0.7292   |
| 0.2609        | 26.0  | 8112  | 0.6871          | 0.7256   |
| 0.2609        | 27.0  | 8424  | 0.6303          | 0.7220   |
| 0.2581        | 28.0  | 8736  | 0.6768          | 0.7040   |
| 0.2504        | 29.0  | 9048  | 0.6986          | 0.7148   |
| 0.2504        | 30.0  | 9360  | 0.6783          | 0.7148   |
| 0.2313        | 31.0  | 9672  | 0.7120          | 0.7076   |
| 0.2313        | 32.0  | 9984  | 0.6227          | 0.7148   |
| 0.2209        | 33.0  | 10296 | 0.6961          | 0.7220   |
| 0.2141        | 34.0  | 10608 | 0.6817          | 0.7220   |
| 0.2141        | 35.0  | 10920 | 0.6810          | 0.7256   |
| 0.2129        | 36.0  | 11232 | 0.6567          | 0.7292   |
| 0.2053        | 37.0  | 11544 | 0.7469          | 0.7329   |
| 0.2053        | 38.0  | 11856 | 0.6684          | 0.7329   |
| 0.2014        | 39.0  | 12168 | 0.6540          | 0.7329   |
| 0.2014        | 40.0  | 12480 | 0.6679          | 0.7437   |
| 0.2012        | 41.0  | 12792 | 0.6582          | 0.7292   |
| 0.1957        | 42.0  | 13104 | 0.6635          | 0.7292   |
| 0.1957        | 43.0  | 13416 | 0.6715          | 0.7401   |
| 0.1903        | 44.0  | 13728 | 0.6628          | 0.7329   |
| 0.1861        | 45.0  | 14040 | 0.6674          | 0.7329   |
| 0.1861        | 46.0  | 14352 | 0.7008          | 0.7220   |
| 0.1858        | 47.0  | 14664 | 0.6371          | 0.7401   |
| 0.1858        | 48.0  | 14976 | 0.6630          | 0.7437   |
| 0.1852        | 49.0  | 15288 | 0.6353          | 0.7365   |
| 0.1868        | 50.0  | 15600 | 0.7010          | 0.7401   |
| 0.1868        | 51.0  | 15912 | 0.6572          | 0.7365   |
| 0.1813        | 52.0  | 16224 | 0.6531          | 0.7401   |
| 0.1807        | 53.0  | 16536 | 0.6413          | 0.7437   |
| 0.1807        | 54.0  | 16848 | 0.6605          | 0.7473   |
| 0.1792        | 55.0  | 17160 | 0.6498          | 0.7437   |
| 0.1792        | 56.0  | 17472 | 0.6865          | 0.7437   |
| 0.1764        | 57.0  | 17784 | 0.6660          | 0.7365   |
| 0.1726        | 58.0  | 18096 | 0.6829          | 0.7473   |
| 0.1726        | 59.0  | 18408 | 0.6730          | 0.7437   |
| 0.1761        | 60.0  | 18720 | 0.6700          | 0.7365   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3