
1e-2_10_0.1

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a loading sketch follows the list):

  • Loss: 0.6265
  • Accuracy: 0.5126
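The checkpoint can be loaded with the standard transformers API. A minimal sketch, assuming the model is published on the Hugging Face Hub as Onutoa/1e-2_10_0.1 (the repository name on this card) and carries a binary sequence-classification head; the card does not say which SuperGLUE subtask was used, so the input format below is illustrative only:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Repository name taken from this card; adjust if the model lives elsewhere.
model_name = "Onutoa/1e-2_10_0.1"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# SuperGLUE tasks are mostly sentence-pair problems; the exact subtask is
# not stated on this card, so this pair input is a placeholder.
inputs = tokenizer("First sentence.", "Second sentence.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```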

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 0.01
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
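These values map directly onto the transformers TrainingArguments used by the Trainer API. A minimal sketch with the values copied from the list above (output_dir is a placeholder; the betas and epsilon listed for Adam match the Trainer's AdamW defaults):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1e-2_10_0.1",        # placeholder output directory
    learning_rate=1e-2,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,                  # as listed: betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
)
```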

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.6257          | 0.5307   |
| 1.3002        | 2.0   | 624   | 1.0407          | 0.5271   |
| 1.3002        | 3.0   | 936   | 1.4050          | 0.5271   |
| 1.0663        | 4.0   | 1248  | 0.9796          | 0.5271   |
| 1.0554        | 5.0   | 1560  | 1.4166          | 0.5271   |
| 1.0554        | 6.0   | 1872  | 0.9151          | 0.5271   |
| 0.8523        | 7.0   | 2184  | 0.8469          | 0.5271   |
| 0.8523        | 8.0   | 2496  | 0.8390          | 0.5271   |
| 0.8445        | 9.0   | 2808  | 0.7439          | 0.4729   |
| 0.8722        | 10.0  | 3120  | 0.6458          | 0.5343   |
| 0.8722        | 11.0  | 3432  | 0.7906          | 0.4729   |
| 0.8432        | 12.0  | 3744  | 0.6429          | 0.4946   |
| 0.7932        | 13.0  | 4056  | 0.6503          | 0.5307   |
| 0.7932        | 14.0  | 4368  | 0.7167          | 0.5271   |
| 0.7687        | 15.0  | 4680  | 0.6584          | 0.4765   |
| 0.7687        | 16.0  | 4992  | 0.6324          | 0.4874   |
| 0.7569        | 17.0  | 5304  | 0.7912          | 0.5271   |
| 0.7369        | 18.0  | 5616  | 0.7309          | 0.4729   |
| 0.7369        | 19.0  | 5928  | 0.6402          | 0.5126   |
| 0.7632        | 20.0  | 6240  | 0.7055          | 0.5271   |
| 0.7321        | 21.0  | 6552  | 0.6247          | 0.5271   |
| 0.7321        | 22.0  | 6864  | 0.7055          | 0.5271   |
| 0.7151        | 23.0  | 7176  | 0.6276          | 0.5343   |
| 0.7151        | 24.0  | 7488  | 0.6245          | 0.5271   |
| 0.7092        | 25.0  | 7800  | 0.6266          | 0.5126   |
| 0.7311        | 26.0  | 8112  | 0.6983          | 0.5271   |
| 0.7311        | 27.0  | 8424  | 0.6762          | 0.4729   |
| 0.7027        | 28.0  | 8736  | 0.6316          | 0.5018   |
| 0.7007        | 29.0  | 9048  | 0.6505          | 0.4729   |
| 0.7007        | 30.0  | 9360  | 0.7682          | 0.5271   |
| 0.6974        | 31.0  | 9672  | 0.6616          | 0.5271   |
| 0.6974        | 32.0  | 9984  | 0.6322          | 0.5271   |
| 0.6974        | 33.0  | 10296 | 0.6302          | 0.5271   |
| 0.6786        | 34.0  | 10608 | 0.6764          | 0.4729   |
| 0.6786        | 35.0  | 10920 | 0.6569          | 0.4729   |
| 0.692         | 36.0  | 11232 | 0.6584          | 0.4729   |
| 0.6814        | 37.0  | 11544 | 0.6636          | 0.5271   |
| 0.6814        | 38.0  | 11856 | 0.6477          | 0.4729   |
| 0.6767        | 39.0  | 12168 | 0.6294          | 0.5271   |
| 0.6767        | 40.0  | 12480 | 0.6487          | 0.4585   |
| 0.6762        | 41.0  | 12792 | 0.6301          | 0.5307   |
| 0.6682        | 42.0  | 13104 | 0.6252          | 0.5271   |
| 0.6682        | 43.0  | 13416 | 0.6249          | 0.5271   |
| 0.6738        | 44.0  | 13728 | 0.6334          | 0.5271   |
| 0.667         | 45.0  | 14040 | 0.6248          | 0.5271   |
| 0.667         | 46.0  | 14352 | 0.6390          | 0.5090   |
| 0.6633        | 47.0  | 14664 | 0.6622          | 0.4729   |
| 0.6633        | 48.0  | 14976 | 0.6267          | 0.4874   |
| 0.6573        | 49.0  | 15288 | 0.6256          | 0.5271   |
| 0.6559        | 50.0  | 15600 | 0.6306          | 0.4838   |
| 0.6559        | 51.0  | 15912 | 0.6412          | 0.4729   |
| 0.6455        | 52.0  | 16224 | 0.6634          | 0.4729   |
| 0.6484        | 53.0  | 16536 | 0.6247          | 0.5271   |
| 0.6484        | 54.0  | 16848 | 0.6267          | 0.5271   |
| 0.6417        | 55.0  | 17160 | 0.6295          | 0.4838   |
| 0.6417        | 56.0  | 17472 | 0.6256          | 0.5271   |
| 0.6395        | 57.0  | 17784 | 0.6268          | 0.4946   |
| 0.6418        | 58.0  | 18096 | 0.6267          | 0.4838   |
| 0.6418        | 59.0  | 18408 | 0.6260          | 0.5271   |
| 0.6373        | 60.0  | 18720 | 0.6265          | 0.5126   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
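To reproduce this environment, the versions above can be pinned directly. A minimal sketch; note that the +cu117 build suffix is dropped here, since CUDA-specific PyTorch wheels are normally installed from the PyTorch index rather than plain PyPI:

```
pip install transformers==4.30.0 torch==2.0.1 datasets==2.14.4 tokenizers==0.13.3
```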
