
4e-3_10_0.1

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a minimal loading sketch follows the metrics):

  • Loss: 0.6572
  • Accuracy: 0.7545
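
The card does not include a usage snippet, so here is a minimal loading sketch. It assumes the checkpoint is published under the repository name shown on this page (Onutoa/4e-3_10_0.1) and that the fine-tuned head is a standard sequence classifier; the card does not name the specific super_glue task, so the sentence-pair input is illustrative only.

```python
# Minimal inference sketch. The repo id comes from this page; the
# sentence-pair input and the binary-label assumption are illustrative,
# since the card does not state which SuperGLUE task was used.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/4e-3_10_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("first sentence", "second sentence", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```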

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch reconstructing them follows the list):

  • learning_rate: 0.004
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
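
These values map directly onto transformers.TrainingArguments, as sketched below. The output_dir and evaluation_strategy values are assumptions (the card does not state them), and it is the Trainer's default optimizer (AdamW) that carries the listed betas and epsilon.

```python
# Hedged reconstruction of the listed hyperparameters; not the authors'
# actual training script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="4e-3_10_0.1",        # assumed; not stated on the card
    learning_rate=4e-3,
    per_device_train_batch_size=8,   # card's train_batch_size
    per_device_eval_batch_size=8,    # card's eval_batch_size
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",     # assumed from the per-epoch results table
)
```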

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 312 | 1.1789 | 0.5271 |
| 0.9223 | 2.0 | 624 | 0.8795 | 0.4729 |
| 0.9223 | 3.0 | 936 | 0.6489 | 0.5668 |
| 0.816 | 4.0 | 1248 | 0.6147 | 0.5632 |
| 0.8543 | 5.0 | 1560 | 0.6493 | 0.6534 |
| 0.8543 | 6.0 | 1872 | 0.9731 | 0.6137 |
| 0.7269 | 7.0 | 2184 | 0.9628 | 0.6029 |
| 0.7269 | 8.0 | 2496 | 0.7051 | 0.6751 |
| 0.6757 | 9.0 | 2808 | 0.6159 | 0.7184 |
| 0.649 | 10.0 | 3120 | 0.9342 | 0.5993 |
| 0.649 | 11.0 | 3432 | 0.6097 | 0.6931 |
| 0.6568 | 12.0 | 3744 | 0.6755 | 0.7004 |
| 0.5909 | 13.0 | 4056 | 0.6391 | 0.7004 |
| 0.5909 | 14.0 | 4368 | 0.6791 | 0.7329 |
| 0.543 | 15.0 | 4680 | 0.5279 | 0.7076 |
| 0.543 | 16.0 | 4992 | 0.6385 | 0.6787 |
| 0.4908 | 17.0 | 5304 | 0.7443 | 0.6931 |
| 0.4347 | 18.0 | 5616 | 0.5453 | 0.7365 |
| 0.4347 | 19.0 | 5928 | 0.5740 | 0.7401 |
| 0.4282 | 20.0 | 6240 | 0.7645 | 0.7256 |
| 0.3796 | 21.0 | 6552 | 0.6200 | 0.7329 |
| 0.3796 | 22.0 | 6864 | 0.5916 | 0.7509 |
| 0.3584 | 23.0 | 7176 | 0.6890 | 0.7545 |
| 0.3584 | 24.0 | 7488 | 0.6155 | 0.7329 |
| 0.3471 | 25.0 | 7800 | 0.6455 | 0.7473 |
| 0.3148 | 26.0 | 8112 | 0.6069 | 0.7545 |
| 0.3148 | 27.0 | 8424 | 0.6410 | 0.7401 |
| 0.317 | 28.0 | 8736 | 0.6373 | 0.7473 |
| 0.2959 | 29.0 | 9048 | 0.5946 | 0.7545 |
| 0.2959 | 30.0 | 9360 | 0.6236 | 0.7545 |
| 0.2748 | 31.0 | 9672 | 0.6449 | 0.7473 |
| 0.2748 | 32.0 | 9984 | 0.5963 | 0.7473 |
| 0.2687 | 33.0 | 10296 | 0.6619 | 0.7401 |
| 0.2561 | 34.0 | 10608 | 0.7464 | 0.7473 |
| 0.2561 | 35.0 | 10920 | 0.6339 | 0.7581 |
| 0.2478 | 36.0 | 11232 | 0.6020 | 0.7509 |
| 0.2426 | 37.0 | 11544 | 0.7438 | 0.7329 |
| 0.2426 | 38.0 | 11856 | 0.5934 | 0.7581 |
| 0.2339 | 39.0 | 12168 | 0.6048 | 0.7581 |
| 0.2339 | 40.0 | 12480 | 0.6533 | 0.7545 |
| 0.2252 | 41.0 | 12792 | 0.6122 | 0.7617 |
| 0.2179 | 42.0 | 13104 | 0.6366 | 0.7762 |
| 0.2179 | 43.0 | 13416 | 0.6808 | 0.7256 |
| 0.2232 | 44.0 | 13728 | 0.6474 | 0.7581 |
| 0.214 | 45.0 | 14040 | 0.6993 | 0.7545 |
| 0.214 | 46.0 | 14352 | 0.6351 | 0.7545 |
| 0.2085 | 47.0 | 14664 | 0.6343 | 0.7509 |
| 0.2085 | 48.0 | 14976 | 0.5988 | 0.7726 |
| 0.2059 | 49.0 | 15288 | 0.6607 | 0.7581 |
| 0.2084 | 50.0 | 15600 | 0.6370 | 0.7581 |
| 0.2084 | 51.0 | 15912 | 0.6143 | 0.7653 |
| 0.2018 | 52.0 | 16224 | 0.6106 | 0.7545 |
| 0.2032 | 53.0 | 16536 | 0.6739 | 0.7473 |
| 0.2032 | 54.0 | 16848 | 0.6540 | 0.7545 |
| 0.1993 | 55.0 | 17160 | 0.6367 | 0.7545 |
| 0.1993 | 56.0 | 17472 | 0.6510 | 0.7545 |
| 0.1964 | 57.0 | 17784 | 0.6427 | 0.7617 |
| 0.1877 | 58.0 | 18096 | 0.6658 | 0.7581 |
| 0.1877 | 59.0 | 18408 | 0.6553 | 0.7581 |
| 0.1895 | 60.0 | 18720 | 0.6572 | 0.7545 |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
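
As a quick sanity check that a local environment matches these pinned versions, the small sketch below simply prints what is installed (it assumes all four packages are importable):

```python
# Print installed versions to compare against the pinned ones above.
import datasets
import tokenizers
import torch
import transformers

for name, module in [
    ("Transformers", transformers),
    ("Pytorch", torch),
    ("Datasets", datasets),
    ("Tokenizers", tokenizers),
]:
    print(f"{name}: {module.__version__}")
```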