
20230820105148

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3319
  • Accuracy: 0.7292

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
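
As a rough illustration of what the linear scheduler implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps, which the card does not state, and takes the 312 steps per epoch from the Step column of the training results. The names `linear_lr`, `STEPS_PER_EPOCH`, and `WARMUP_STEPS` are illustrative and not part of the original training script.

```python
# Hyperparameters copied from the list above.
BASE_LR = 0.003
NUM_EPOCHS = 60
STEPS_PER_EPOCH = 312  # step 312 is logged at epoch 1.0 in the training results
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 18720, the final logged step

# Assumption: no warmup. The card does not list a warmup setting.
WARMUP_STEPS = 0


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at `step`: linear warmup (if any), then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the scheduled learning rate at a few epoch boundaries.
    for epoch in (0, 15, 30, 45, 60):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.6f}")
```

Under these assumptions the rate starts at 0.003, is halved by the end of epoch 30, and reaches zero at step 18720.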

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| No log | 1.0 | 312 | 0.4103 | 0.5271 |
| 0.4925 | 2.0 | 624 | 0.4974 | 0.5451 |
| 0.4925 | 3.0 | 936 | 0.3594 | 0.5704 |
| 0.4459 | 4.0 | 1248 | 0.4183 | 0.4693 |
| 0.44 | 5.0 | 1560 | 0.5487 | 0.5271 |
| 0.44 | 6.0 | 1872 | 0.3475 | 0.5379 |
| 0.4177 | 7.0 | 2184 | 0.6254 | 0.5271 |
| 0.4177 | 8.0 | 2496 | 0.3665 | 0.5884 |
| 0.3945 | 9.0 | 2808 | 0.4198 | 0.4982 |
| 0.4112 | 10.0 | 3120 | 0.3320 | 0.6823 |
| 0.4112 | 11.0 | 3432 | 0.3367 | 0.6173 |
| 0.359 | 12.0 | 3744 | 0.3249 | 0.6931 |
| 0.3421 | 13.0 | 4056 | 0.3311 | 0.6679 |
| 0.3421 | 14.0 | 4368 | 0.3228 | 0.6968 |
| 0.3351 | 15.0 | 4680 | 0.3210 | 0.7148 |
| 0.3351 | 16.0 | 4992 | 0.3376 | 0.6787 |
| 0.3289 | 17.0 | 5304 | 0.3285 | 0.6895 |
| 0.3761 | 18.0 | 5616 | 0.3637 | 0.4801 |
| 0.3761 | 19.0 | 5928 | 0.3538 | 0.5415 |
| 0.3983 | 20.0 | 6240 | 0.3642 | 0.5307 |
| 0.3472 | 21.0 | 6552 | 0.3444 | 0.6931 |
| 0.3472 | 22.0 | 6864 | 0.3312 | 0.7040 |
| 0.3194 | 23.0 | 7176 | 0.3450 | 0.6751 |
| 0.3194 | 24.0 | 7488 | 0.3325 | 0.6823 |
| 0.314 | 25.0 | 7800 | 0.3312 | 0.7220 |
| 0.3081 | 26.0 | 8112 | 0.3333 | 0.7040 |
| 0.3081 | 27.0 | 8424 | 0.3184 | 0.7184 |
| 0.3084 | 28.0 | 8736 | 0.3162 | 0.7112 |
| 0.3058 | 29.0 | 9048 | 0.3241 | 0.7184 |
| 0.3058 | 30.0 | 9360 | 0.3549 | 0.6751 |
| 0.3033 | 31.0 | 9672 | 0.3269 | 0.7184 |
| 0.3033 | 32.0 | 9984 | 0.3243 | 0.7004 |
| 0.3 | 33.0 | 10296 | 0.3370 | 0.7220 |
| 0.2906 | 34.0 | 10608 | 0.3198 | 0.7292 |
| 0.2906 | 35.0 | 10920 | 0.3237 | 0.7148 |
| 0.2934 | 36.0 | 11232 | 0.3207 | 0.7112 |
| 0.2921 | 37.0 | 11544 | 0.3450 | 0.7076 |
| 0.2921 | 38.0 | 11856 | 0.3338 | 0.7112 |
| 0.2873 | 39.0 | 12168 | 0.3207 | 0.7220 |
| 0.2873 | 40.0 | 12480 | 0.3233 | 0.7329 |
| 0.2861 | 41.0 | 12792 | 0.3212 | 0.7148 |
| 0.2852 | 42.0 | 13104 | 0.3255 | 0.7112 |
| 0.2852 | 43.0 | 13416 | 0.3353 | 0.7256 |
| 0.2787 | 44.0 | 13728 | 0.3332 | 0.7220 |
| 0.2796 | 45.0 | 14040 | 0.3427 | 0.7220 |
| 0.2796 | 46.0 | 14352 | 0.3407 | 0.7256 |
| 0.2759 | 47.0 | 14664 | 0.3203 | 0.7256 |
| 0.2759 | 48.0 | 14976 | 0.3333 | 0.7220 |
| 0.2761 | 49.0 | 15288 | 0.3283 | 0.7401 |
| 0.2734 | 50.0 | 15600 | 0.3187 | 0.7292 |
| 0.2734 | 51.0 | 15912 | 0.3298 | 0.7365 |
| 0.274 | 52.0 | 16224 | 0.3276 | 0.7401 |
| 0.2717 | 53.0 | 16536 | 0.3342 | 0.7292 |
| 0.2717 | 54.0 | 16848 | 0.3322 | 0.7292 |
| 0.2686 | 55.0 | 17160 | 0.3277 | 0.7329 |
| 0.2686 | 56.0 | 17472 | 0.3357 | 0.7292 |
| 0.2699 | 57.0 | 17784 | 0.3334 | 0.7365 |
| 0.2664 | 58.0 | 18096 | 0.3303 | 0.7292 |
| 0.2664 | 59.0 | 18408 | 0.3320 | 0.7292 |
| 0.2672 | 60.0 | 18720 | 0.3319 | 0.7292 |
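
The headline figures at the top of the card (loss 0.3319, accuracy 0.7292) match the final epoch, while several earlier epochs in the results above score higher accuracy or lower validation loss. The sketch below shows one way to pick an epoch from such a log, using a handful of rows excerpted from the results. The helper names are hypothetical, and the card does not say whether checkpoints from these epochs were saved.

```python
# (epoch, validation_loss, accuracy) rows excerpted from the training results.
ROWS = [
    (15, 0.3210, 0.7148),
    (28, 0.3162, 0.7112),
    (34, 0.3198, 0.7292),
    (49, 0.3283, 0.7401),
    (52, 0.3276, 0.7401),
    (60, 0.3319, 0.7292),
]


def best_by_accuracy(rows):
    """Row with the highest accuracy; ties go to the lower validation loss."""
    return max(rows, key=lambda r: (r[2], -r[1]))


def best_by_loss(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda r: r[1])


if __name__ == "__main__":
    epoch, loss, acc = best_by_accuracy(ROWS)
    print(f"best accuracy: epoch {epoch} (loss {loss:.4f}, accuracy {acc:.4f})")
    epoch, loss, acc = best_by_loss(ROWS)
    print(f"lowest loss:   epoch {epoch} (loss {loss:.4f}, accuracy {acc:.4f})")
```

The two criteria disagree here: accuracy peaks at epochs 49 and 52, while validation loss is lowest at epoch 28, so the choice of selection metric matters.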

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1
  • Datasets 2.14.4
  • Tokenizers 0.13.3