20230817123430

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3373
  • Accuracy: 0.7437
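
The card does not state which SuperGLUE task the checkpoint was fine-tuned on, so the following is a minimal loading sketch, assuming a standard sequence-classification head; the repo id Onutoa/20230817123430 comes from this page, while the sentence-pair input and the label interpretation are illustrative placeholders.

```python
# Minimal inference sketch. Assumption: the checkpoint carries a
# sequence-classification head; the specific SuperGLUE task (and hence the
# meaning of each label) is not stated on this card.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "Onutoa/20230817123430"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

# Illustrative sentence-pair input (most SuperGLUE tasks pair two texts).
inputs = tokenizer(
    "The city councilmen refused the demonstrators a permit.",
    "They feared violence.",
    return_tensors="pt",
)
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)
print(probs)  # class probabilities; label semantics depend on the task
```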

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 0.003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
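
A hedged reconstruction of these settings as transformers.TrainingArguments (matching the Transformers 4.30.0 API listed below); the output path is a placeholder, and per-epoch evaluation is inferred from the results table rather than stated on the card:

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="out",
    learning_rate=3e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",  # assumption: metrics are logged once per epoch
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 corresponds to the
# adam_beta1 / adam_beta2 / adam_epsilon defaults, so they are not set here.
```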

Training results

Training Loss  Epoch  Step   Validation Loss  Accuracy
-------------  -----  -----  ---------------  --------
No log         1.0    312    0.3735           0.5054
0.5142         2.0    624    0.4848           0.5415
0.5142         3.0    936    0.3802           0.5379
0.4859         4.0    1248   0.4823           0.4729
0.4412         5.0    1560   0.3902           0.5379
0.4412         6.0    1872   0.3744           0.5596
0.4418         7.0    2184   0.4612           0.5487
0.4418         8.0    2496   0.4590           0.4729
0.4467         9.0    2808   0.4777           0.4729
0.4177         10.0   3120   0.3616           0.4838
0.4177         11.0   3432   0.3736           0.6245
0.3988         12.0   3744   0.3464           0.5993
0.3911         13.0   4056   0.3522           0.6282
0.3911         14.0   4368   0.3406           0.6859
0.3893         15.0   4680   0.4223           0.6570
0.3893         16.0   4992   0.6759           0.5415
0.38           17.0   5304   0.3631           0.6823
0.3772         18.0   5616   0.3434           0.6931
0.3772         19.0   5928   0.3344           0.6137
0.3639         20.0   6240   0.3670           0.6968
0.336          21.0   6552   0.3483           0.6895
0.336          22.0   6864   0.3485           0.7148
0.3369         23.0   7176   0.3541           0.7184
0.3369         24.0   7488   0.3346           0.7112
0.3291         25.0   7800   0.3387           0.7365
0.3228         26.0   8112   0.3492           0.7220
0.3228         27.0   8424   0.3334           0.7040
0.3206         28.0   8736   0.3388           0.7401
0.3189         29.0   9048   0.3304           0.7365
0.3189         30.0   9360   0.3566           0.7292
0.3148         31.0   9672   0.3370           0.7329
0.3148         32.0   9984   0.3328           0.7292
0.31           33.0   10296  0.3422           0.7437
0.306          34.0   10608  0.3339           0.7292
0.306          35.0   10920  0.3254           0.7292
0.3032         36.0   11232  0.3330           0.7473
0.3028         37.0   11544  0.3718           0.7184
0.3028         38.0   11856  0.3294           0.7473
0.3005         39.0   12168  0.3465           0.7329
0.3005         40.0   12480  0.3334           0.7292
0.2965         41.0   12792  0.3239           0.7256
0.2947         42.0   13104  0.3322           0.7329
0.2947         43.0   13416  0.3370           0.7401
0.2909         44.0   13728  0.3385           0.7473
0.2915         45.0   14040  0.3365           0.7329
0.2915         46.0   14352  0.3435           0.7365
0.29           47.0   14664  0.3301           0.7437
0.29           48.0   14976  0.3443           0.7401
0.2872         49.0   15288  0.3393           0.7437
0.2838         50.0   15600  0.3291           0.7437
0.2838         51.0   15912  0.3356           0.7401
0.2865         52.0   16224  0.3307           0.7365
0.2823         53.0   16536  0.3413           0.7401
0.2823         54.0   16848  0.3353           0.7437
0.28           55.0   17160  0.3315           0.7365
0.28           56.0   17472  0.3433           0.7365
0.2832         57.0   17784  0.3338           0.7401
0.2794         58.0   18096  0.3367           0.7401
0.2794         59.0   18408  0.3371           0.7401
0.2785         60.0   18720  0.3373           0.7437
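
The final-epoch row matches the headline evaluation results above. Per-epoch Accuracy values like these are the kind of metric produced by a compute_metrics callback passed to Trainer; a minimal sketch, assuming plain classification accuracy (the actual training script is not included on this card):

```python
import numpy as np

# Hypothetical accuracy callback of the kind that would produce the
# per-epoch Accuracy column above.
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"accuracy": float((predictions == labels).mean())}
```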

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1
  • Datasets 2.14.4
  • Tokenizers 0.13.3
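
For reproducibility, a local environment can be compared against these versions; a small sketch (printing rather than asserting, since local builds such as CUDA-tagged torch releases may carry version suffixes):

```python
import datasets
import tokenizers
import torch
import transformers

# Compare the installed versions against those reported above.
reported = {
    transformers: "4.30.0",
    torch: "2.0.1",
    datasets: "2.14.4",
    tokenizers: "0.13.3",
}
for module, version in reported.items():
    print(f"{module.__name__}: installed {module.__version__}, card reports {version}")
```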