20230817093322

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3504
  • Accuracy: 0.7256
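
The card ships without a usage example; below is a minimal loading sketch, assuming the checkpoint was saved with a sequence-classification head. The specific SuperGLUE task is not documented here, so the sentence-pair input is purely illustrative.

```python
# Minimal inference sketch. Assumption: the checkpoint carries a
# sequence-classification head; the exact SuperGLUE task is not stated
# in this card, so the sentence pair below is illustrative only.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Onutoa/20230817093322"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id).eval()

inputs = tokenizer("The cat sat on the mat.", "A cat is sitting on a mat.",
                   return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```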

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
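
The training script itself is not included with the card. As a hedged illustration, the listed settings map onto `transformers.TrainingArguments` roughly as follows; `output_dir` is hypothetical, and the per-epoch evaluation schedule is inferred from the results table below rather than stated in the card.

```python
# Hedged reconstruction of the listed hyperparameters as
# transformers.TrainingArguments (Transformers 4.30.0).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230817093322",   # hypothetical output directory
    learning_rate=3e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    num_train_epochs=60.0,
    lr_scheduler_type="linear",
    adam_beta1=0.9,                # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",   # inferred from the per-epoch results table
)
```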

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.5126          | 0.5235   |
| 0.5126        | 2.0   | 624   | 0.3824          | 0.4765   |
| 0.5126        | 3.0   | 936   | 0.3692          | 0.4910   |
| 0.4613        | 4.0   | 1248  | 0.3941          | 0.5343   |
| 0.446         | 5.0   | 1560  | 0.6773          | 0.5271   |
| 0.446         | 6.0   | 1872  | 0.5516          | 0.5271   |
| 0.4477        | 7.0   | 2184  | 0.3517          | 0.5199   |
| 0.4477        | 8.0   | 2496  | 0.3772          | 0.4910   |
| 0.4263        | 9.0   | 2808  | 0.3690          | 0.4838   |
| 0.4397        | 10.0  | 3120  | 0.3512          | 0.4838   |
| 0.4397        | 11.0  | 3432  | 0.4716          | 0.5379   |
| 0.4425        | 12.0  | 3744  | 0.3605          | 0.6570   |
| 0.4269        | 13.0  | 4056  | 0.3571          | 0.5379   |
| 0.4269        | 14.0  | 4368  | 0.3545          | 0.4838   |
| 0.3975        | 15.0  | 4680  | 0.3744          | 0.6498   |
| 0.3975        | 16.0  | 4992  | 0.3578          | 0.6606   |
| 0.3906        | 17.0  | 5304  | 0.3704          | 0.6931   |
| 0.3633        | 18.0  | 5616  | 0.3356          | 0.6065   |
| 0.3633        | 19.0  | 5928  | 0.3397          | 0.6065   |
| 0.3604        | 20.0  | 6240  | 0.3809          | 0.6931   |
| 0.3565        | 21.0  | 6552  | 0.3357          | 0.6787   |
| 0.3565        | 22.0  | 6864  | 0.3803          | 0.6209   |
| 0.3533        | 23.0  | 7176  | 0.3754          | 0.6751   |
| 0.3533        | 24.0  | 7488  | 0.3304          | 0.6354   |
| 0.3462        | 25.0  | 7800  | 0.3700          | 0.6968   |
| 0.3432        | 26.0  | 8112  | 0.3337          | 0.7148   |
| 0.3432        | 27.0  | 8424  | 0.3289          | 0.6968   |
| 0.3409        | 28.0  | 8736  | 0.3340          | 0.7148   |
| 0.3381        | 29.0  | 9048  | 0.3467          | 0.7220   |
| 0.3381        | 30.0  | 9360  | 0.3860          | 0.6823   |
| 0.337         | 31.0  | 9672  | 0.3795          | 0.6931   |
| 0.337         | 32.0  | 9984  | 0.3755          | 0.7184   |
| 0.334         | 33.0  | 10296 | 0.3529          | 0.7112   |
| 0.3321        | 34.0  | 10608 | 0.3389          | 0.7076   |
| 0.3321        | 35.0  | 10920 | 0.3260          | 0.7148   |
| 0.3315        | 36.0  | 11232 | 0.3519          | 0.7329   |
| 0.3317        | 37.0  | 11544 | 0.3741          | 0.6968   |
| 0.3317        | 38.0  | 11856 | 0.3364          | 0.7112   |
| 0.325         | 39.0  | 12168 | 0.3438          | 0.7256   |
| 0.325         | 40.0  | 12480 | 0.3462          | 0.7148   |
| 0.3282        | 41.0  | 12792 | 0.3344          | 0.7256   |
| 0.3251        | 42.0  | 13104 | 0.3280          | 0.7256   |
| 0.3251        | 43.0  | 13416 | 0.3544          | 0.7148   |
| 0.3223        | 44.0  | 13728 | 0.3488          | 0.7256   |
| 0.3215        | 45.0  | 14040 | 0.3437          | 0.7220   |
| 0.3215        | 46.0  | 14352 | 0.3430          | 0.7220   |
| 0.3205        | 47.0  | 14664 | 0.3394          | 0.7076   |
| 0.3205        | 48.0  | 14976 | 0.3676          | 0.7076   |
| 0.3163        | 49.0  | 15288 | 0.3487          | 0.7365   |
| 0.3154        | 50.0  | 15600 | 0.3387          | 0.7148   |
| 0.3154        | 51.0  | 15912 | 0.3448          | 0.7076   |
| 0.3164        | 52.0  | 16224 | 0.3361          | 0.7220   |
| 0.3153        | 53.0  | 16536 | 0.3676          | 0.7040   |
| 0.3153        | 54.0  | 16848 | 0.3463          | 0.7256   |
| 0.3145        | 55.0  | 17160 | 0.3491          | 0.7329   |
| 0.3145        | 56.0  | 17472 | 0.3599          | 0.7040   |
| 0.3151        | 57.0  | 17784 | 0.3457          | 0.7292   |
| 0.3103        | 58.0  | 18096 | 0.3489          | 0.7220   |
| 0.3103        | 59.0  | 18408 | 0.3481          | 0.7256   |
| 0.314         | 60.0  | 18720 | 0.3504          | 0.7256   |
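
The final row reproduces the headline numbers above (loss 0.3504, accuracy 0.7256). A hedged sketch of recomputing validation accuracy follows; the card does not name the SuperGLUE task, so the `rte` subset (whose roughly 2.5k training examples are consistent with 312 steps per epoch at batch size 8) is an assumption, not documentation.

```python
# Hedged evaluation sketch. The card names only super_glue; "rte" is a
# guess (312 steps/epoch * batch size 8 = 2496 examples, close to the
# RTE train split size), labeled as such.
import torch
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Onutoa/20230817093322"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id).eval()

val = load_dataset("super_glue", "rte", split="validation")
correct = 0
for ex in val:
    inputs = tokenizer(ex["premise"], ex["hypothesis"],
                       return_tensors="pt", truncation=True)
    with torch.no_grad():
        pred = model(**inputs).logits.argmax(dim=-1).item()
    correct += int(pred == ex["label"])
print(correct / len(val))  # near the reported 0.7256 only if the task guess is right
```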

Framework versions

  • Transformers 4.30.0
  • PyTorch 2.0.1
  • Datasets 2.14.4
  • Tokenizers 0.13.3
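
To pin an environment matching these versions, something like the following should work (assuming a plain pip install; the appropriate torch build depends on your CUDA setup):

```
pip install transformers==4.30.0 torch==2.0.1 datasets==2.14.4 tokenizers==0.13.3
```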