20230820161846

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a minimal loading sketch follows the results):

  • Loss: 0.3402
  • Accuracy: 0.7401
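The card does not specify which SuperGLUE task the checkpoint was fine-tuned on, so the snippet below is only a minimal sketch: it assumes a standard sequence-classification head, and the sentence-pair input is illustrative rather than taken from the actual task format.

```python
# Minimal sketch: load the checkpoint and classify one example.
# ASSUMPTIONS: the model exposes a standard sequence-classification
# head; the sentence pair below is illustrative only, since the
# actual SuperGLUE task/input format is not documented in this card.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/20230820161846"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer(
    "Is the sky blue?",                   # hypothetical first segment
    "The sky appears blue in daylight.",  # hypothetical second segment
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```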

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
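These values map directly onto the Hugging Face TrainingArguments. The sketch below is not the author's original training script; the output_dir is a placeholder and the model/dataset setup is omitted.

```python
# Minimal sketch of the hyperparameters above expressed as
# TrainingArguments (transformers 4.30). Not the original script:
# output_dir is a placeholder and model/data setup is omitted.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230820161846",   # placeholder name
    learning_rate=0.005,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,                # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",   # matches the per-epoch log below
)
```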

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.3514          | 0.5560   |
| 0.6229        | 2.0   | 624   | 0.6273          | 0.5487   |
| 0.6229        | 3.0   | 936   | 0.8085          | 0.4729   |
| 0.5188        | 4.0   | 1248  | 0.6060          | 0.4729   |
| 0.4693        | 5.0   | 1560  | 0.3607          | 0.4946   |
| 0.4693        | 6.0   | 1872  | 0.3897          | 0.4801   |
| 0.4249        | 7.0   | 2184  | 0.5828          | 0.5271   |
| 0.4249        | 8.0   | 2496  | 0.4718          | 0.5307   |
| 0.4226        | 9.0   | 2808  | 0.5343          | 0.4729   |
| 0.42          | 10.0  | 3120  | 0.3478          | 0.5451   |
| 0.42          | 11.0  | 3432  | 0.4042          | 0.5271   |
| 0.407         | 12.0  | 3744  | 0.5783          | 0.4693   |
| 0.4156        | 13.0  | 4056  | 0.3466          | 0.5740   |
| 0.4156        | 14.0  | 4368  | 0.3720          | 0.5379   |
| 0.3784        | 15.0  | 4680  | 0.3414          | 0.6318   |
| 0.3784        | 16.0  | 4992  | 0.3330          | 0.6318   |
| 0.3734        | 17.0  | 5304  | 0.4631          | 0.5957   |
| 0.3573        | 18.0  | 5616  | 0.3375          | 0.5848   |
| 0.3573        | 19.0  | 5928  | 0.3429          | 0.6606   |
| 0.3516        | 20.0  | 6240  | 0.3344          | 0.6606   |
| 0.3399        | 21.0  | 6552  | 0.3671          | 0.6679   |
| 0.3399        | 22.0  | 6864  | 0.3485          | 0.6643   |
| 0.3345        | 23.0  | 7176  | 0.3416          | 0.6679   |
| 0.3345        | 24.0  | 7488  | 0.3263          | 0.6968   |
| 0.325         | 25.0  | 7800  | 0.3331          | 0.6895   |
| 0.3197        | 26.0  | 8112  | 0.3591          | 0.6787   |
| 0.3197        | 27.0  | 8424  | 0.3175          | 0.7292   |
| 0.3165        | 28.0  | 8736  | 0.3208          | 0.7148   |
| 0.3122        | 29.0  | 9048  | 0.3200          | 0.7292   |
| 0.3122        | 30.0  | 9360  | 0.3790          | 0.6570   |
| 0.3072        | 31.0  | 9672  | 0.3221          | 0.7112   |
| 0.3072        | 32.0  | 9984  | 0.3263          | 0.7365   |
| 0.3041        | 33.0  | 10296 | 0.3322          | 0.7292   |
| 0.2885        | 34.0  | 10608 | 0.3296          | 0.7365   |
| 0.2885        | 35.0  | 10920 | 0.3265          | 0.7220   |
| 0.2875        | 36.0  | 11232 | 0.3236          | 0.7509   |
| 0.2848        | 37.0  | 11544 | 0.3484          | 0.7112   |
| 0.2848        | 38.0  | 11856 | 0.3266          | 0.7365   |
| 0.2766        | 39.0  | 12168 | 0.3304          | 0.7473   |
| 0.2766        | 40.0  | 12480 | 0.3305          | 0.7401   |
| 0.2743        | 41.0  | 12792 | 0.3287          | 0.7545   |
| 0.2708        | 42.0  | 13104 | 0.3292          | 0.7365   |
| 0.2708        | 43.0  | 13416 | 0.3363          | 0.7256   |
| 0.2662        | 44.0  | 13728 | 0.3203          | 0.7329   |
| 0.2636        | 45.0  | 14040 | 0.3338          | 0.7401   |
| 0.2636        | 46.0  | 14352 | 0.3480          | 0.7365   |
| 0.261         | 47.0  | 14664 | 0.3282          | 0.7401   |
| 0.261         | 48.0  | 14976 | 0.3330          | 0.7329   |
| 0.2585        | 49.0  | 15288 | 0.3519          | 0.7292   |
| 0.2561        | 50.0  | 15600 | 0.3215          | 0.7473   |
| 0.2561        | 51.0  | 15912 | 0.3388          | 0.7401   |
| 0.2569        | 52.0  | 16224 | 0.3327          | 0.7365   |
| 0.2544        | 53.0  | 16536 | 0.3402          | 0.7401   |
| 0.2544        | 54.0  | 16848 | 0.3313          | 0.7437   |
| 0.2499        | 55.0  | 17160 | 0.3317          | 0.7401   |
| 0.2499        | 56.0  | 17472 | 0.3465          | 0.7329   |
| 0.2505        | 57.0  | 17784 | 0.3398          | 0.7437   |
| 0.2468        | 58.0  | 18096 | 0.3380          | 0.7437   |
| 0.2468        | 59.0  | 18408 | 0.3370          | 0.7437   |
| 0.2487        | 60.0  | 18720 | 0.3402          | 0.7401   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3