20230821154607

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. After the final training epoch (epoch 60, step 18720), it achieves the following results on the evaluation set:

  • Loss: 0.3385
  • Accuracy: 0.7437

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.004
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
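
As a rough illustration of what these settings imply, the sketch below derives the steps per epoch and the learning rate at a few points in training. It uses the total step count (18720) reported in the results table and assumes a linear schedule with no warmup steps, a single device, and no gradient accumulation; none of those three are stated in this card, so treat the output as an approximation. The names `linear_lr` and `WARMUP_STEPS` are illustrative, not taken from the training script.

```python
# Values copied from the hyperparameter list and the final row of the results table.
LEARNING_RATE = 0.004
TRAIN_BATCH_SIZE = 8
NUM_EPOCHS = 60
TOTAL_STEPS = 18720  # step count reported at epoch 60

# Assumption: no warmup steps (none are listed in the card).
WARMUP_STEPS = 0


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS

# Upper bound on examples seen per epoch; the last batch of an epoch may be partial.
# Assumption: single device and no gradient accumulation (neither is listed in the card).
max_examples_per_epoch = steps_per_epoch * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch)
    print("max examples per epoch:", max_examples_per_epoch)
    for epoch in (0, 15, 30, 45, 60):
        step = epoch * steps_per_epoch
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.6f}")
```

Under these assumptions the learning rate starts at 0.004, reaches half that value at epoch 30, and decays to zero at step 18720.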

Training results

Training Loss   Epoch   Step    Validation Loss   Accuracy
No log          1.0     312     0.3899            0.5271
0.5615          2.0     624     0.3545            0.5596
0.5615          3.0     936     0.5571            0.4729
0.4381          4.0     1248    0.3457            0.5379
0.4338          5.0     1560    0.3504            0.5704
0.4338          6.0     1872    0.4047            0.5596
0.4327          7.0     2184    0.3446            0.6065
0.4327          8.0     2496    0.3317            0.6859
0.3696          9.0     2808    0.3344            0.6751
0.3503          10.0    3120    0.3280            0.7292
0.3503          11.0    3432    0.3260            0.6895
0.3459          12.0    3744    0.3253            0.7040
0.3338          13.0    4056    0.3294            0.6895
0.3338          14.0    4368    0.3428            0.6895
0.3271          15.0    4680    0.3216            0.6931
0.3271          16.0    4992    0.3505            0.6787
0.322           17.0    5304    0.3411            0.7148
0.3152          18.0    5616    0.3221            0.7004
0.3152          19.0    5928    0.3259            0.7292
0.3141          20.0    6240    0.3706            0.6570
0.3026          21.0    6552    0.3651            0.6895
0.3026          22.0    6864    0.3609            0.6895
0.3009          23.0    7176    0.3537            0.7076
0.3009          24.0    7488    0.3329            0.7401
0.2977          25.0    7800    0.3269            0.7329
0.2913          26.0    8112    0.3431            0.7292
0.2913          27.0    8424    0.3236            0.7256
0.2898          28.0    8736    0.3209            0.7184
0.2862          29.0    9048    0.3299            0.7329
0.2862          30.0    9360    0.3527            0.7329
0.2812          31.0    9672    0.3402            0.7256
0.2812          32.0    9984    0.3236            0.7437
0.2793          33.0    10296   0.3509            0.7581
0.2692          34.0    10608   0.3250            0.7509
0.2692          35.0    10920   0.3340            0.7473
0.2696          36.0    11232   0.3267            0.7401
0.2668          37.0    11544   0.3485            0.7437
0.2668          38.0    11856   0.3355            0.7509
0.2641          39.0    12168   0.3305            0.7473
0.2641          40.0    12480   0.3309            0.7437
0.2616          41.0    12792   0.3252            0.7509
0.2612          42.0    13104   0.3285            0.7545
0.2612          43.0    13416   0.3412            0.7545
0.2569          44.0    13728   0.3383            0.7437
0.2559          45.0    14040   0.3340            0.7437
0.2559          46.0    14352   0.3475            0.7401
0.2532          47.0    14664   0.3325            0.7401
0.2532          48.0    14976   0.3355            0.7473
0.2508          49.0    15288   0.3478            0.7401
0.2475          50.0    15600   0.3290            0.7365
0.2475          51.0    15912   0.3432            0.7401
0.2488          52.0    16224   0.3493            0.7329
0.2462          53.0    16536   0.3472            0.7437
0.2462          54.0    16848   0.3351            0.7401
0.2456          55.0    17160   0.3470            0.7401
0.2456          56.0    17472   0.3390            0.7401
0.2455          57.0    17784   0.3416            0.7401
0.2433          58.0    18096   0.3366            0.7437
0.2433          59.0    18408   0.3382            0.7437
0.2431          60.0    18720   0.3385            0.7437
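
The headline results are those of the final epoch, which is not necessarily the best one in the log. The sketch below transcribes the validation loss and accuracy columns from the results table above and looks up the epochs with the highest accuracy and the lowest validation loss. The helper names are illustrative, and whether checkpoints from those epochs were saved is not stated in this card.

```python
# (validation loss, accuracy) per epoch, transcribed from the results table.
# Index 0 corresponds to epoch 1.
EVAL_LOG = [
    (0.3899, 0.5271), (0.3545, 0.5596), (0.5571, 0.4729), (0.3457, 0.5379), (0.3504, 0.5704),
    (0.4047, 0.5596), (0.3446, 0.6065), (0.3317, 0.6859), (0.3344, 0.6751), (0.3280, 0.7292),
    (0.3260, 0.6895), (0.3253, 0.7040), (0.3294, 0.6895), (0.3428, 0.6895), (0.3216, 0.6931),
    (0.3505, 0.6787), (0.3411, 0.7148), (0.3221, 0.7004), (0.3259, 0.7292), (0.3706, 0.6570),
    (0.3651, 0.6895), (0.3609, 0.6895), (0.3537, 0.7076), (0.3329, 0.7401), (0.3269, 0.7329),
    (0.3431, 0.7292), (0.3236, 0.7256), (0.3209, 0.7184), (0.3299, 0.7329), (0.3527, 0.7329),
    (0.3402, 0.7256), (0.3236, 0.7437), (0.3509, 0.7581), (0.3250, 0.7509), (0.3340, 0.7473),
    (0.3267, 0.7401), (0.3485, 0.7437), (0.3355, 0.7509), (0.3305, 0.7473), (0.3309, 0.7437),
    (0.3252, 0.7509), (0.3285, 0.7545), (0.3412, 0.7545), (0.3383, 0.7437), (0.3340, 0.7437),
    (0.3475, 0.7401), (0.3325, 0.7401), (0.3355, 0.7473), (0.3478, 0.7401), (0.3290, 0.7365),
    (0.3432, 0.7401), (0.3493, 0.7329), (0.3472, 0.7437), (0.3351, 0.7401), (0.3470, 0.7401),
    (0.3390, 0.7401), (0.3416, 0.7401), (0.3366, 0.7437), (0.3382, 0.7437), (0.3385, 0.7437),
]


def best_epoch_by_accuracy(log):
    """Return the 1-based epoch with the highest accuracy (earliest epoch wins ties)."""
    index = max(range(len(log)), key=lambda i: log[i][1])
    return index + 1


def best_epoch_by_loss(log):
    """Return the 1-based epoch with the lowest validation loss (earliest epoch wins ties)."""
    index = min(range(len(log)), key=lambda i: log[i][0])
    return index + 1


if __name__ == "__main__":
    acc_epoch = best_epoch_by_accuracy(EVAL_LOG)
    loss_epoch = best_epoch_by_loss(EVAL_LOG)
    print("highest accuracy:", EVAL_LOG[acc_epoch - 1][1], "at epoch", acc_epoch)
    print("lowest validation loss:", EVAL_LOG[loss_epoch - 1][0], "at epoch", loss_epoch)
    print("final epoch:", EVAL_LOG[-1])
```

In this log the highest accuracy (0.7581) occurs at epoch 33 and the lowest validation loss (0.3209) at epoch 28, while the final epoch reports 0.7437 and 0.3385.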

Framework versions

  • Transformers 4.30.0
  • PyTorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
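
When trying to reproduce the run, it can help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library. It assumes the packages are installed under their usual distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) and ignores local build tags such as `+cu117`; the helper names are illustrative.

```python
from importlib import metadata

# Versions listed in this card. The PyTorch entry is 2.0.1+cu117; the local
# build tag ("+cu117") is dropped here so CPU or other CUDA builds still match.
EXPECTED = {
    "transformers": "4.30.0",
    "torch": "2.0.1",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def public_version(version: str) -> str:
    """Strip a local build tag, e.g. '2.0.1+cu117' -> '2.0.1'."""
    return version.split("+", 1)[0]


def version_mismatches(expected):
    """Map package name -> (wanted, installed) for every package that differs.

    A package that is not installed is reported with installed=None.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed is None or public_version(installed) != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in version_mismatches(EXPECTED).items():
        print(f"{name}: card lists {wanted}, found {installed}")
```

An empty result means the installed public versions match the card; a mismatch does not necessarily prevent loading the model.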

Dataset used to train Onutoa/20230821154607