20230822155557

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a minimal usage sketch follows the results):

  • Loss: 0.3488
  • Accuracy: 0.5307
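
Since the card does not include a usage snippet, here is a minimal inference sketch. It assumes the checkpoint loads as a sequence-classification model and that the underlying SuperGLUE task takes a sentence pair; the example inputs and the predicted label index are illustrative, as the task and label mapping are not documented here.

```python
# Minimal inference sketch. Assumptions (not stated in this card): the repo id
# "dkqjrm/20230822155557" exposes a sequence-classification head, and the task
# takes a sentence pair; inputs and labels below are illustrative only.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "dkqjrm/20230822155557"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

inputs = tokenizer("An example premise.", "An example hypothesis.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```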

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch in code follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
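
The training script itself is not included in the card; the sketch below maps the listed values onto transformers.TrainingArguments (valid for the Transformers 4.26.x pinned under "Framework versions"). The output directory and the per-epoch evaluation cadence are assumptions; data preprocessing and metric computation are omitted because they are not documented here.

```python
# Hedged sketch mapping the hyperparameters above onto TrainingArguments.
# output_dir and evaluation_strategy are assumptions; the per-epoch eval
# cadence is inferred from the results table below.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230822155557",     # assumption: output directory name
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,                  # optimizer: Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,               # epsilon=1e-08
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",     # assumption, consistent with the table below
)
```

Note that the Adam betas and epsilon listed above match the Transformers defaults, so passing them is redundant; they are spelled out here only to make the mapping explicit.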

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.3548          | 0.4729   |
| 0.3737        | 2.0   | 624   | 0.3480          | 0.5199   |
| 0.3737        | 3.0   | 936   | 0.3486          | 0.5162   |
| 0.3718        | 4.0   | 1248  | 0.3495          | 0.5235   |
| 0.3714        | 5.0   | 1560  | 0.3505          | 0.4729   |
| 0.3714        | 6.0   | 1872  | 0.3487          | 0.5235   |
| 0.3686        | 7.0   | 2184  | 0.3496          | 0.4729   |
| 0.3686        | 8.0   | 2496  | 0.3505          | 0.4729   |
| 0.3684        | 9.0   | 2808  | 0.3502          | 0.5235   |
| 0.3679        | 10.0  | 3120  | 0.3491          | 0.5054   |
| 0.3679        | 11.0  | 3432  | 0.3515          | 0.4729   |
| 0.3659        | 12.0  | 3744  | 0.3496          | 0.5162   |
| 0.3649        | 13.0  | 4056  | 0.3517          | 0.4729   |
| 0.3649        | 14.0  | 4368  | 0.3543          | 0.4729   |
| 0.3651        | 15.0  | 4680  | 0.3513          | 0.4729   |
| 0.3651        | 16.0  | 4992  | 0.3489          | 0.5235   |
| 0.363         | 17.0  | 5304  | 0.3537          | 0.5235   |
| 0.3613        | 18.0  | 5616  | 0.3487          | 0.5307   |
| 0.3613        | 19.0  | 5928  | 0.3495          | 0.5126   |
| 0.3645        | 20.0  | 6240  | 0.3530          | 0.5199   |
| 0.359         | 21.0  | 6552  | 0.3497          | 0.5235   |
| 0.359         | 22.0  | 6864  | 0.3487          | 0.5235   |
| 0.3614        | 23.0  | 7176  | 0.3511          | 0.5235   |
| 0.3614        | 24.0  | 7488  | 0.3491          | 0.5271   |
| 0.3617        | 25.0  | 7800  | 0.3493          | 0.5199   |
| 0.3611        | 26.0  | 8112  | 0.3491          | 0.5271   |
| 0.3611        | 27.0  | 8424  | 0.3581          | 0.4729   |
| 0.3583        | 28.0  | 8736  | 0.3496          | 0.5343   |
| 0.3583        | 29.0  | 9048  | 0.3492          | 0.5162   |
| 0.3583        | 30.0  | 9360  | 0.3493          | 0.4404   |
| 0.3564        | 31.0  | 9672  | 0.3494          | 0.5343   |
| 0.3564        | 32.0  | 9984  | 0.3489          | 0.5199   |
| 0.3567        | 33.0  | 10296 | 0.3490          | 0.5343   |
| 0.3561        | 34.0  | 10608 | 0.3486          | 0.5271   |
| 0.3561        | 35.0  | 10920 | 0.3492          | 0.5307   |
| 0.3556        | 36.0  | 11232 | 0.3503          | 0.4765   |
| 0.3556        | 37.0  | 11544 | 0.3497          | 0.5307   |
| 0.3556        | 38.0  | 11856 | 0.3494          | 0.5379   |
| 0.3561        | 39.0  | 12168 | 0.3488          | 0.5235   |
| 0.3561        | 40.0  | 12480 | 0.3503          | 0.5271   |
| 0.3558        | 41.0  | 12792 | 0.3489          | 0.5343   |
| 0.3579        | 42.0  | 13104 | 0.3508          | 0.4729   |
| 0.3579        | 43.0  | 13416 | 0.3505          | 0.5271   |
| 0.3547        | 44.0  | 13728 | 0.3493          | 0.5379   |
| 0.3567        | 45.0  | 14040 | 0.3519          | 0.4729   |
| 0.3567        | 46.0  | 14352 | 0.3497          | 0.4729   |
| 0.3548        | 47.0  | 14664 | 0.3499          | 0.4729   |
| 0.3548        | 48.0  | 14976 | 0.3492          | 0.5343   |
| 0.3563        | 49.0  | 15288 | 0.3491          | 0.5307   |
| 0.3552        | 50.0  | 15600 | 0.3489          | 0.5235   |
| 0.3552        | 51.0  | 15912 | 0.3487          | 0.5162   |
| 0.3557        | 52.0  | 16224 | 0.3496          | 0.4513   |
| 0.3555        | 53.0  | 16536 | 0.3488          | 0.5307   |
| 0.3555        | 54.0  | 16848 | 0.3489          | 0.5271   |
| 0.3542        | 55.0  | 17160 | 0.3488          | 0.5162   |
| 0.3542        | 56.0  | 17472 | 0.3488          | 0.5343   |
| 0.3545        | 57.0  | 17784 | 0.3494          | 0.5379   |
| 0.3543        | 58.0  | 18096 | 0.3489          | 0.5126   |
| 0.3543        | 59.0  | 18408 | 0.3489          | 0.5162   |
| 0.3553        | 60.0  | 18720 | 0.3488          | 0.5307   |

Framework versions

  • Transformers 4.26.1
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
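
A quick runtime check against the pins above (illustrative only):

```python
# Sanity-check that the runtime matches the pinned framework versions above.
import datasets
import tokenizers
import torch
import transformers

print(transformers.__version__)  # expected: 4.26.1
print(torch.__version__)         # expected: 2.0.1+cu118
print(datasets.__version__)      # expected: 2.12.0
print(tokenizers.__version__)    # expected: 0.13.3
```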