
20230822173808

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3493
  • Accuracy: 0.6968
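
For a quick check of the checkpoint, the sketch below loads it with the transformers Auto classes. This is a minimal sketch, assuming the model id dkqjrm/20230822173808 (taken from this card) and single-sequence classification input; the specific super_glue task, input format, and label names are not documented above, so those parts are assumptions.

```python
# Minimal inference sketch. Assumes a sequence-classification head on
# bert-large-cased; the single-sentence input is an assumption, since the
# exact super_glue task is not stated in this card.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230822173808"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer("Example sentence to classify.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
pred = logits.argmax(dim=-1).item()
print(pred, model.config.id2label.get(pred, pred))
```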

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.004
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
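
As a rough guide to reproducing this setup, the hyperparameters above map onto transformers TrainingArguments as in the sketch below. The original training script is not part of this card, so this is a hypothetical reconstruction; the output directory and the per-epoch evaluation strategy are assumptions (the latter is consistent with the per-epoch table in the next section).

```python
# Hypothetical reconstruction of the training configuration from the
# hyperparameters listed above; not the author's actual script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="out",               # placeholder, not from the card
    learning_rate=4e-3,             # learning_rate: 0.004
    per_device_train_batch_size=8,  # train_batch_size: 8
    per_device_eval_batch_size=8,   # eval_batch_size: 8
    seed=11,                        # seed: 11
    adam_beta1=0.9,                 # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,              # epsilon: 1e-08
    lr_scheduler_type="linear",     # lr_scheduler_type: linear
    num_train_epochs=60.0,          # num_epochs: 60.0
    evaluation_strategy="epoch",    # assumption: matches the per-epoch results table
)
```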

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.3774          | 0.5162   |
| 0.5343        | 2.0   | 624   | 0.3506          | 0.5018   |
| 0.5343        | 3.0   | 936   | 0.4575          | 0.4729   |
| 0.4659        | 4.0   | 1248  | 0.3759          | 0.5307   |
| 0.4691        | 5.0   | 1560  | 0.3500          | 0.5812   |
| 0.4691        | 6.0   | 1872  | 0.3457          | 0.5993   |
| 0.4442        | 7.0   | 2184  | 0.3500          | 0.6101   |
| 0.4442        | 8.0   | 2496  | 0.3403          | 0.6173   |
| 0.4366        | 9.0   | 2808  | 0.3840          | 0.5776   |
| 0.4097        | 10.0  | 3120  | 0.4391          | 0.5487   |
| 0.4097        | 11.0  | 3432  | 0.3584          | 0.6029   |
| 0.3922        | 12.0  | 3744  | 0.3356          | 0.6498   |
| 0.3564        | 13.0  | 4056  | 0.3275          | 0.6931   |
| 0.3564        | 14.0  | 4368  | 0.3283          | 0.7076   |
| 0.3343        | 15.0  | 4680  | 0.3377          | 0.6462   |
| 0.3343        | 16.0  | 4992  | 0.3550          | 0.6390   |
| 0.335         | 17.0  | 5304  | 0.3370          | 0.6895   |
| 0.3233        | 18.0  | 5616  | 0.3256          | 0.6787   |
| 0.3233        | 19.0  | 5928  | 0.3174          | 0.7112   |
| 0.3232        | 20.0  | 6240  | 0.3440          | 0.6643   |
| 0.3102        | 21.0  | 6552  | 0.3375          | 0.6895   |
| 0.3102        | 22.0  | 6864  | 0.3433          | 0.6787   |
| 0.3064        | 23.0  | 7176  | 0.3690          | 0.6715   |
| 0.3064        | 24.0  | 7488  | 0.3394          | 0.6931   |
| 0.3004        | 25.0  | 7800  | 0.3377          | 0.7256   |
| 0.2962        | 26.0  | 8112  | 0.3435          | 0.6751   |
| 0.2962        | 27.0  | 8424  | 0.3182          | 0.7329   |
| 0.2937        | 28.0  | 8736  | 0.3306          | 0.7112   |
| 0.2905        | 29.0  | 9048  | 0.3362          | 0.7148   |
| 0.2905        | 30.0  | 9360  | 0.3675          | 0.6751   |
| 0.2865        | 31.0  | 9672  | 0.3406          | 0.7076   |
| 0.2865        | 32.0  | 9984  | 0.3343          | 0.7040   |
| 0.2812        | 33.0  | 10296 | 0.3472          | 0.6859   |
| 0.2727        | 34.0  | 10608 | 0.3372          | 0.7292   |
| 0.2727        | 35.0  | 10920 | 0.3575          | 0.7076   |
| 0.2735        | 36.0  | 11232 | 0.3300          | 0.7076   |
| 0.2701        | 37.0  | 11544 | 0.3585          | 0.6968   |
| 0.2701        | 38.0  | 11856 | 0.3422          | 0.7148   |
| 0.2688        | 39.0  | 12168 | 0.3579          | 0.6931   |
| 0.2688        | 40.0  | 12480 | 0.3326          | 0.7148   |
| 0.2644        | 41.0  | 12792 | 0.3464          | 0.7256   |
| 0.2637        | 42.0  | 13104 | 0.3579          | 0.6931   |
| 0.2637        | 43.0  | 13416 | 0.3489          | 0.7040   |
| 0.26          | 44.0  | 13728 | 0.3439          | 0.7076   |
| 0.2582        | 45.0  | 14040 | 0.3585          | 0.7004   |
| 0.2582        | 46.0  | 14352 | 0.3535          | 0.7076   |
| 0.2533        | 47.0  | 14664 | 0.3440          | 0.7148   |
| 0.2533        | 48.0  | 14976 | 0.3506          | 0.7040   |
| 0.2535        | 49.0  | 15288 | 0.3519          | 0.7040   |
| 0.2498        | 50.0  | 15600 | 0.3457          | 0.6931   |
| 0.2498        | 51.0  | 15912 | 0.3494          | 0.7112   |
| 0.2504        | 52.0  | 16224 | 0.3431          | 0.7040   |
| 0.2499        | 53.0  | 16536 | 0.3450          | 0.7040   |
| 0.2499        | 54.0  | 16848 | 0.3485          | 0.6895   |
| 0.2488        | 55.0  | 17160 | 0.3437          | 0.7004   |
| 0.2488        | 56.0  | 17472 | 0.3465          | 0.7004   |
| 0.2479        | 57.0  | 17784 | 0.3479          | 0.6895   |
| 0.247         | 58.0  | 18096 | 0.3447          | 0.7004   |
| 0.247         | 59.0  | 18408 | 0.3521          | 0.7004   |
| 0.2468        | 60.0  | 18720 | 0.3493          | 0.6968   |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
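
To reproduce this environment, the versions above can be pinned (e.g. pip install transformers==4.26.1 torch==2.0.1 datasets==2.12.0 tokenizers==0.13.3) and verified with a small check like the sketch below; the pins are taken directly from the list above.

```python
# Sanity-check installed versions against the ones listed in this card.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": (transformers, "4.26.1"),
    "torch": (torch, "2.0.1"),        # card lists 2.0.1+cu118 (CUDA 11.8 build)
    "datasets": (datasets, "2.12.0"),
    "tokenizers": (tokenizers, "0.13.3"),
}
for name, (mod, version) in expected.items():
    status = "ok" if mod.__version__.startswith(version) else f"expected {version}"
    print(f"{name} {mod.__version__} ({status})")
```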