20230822135401

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3478
  • Accuracy: 0.6065
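
A minimal usage sketch (hedged: it assumes the checkpoint exposes a standard sequence-classification head and takes a sentence pair, since the card does not name the SuperGLUE subtask):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Repo id taken from this card; everything else here is illustrative.
tokenizer = AutoTokenizer.from_pretrained("dkqjrm/20230822135401")
model = AutoModelForSequenceClassification.from_pretrained("dkqjrm/20230822135401")
model.eval()

# Hypothetical sentence pair; the input format expected by the
# fine-tuned subtask is an assumption.
inputs = tokenizer(
    "The cat sat on the mat.",
    "A cat is on a mat.",
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label id
```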

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TrainingArguments appears after the list):

  • learning_rate: 0.0005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
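
A minimal sketch of these settings expressed as transformers.TrainingArguments. The per-epoch evaluation and logging strategies are assumptions inferred from the results table below, and the output directory is illustrative; the Trainer's default optimizer and scheduler already match the Adam and linear-schedule values listed above:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230822135401",   # illustrative; the actual path is unknown
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",   # assumption: matches the per-epoch table
    logging_strategy="epoch",      # assumption
)
```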

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.3502          | 0.5451   |
| 0.3914        | 2.0   | 624   | 0.3937          | 0.4729   |
| 0.3914        | 3.0   | 936   | 0.3710          | 0.4729   |
| 0.3806        | 4.0   | 1248  | 0.3529          | 0.4693   |
| 0.3775        | 5.0   | 1560  | 0.3489          | 0.5487   |
| 0.3775        | 6.0   | 1872  | 0.3466          | 0.5451   |
| 0.3668        | 7.0   | 2184  | 0.4554          | 0.5379   |
| 0.3668        | 8.0   | 2496  | 0.3811          | 0.5451   |
| 0.3698        | 9.0   | 2808  | 0.3497          | 0.5271   |
| 0.3659        | 10.0  | 3120  | 0.3462          | 0.5199   |
| 0.3659        | 11.0  | 3432  | 0.4239          | 0.4729   |
| 0.3675        | 12.0  | 3744  | 0.3535          | 0.5126   |
| 0.3617        | 13.0  | 4056  | 0.3470          | 0.5090   |
| 0.3617        | 14.0  | 4368  | 0.3630          | 0.5054   |
| 0.3624        | 15.0  | 4680  | 0.3506          | 0.5235   |
| 0.3624        | 16.0  | 4992  | 0.3747          | 0.5487   |
| 0.359         | 17.0  | 5304  | 0.3704          | 0.5487   |
| 0.3576        | 18.0  | 5616  | 0.3538          | 0.5343   |
| 0.3576        | 19.0  | 5928  | 0.3597          | 0.5415   |
| 0.3612        | 20.0  | 6240  | 0.3637          | 0.5596   |
| 0.359         | 21.0  | 6552  | 0.3487          | 0.5704   |
| 0.359         | 22.0  | 6864  | 0.3591          | 0.5415   |
| 0.3566        | 23.0  | 7176  | 0.3946          | 0.5523   |
| 0.3566        | 24.0  | 7488  | 0.3627          | 0.5018   |
| 0.3551        | 25.0  | 7800  | 0.3540          | 0.5523   |
| 0.353         | 26.0  | 8112  | 0.3461          | 0.5343   |
| 0.353         | 27.0  | 8424  | 0.3469          | 0.5596   |
| 0.3517        | 28.0  | 8736  | 0.3471          | 0.5993   |
| 0.3549        | 29.0  | 9048  | 0.3504          | 0.5632   |
| 0.3549        | 30.0  | 9360  | 0.3559          | 0.5812   |
| 0.3523        | 31.0  | 9672  | 0.3769          | 0.5560   |
| 0.3523        | 32.0  | 9984  | 0.3473          | 0.5704   |
| 0.3514        | 33.0  | 10296 | 0.3632          | 0.5704   |
| 0.3513        | 34.0  | 10608 | 0.3503          | 0.5848   |
| 0.3513        | 35.0  | 10920 | 0.3464          | 0.5560   |
| 0.3512        | 36.0  | 11232 | 0.3493          | 0.5740   |
| 0.3494        | 37.0  | 11544 | 0.3479          | 0.6101   |
| 0.3494        | 38.0  | 11856 | 0.3464          | 0.6029   |
| 0.3478        | 39.0  | 12168 | 0.3495          | 0.6101   |
| 0.3478        | 40.0  | 12480 | 0.3462          | 0.6065   |
| 0.3479        | 41.0  | 12792 | 0.3519          | 0.6065   |
| 0.3472        | 42.0  | 13104 | 0.3420          | 0.5704   |
| 0.3472        | 43.0  | 13416 | 0.3555          | 0.5740   |
| 0.3456        | 44.0  | 13728 | 0.3471          | 0.5957   |
| 0.3448        | 45.0  | 14040 | 0.3434          | 0.5776   |
| 0.3448        | 46.0  | 14352 | 0.3401          | 0.6209   |
| 0.3439        | 47.0  | 14664 | 0.3439          | 0.5776   |
| 0.3439        | 48.0  | 14976 | 0.3523          | 0.5921   |
| 0.3442        | 49.0  | 15288 | 0.3466          | 0.6137   |
| 0.3437        | 50.0  | 15600 | 0.3549          | 0.5776   |
| 0.3437        | 51.0  | 15912 | 0.3417          | 0.6173   |
| 0.3413        | 52.0  | 16224 | 0.3409          | 0.6209   |
| 0.3416        | 53.0  | 16536 | 0.3607          | 0.5884   |
| 0.3416        | 54.0  | 16848 | 0.3574          | 0.5848   |
| 0.3401        | 55.0  | 17160 | 0.3494          | 0.5812   |
| 0.3401        | 56.0  | 17472 | 0.3480          | 0.6137   |
| 0.3395        | 57.0  | 17784 | 0.3434          | 0.6029   |
| 0.3399        | 58.0  | 18096 | 0.3454          | 0.5993   |
| 0.3399        | 59.0  | 18408 | 0.3477          | 0.5957   |
| 0.3398        | 60.0  | 18720 | 0.3478          | 0.6065   |
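
The accuracy column is consistent with argmax classification accuracy. A minimal compute_metrics sketch of how such a column is typically produced with transformers.Trainer (an assumption, since the card does not include the actual training script):

```python
import numpy as np

def compute_metrics(eval_pred):
    # eval_pred is the (logits, labels) pair that transformers.Trainer
    # hands to compute_metrics after each evaluation pass.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"accuracy": float((predictions == labels).mean())}
```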

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3