
dkqjrm/20230822145721

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (an inference sketch follows the list):

  • Loss: 0.3478
  • Accuracy: 0.5271
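The card does not state which SuperGLUE task or input format was used; the two observed accuracy values (0.5271 and 0.4729, which sum to 1) are consistent with a two-class task. Below is a minimal inference sketch under those assumptions, using a sequence-classification head and a hypothetical sentence-pair input:

```python
# Minimal inference sketch. Assumptions: the checkpoint carries a two-class
# sequence-classification head and the task takes a sentence pair; the card
# does not state the actual SuperGLUE task or input format.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "dkqjrm/20230822145721"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)

inputs = tokenizer("First sentence.", "Second sentence.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class id
```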

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a matching TrainingArguments sketch follows the list):

  • learning_rate: 0.0005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
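A TrainingArguments sketch mirroring the values above; `output_dir` is a placeholder, and the Adam betas/epsilon and linear scheduler match the settings listed:

```python
# Sketch of TrainingArguments matching the hyperparameters listed above.
# output_dir is a placeholder; everything else mirrors the list.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="out",                 # placeholder
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
)
```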

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.3504          | 0.5235   |
| 0.3893        | 2.0   | 624   | 0.3582          | 0.4729   |
| 0.3893        | 3.0   | 936   | 0.3531          | 0.5271   |
| 0.3878        | 4.0   | 1248  | 0.3627          | 0.4729   |
| 0.3764        | 5.0   | 1560  | 0.3488          | 0.5271   |
| 0.3764        | 6.0   | 1872  | 0.3529          | 0.5271   |
| 0.3735        | 7.0   | 2184  | 0.3598          | 0.5271   |
| 0.3735        | 8.0   | 2496  | 0.3609          | 0.5271   |
| 0.3703        | 9.0   | 2808  | 0.3605          | 0.4729   |
| 0.3684        | 10.0  | 3120  | 0.3562          | 0.5271   |
| 0.3684        | 11.0  | 3432  | 0.4032          | 0.4729   |
| 0.3687        | 12.0  | 3744  | 0.3752          | 0.4729   |
| 0.3667        | 13.0  | 4056  | 0.3566          | 0.4729   |
| 0.3667        | 14.0  | 4368  | 0.3499          | 0.5271   |
| 0.3689        | 15.0  | 4680  | 0.3503          | 0.5271   |
| 0.3689        | 16.0  | 4992  | 0.3539          | 0.5271   |
| 0.3663        | 17.0  | 5304  | 0.3485          | 0.5271   |
| 0.3677        | 18.0  | 5616  | 0.3617          | 0.5271   |
| 0.3677        | 19.0  | 5928  | 0.3666          | 0.4729   |
| 0.3716        | 20.0  | 6240  | 0.3562          | 0.5271   |
| 0.3671        | 21.0  | 6552  | 0.3573          | 0.5271   |
| 0.3671        | 22.0  | 6864  | 0.3900          | 0.5271   |
| 0.3642        | 23.0  | 7176  | 0.3554          | 0.5271   |
| 0.3642        | 24.0  | 7488  | 0.3594          | 0.4729   |
| 0.3649        | 25.0  | 7800  | 0.3498          | 0.5271   |
| 0.3639        | 26.0  | 8112  | 0.3646          | 0.4729   |
| 0.3639        | 27.0  | 8424  | 0.3498          | 0.5271   |
| 0.3615        | 28.0  | 8736  | 0.3504          | 0.5271   |
| 0.3606        | 29.0  | 9048  | 0.3485          | 0.5271   |
| 0.3606        | 30.0  | 9360  | 0.3479          | 0.5271   |
| 0.3623        | 31.0  | 9672  | 0.3498          | 0.5271   |
| 0.3623        | 32.0  | 9984  | 0.3478          | 0.5271   |
| 0.3623        | 33.0  | 10296 | 0.3545          | 0.5271   |
| 0.3603        | 34.0  | 10608 | 0.3483          | 0.5271   |
| 0.3603        | 35.0  | 10920 | 0.3481          | 0.5271   |
| 0.3604        | 36.0  | 11232 | 0.3495          | 0.5271   |
| 0.3586        | 37.0  | 11544 | 0.3507          | 0.5271   |
| 0.3586        | 38.0  | 11856 | 0.3486          | 0.5271   |
| 0.3593        | 39.0  | 12168 | 0.3492          | 0.5271   |
| 0.3593        | 40.0  | 12480 | 0.3492          | 0.5271   |
| 0.359         | 41.0  | 12792 | 0.3485          | 0.5271   |
| 0.3584        | 42.0  | 13104 | 0.3579          | 0.4729   |
| 0.3584        | 43.0  | 13416 | 0.3480          | 0.5271   |
| 0.3606        | 44.0  | 13728 | 0.3479          | 0.5271   |
| 0.3568        | 45.0  | 14040 | 0.3530          | 0.5271   |
| 0.3568        | 46.0  | 14352 | 0.3499          | 0.5271   |
| 0.3589        | 47.0  | 14664 | 0.3547          | 0.4729   |
| 0.3589        | 48.0  | 14976 | 0.3499          | 0.5271   |
| 0.3589        | 49.0  | 15288 | 0.3478          | 0.5271   |
| 0.3573        | 50.0  | 15600 | 0.3481          | 0.5271   |
| 0.3573        | 51.0  | 15912 | 0.3487          | 0.5271   |
| 0.3569        | 52.0  | 16224 | 0.3481          | 0.5271   |
| 0.3572        | 53.0  | 16536 | 0.3480          | 0.5271   |
| 0.3572        | 54.0  | 16848 | 0.3481          | 0.5271   |
| 0.3558        | 55.0  | 17160 | 0.3478          | 0.5271   |
| 0.3558        | 56.0  | 17472 | 0.3479          | 0.5271   |
| 0.3557        | 57.0  | 17784 | 0.3484          | 0.5271   |
| 0.3558        | 58.0  | 18096 | 0.3478          | 0.5271   |
| 0.3558        | 59.0  | 18408 | 0.3478          | 0.5271   |
| 0.3548        | 60.0  | 18720 | 0.3478          | 0.5271   |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
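
To check that a local environment matches these versions (assuming the packages are installed):

```python
# Print installed versions to compare against the list above.
import datasets
import tokenizers
import torch
import transformers

print("Transformers:", transformers.__version__)  # expected 4.26.1
print("PyTorch:", torch.__version__)              # expected 2.0.1+cu118
print("Datasets:", datasets.__version__)          # expected 2.12.0
print("Tokenizers:", tokenizers.__version__)      # expected 0.13.3
```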
