20230822124255

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 0.3479
  • Accuracy: 0.5271
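
The card does not say which SuperGLUE task was used, but the validation accuracy alternating between 0.5271 and 0.4729 (which sum to 1) is consistent with a binary classification task where the model settles on one class or the other. Below is a minimal loading sketch; the two-label sequence-classification head and the sentence-pair input are assumptions, not confirmed by the card:

```python
# Minimal usage sketch. Assumptions: the checkpoint exposes a two-label
# sequence-classification head and expects a sentence pair (e.g. an
# entailment-style SuperGLUE task); neither is confirmed by the card.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230822124255"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer(
    "The cat sat on the mat.",      # first sequence
    "There is a cat on the mat.",   # second sequence
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, num_labels)
print(logits.argmax(dim=-1).item())
```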

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a Trainer configuration sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
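
For reference, the listed values map onto Hugging Face TrainingArguments as sketched below. The output_dir and the per-epoch evaluation strategy are assumptions (the one-row-per-epoch results table suggests the latter). Note also that a learning rate of 1e-3 is well above the 2e-5 to 5e-5 range typical for BERT fine-tuning, which may explain the near-chance accuracy.

```python
# Sketch only: reproduces the listed hyperparameters with the Trainer API
# (Transformers 4.26.x). Dataset and model wiring are omitted because the
# card does not name the SuperGLUE subtask.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230822124255",  # placeholder, not from the card
    learning_rate=1e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",  # assumption: eval ran once per epoch
)
```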

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.4745          | 0.5271   |
| 0.4082        | 2.0   | 624   | 0.3528          | 0.5307   |
| 0.4082        | 3.0   | 936   | 0.4075          | 0.4729   |
| 0.3905        | 4.0   | 1248  | 0.3634          | 0.4729   |
| 0.3831        | 5.0   | 1560  | 0.3585          | 0.5271   |
| 0.3831        | 6.0   | 1872  | 0.3679          | 0.5271   |
| 0.3797        | 7.0   | 2184  | 0.3550          | 0.5271   |
| 0.3797        | 8.0   | 2496  | 0.4011          | 0.5271   |
| 0.3796        | 9.0   | 2808  | 0.3515          | 0.5271   |
| 0.3836        | 10.0  | 3120  | 0.3478          | 0.5271   |
| 0.3836        | 11.0  | 3432  | 0.3494          | 0.5271   |
| 0.3815        | 12.0  | 3744  | 0.3707          | 0.4729   |
| 0.3769        | 13.0  | 4056  | 0.3625          | 0.4729   |
| 0.3769        | 14.0  | 4368  | 0.3498          | 0.5271   |
| 0.3761        | 15.0  | 4680  | 0.3550          | 0.4729   |
| 0.3761        | 16.0  | 4992  | 0.4420          | 0.5271   |
| 0.3776        | 17.0  | 5304  | 0.3529          | 0.5271   |
| 0.3704        | 18.0  | 5616  | 0.3486          | 0.5271   |
| 0.3704        | 19.0  | 5928  | 0.3670          | 0.4729   |
| 0.3765        | 20.0  | 6240  | 0.3586          | 0.5271   |
| 0.3721        | 21.0  | 6552  | 0.3490          | 0.5271   |
| 0.3721        | 22.0  | 6864  | 0.3729          | 0.5271   |
| 0.3689        | 23.0  | 7176  | 0.3798          | 0.5271   |
| 0.3689        | 24.0  | 7488  | 0.3861          | 0.4729   |
| 0.3698        | 25.0  | 7800  | 0.3498          | 0.5271   |
| 0.369         | 26.0  | 8112  | 0.3698          | 0.4729   |
| 0.369         | 27.0  | 8424  | 0.3507          | 0.5271   |
| 0.3658        | 28.0  | 8736  | 0.3494          | 0.5271   |
| 0.3662        | 29.0  | 9048  | 0.3479          | 0.5271   |
| 0.3662        | 30.0  | 9360  | 0.3504          | 0.5271   |
| 0.3666        | 31.0  | 9672  | 0.3577          | 0.5271   |
| 0.3666        | 32.0  | 9984  | 0.3509          | 0.5271   |
| 0.3637        | 33.0  | 10296 | 0.3483          | 0.5271   |
| 0.3647        | 34.0  | 10608 | 0.3493          | 0.5271   |
| 0.3647        | 35.0  | 10920 | 0.3482          | 0.5271   |
| 0.364         | 36.0  | 11232 | 0.3490          | 0.5271   |
| 0.3635        | 37.0  | 11544 | 0.3478          | 0.5271   |
| 0.3635        | 38.0  | 11856 | 0.3479          | 0.5271   |
| 0.3634        | 39.0  | 12168 | 0.3501          | 0.5271   |
| 0.3634        | 40.0  | 12480 | 0.3478          | 0.5271   |
| 0.3643        | 41.0  | 12792 | 0.3479          | 0.5271   |
| 0.3645        | 42.0  | 13104 | 0.3655          | 0.4729   |
| 0.3645        | 43.0  | 13416 | 0.3512          | 0.5271   |
| 0.363         | 44.0  | 13728 | 0.3491          | 0.5271   |
| 0.3602        | 45.0  | 14040 | 0.3569          | 0.4729   |
| 0.3602        | 46.0  | 14352 | 0.3571          | 0.4729   |
| 0.3616        | 47.0  | 14664 | 0.3522          | 0.5307   |
| 0.3616        | 48.0  | 14976 | 0.3485          | 0.5271   |
| 0.3601        | 49.0  | 15288 | 0.3485          | 0.5271   |
| 0.3606        | 50.0  | 15600 | 0.3481          | 0.5271   |
| 0.3606        | 51.0  | 15912 | 0.3484          | 0.5271   |
| 0.3592        | 52.0  | 16224 | 0.3478          | 0.5271   |
| 0.3587        | 53.0  | 16536 | 0.3485          | 0.5271   |
| 0.3587        | 54.0  | 16848 | 0.3483          | 0.5271   |
| 0.3583        | 55.0  | 17160 | 0.3480          | 0.5271   |
| 0.3583        | 56.0  | 17472 | 0.3478          | 0.5271   |
| 0.358         | 57.0  | 17784 | 0.3485          | 0.5271   |
| 0.3574        | 58.0  | 18096 | 0.3478          | 0.5271   |
| 0.3574        | 59.0  | 18408 | 0.3479          | 0.5271   |
| 0.3567        | 60.0  | 18720 | 0.3479          | 0.5271   |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3