20230822185044

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a loading sketch follows the results):

  • Loss: 0.3482
  • Accuracy: 0.4729
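Since the card does not document which SuperGLUE task or preprocessing was used, the snippet below is only a minimal sketch of loading the checkpoint for sentence-pair classification; the repository id dkqjrm/20230822185044 and the two-segment input are assumptions.

```python
# Minimal sketch: load the checkpoint for sentence-pair classification.
# Assumptions: the repo id "dkqjrm/20230822185044" and that the checkpoint
# carries a sequence-classification head (the SuperGLUE task is not
# documented in this card).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230822185044"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Encode a hypothetical premise/hypothesis pair and take the argmax label.
inputs = tokenizer("A premise sentence.", "A hypothesis sentence.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```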

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 0.003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
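For reference, these settings translate into Transformers TrainingArguments roughly as sketched below; the output directory is a placeholder, and anything not listed above is an assumption rather than a documented setting.

```python
# Sketch of the reported hyperparameters expressed as TrainingArguments.
# "output/20230822185044" is a hypothetical output directory.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="output/20230822185044",  # placeholder, not from the card
    learning_rate=3e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
)
```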

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.3580          | 0.5379   |
| 0.5102        | 2.0   | 624   | 0.3670          | 0.5415   |
| 0.5102        | 3.0   | 936   | 0.4888          | 0.4765   |
| 0.4569        | 4.0   | 1248  | 0.3742          | 0.4982   |
| 0.4403        | 5.0   | 1560  | 0.3796          | 0.5379   |
| 0.4403        | 6.0   | 1872  | 0.3602          | 0.5776   |
| 0.4215        | 7.0   | 2184  | 0.4013          | 0.5415   |
| 0.4215        | 8.0   | 2496  | 0.3596          | 0.5884   |
| 0.4166        | 9.0   | 2808  | 0.3447          | 0.5487   |
| 0.3885        | 10.0  | 3120  | 0.3395          | 0.6101   |
| 0.3885        | 11.0  | 3432  | 0.3395          | 0.6354   |
| 0.3776        | 12.0  | 3744  | 0.3568          | 0.5343   |
| 0.4274        | 13.0  | 4056  | 0.5923          | 0.4729   |
| 0.4274        | 14.0  | 4368  | 0.3503          | 0.5668   |
| 0.4138        | 15.0  | 4680  | 0.3605          | 0.5523   |
| 0.4138        | 16.0  | 4992  | 0.3491          | 0.5451   |
| 0.4025        | 17.0  | 5304  | 0.3728          | 0.5379   |
| 0.394         | 18.0  | 5616  | 0.4029          | 0.4729   |
| 0.394         | 19.0  | 5928  | 0.3682          | 0.4729   |
| 0.3892        | 20.0  | 6240  | 0.3484          | 0.5054   |
| 0.3839        | 21.0  | 6552  | 0.3485          | 0.4765   |
| 0.3839        | 22.0  | 6864  | 0.3467          | 0.5343   |
| 0.3782        | 23.0  | 7176  | 0.3471          | 0.5307   |
| 0.3782        | 24.0  | 7488  | 0.3565          | 0.4693   |
| 0.3757        | 25.0  | 7800  | 0.3483          | 0.5343   |
| 0.3737        | 26.0  | 8112  | 0.3495          | 0.5271   |
| 0.3737        | 27.0  | 8424  | 0.3550          | 0.4729   |
| 0.3724        | 28.0  | 8736  | 0.3544          | 0.4729   |
| 0.3696        | 29.0  | 9048  | 0.3478          | 0.5307   |
| 0.3696        | 30.0  | 9360  | 0.3519          | 0.5271   |
| 0.3693        | 31.0  | 9672  | 0.3515          | 0.5271   |
| 0.3693        | 32.0  | 9984  | 0.3487          | 0.4729   |
| 0.3674        | 33.0  | 10296 | 0.3492          | 0.5379   |
| 0.3628        | 34.0  | 10608 | 0.3555          | 0.4729   |
| 0.3628        | 35.0  | 10920 | 0.3550          | 0.4729   |
| 0.3635        | 36.0  | 11232 | 0.3686          | 0.4729   |
| 0.3636        | 37.0  | 11544 | 0.3488          | 0.4801   |
| 0.3636        | 38.0  | 11856 | 0.3484          | 0.4874   |
| 0.3595        | 39.0  | 12168 | 0.3477          | 0.4910   |
| 0.3595        | 40.0  | 12480 | 0.3486          | 0.5307   |
| 0.3598        | 41.0  | 12792 | 0.3488          | 0.4801   |
| 0.3594        | 42.0  | 13104 | 0.3614          | 0.4729   |
| 0.3594        | 43.0  | 13416 | 0.3476          | 0.5199   |
| 0.3586        | 44.0  | 13728 | 0.3482          | 0.4729   |
| 0.3581        | 45.0  | 14040 | 0.3519          | 0.4729   |
| 0.3581        | 46.0  | 14352 | 0.3494          | 0.4729   |
| 0.3579        | 47.0  | 14664 | 0.3613          | 0.4729   |
| 0.3579        | 48.0  | 14976 | 0.3480          | 0.4729   |
| 0.3573        | 49.0  | 15288 | 0.3480          | 0.4729   |
| 0.3564        | 50.0  | 15600 | 0.3487          | 0.4729   |
| 0.3564        | 51.0  | 15912 | 0.3529          | 0.4729   |
| 0.3561        | 52.0  | 16224 | 0.3515          | 0.4729   |
| 0.3554        | 53.0  | 16536 | 0.3475          | 0.4946   |
| 0.3554        | 54.0  | 16848 | 0.3489          | 0.5271   |
| 0.3535        | 55.0  | 17160 | 0.3488          | 0.4729   |
| 0.3535        | 56.0  | 17472 | 0.3478          | 0.5018   |
| 0.3542        | 57.0  | 17784 | 0.3491          | 0.4729   |
| 0.354         | 58.0  | 18096 | 0.3485          | 0.4729   |
| 0.354         | 59.0  | 18408 | 0.3483          | 0.4729   |
| 0.3529        | 60.0  | 18720 | 0.3482          | 0.4729   |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
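
For reproduction, a quick check of the local environment against the versions above might look like the sketch below; exact version pinning is optional.

```python
# Sketch: print installed versions to compare against those reported above.
import datasets
import tokenizers
import torch
import transformers

print("Transformers:", transformers.__version__)  # card reports 4.26.1
print("PyTorch:", torch.__version__)              # card reports 2.0.1+cu118
print("Datasets:", datasets.__version__)          # card reports 2.12.0
print("Tokenizers:", tokenizers.__version__)      # card reports 0.13.3
```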