
dkqjrm/20230822105327

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3487
  • Accuracy: 0.4729
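For reference, below is a minimal inference sketch using the Transformers API. The repository id comes from this card; the specific SuperGLUE subtask and its label mapping are not documented here, so the example inputs are placeholders.

```python
# Minimal inference sketch. Assumptions: the checkpoint is hosted as
# "dkqjrm/20230822105327" with a sequence-classification head; the input
# format depends on the (undocumented) SuperGLUE subtask, so the sentence
# pair below is only a placeholder.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230822105327"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer(
    "The cat sat on the mat.",      # placeholder first sequence
    "There is a cat on the mat.",   # placeholder second sequence
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```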

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.01
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
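As a sketch, these values map onto the Trainer API of Transformers 4.26 as follows; the output directory and evaluation strategy are assumptions (the per-epoch rows in the results table suggest epoch-level evaluation), and dataset loading and preprocessing for the SuperGLUE subtask are omitted.

```python
# Sketch: the hyperparameters above expressed as TrainingArguments
# (Transformers 4.26 API). The Adam betas/epsilon listed above match the
# library defaults, so they need no explicit setting here.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230822105327",   # assumed, not stated in the card
    learning_rate=0.01,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",   # assumed from the per-epoch results below
)
```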

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.6780          | 0.5307   |
| 0.8761        | 2.0   | 624   | 0.3516          | 0.4982   |
| 0.8761        | 3.0   | 936   | 0.4775          | 0.4874   |
| 0.685         | 4.0   | 1248  | 0.3842          | 0.5162   |
| 0.4946        | 5.0   | 1560  | 0.7400          | 0.5271   |
| 0.4946        | 6.0   | 1872  | 0.3490          | 0.5307   |
| 0.5112        | 7.0   | 2184  | 0.4549          | 0.5271   |
| 0.5112        | 8.0   | 2496  | 0.4590          | 0.4729   |
| 0.4328        | 9.0   | 2808  | 0.4122          | 0.4729   |
| 0.5336        | 10.0  | 3120  | 0.3692          | 0.4729   |
| 0.5336        | 11.0  | 3432  | 0.3493          | 0.5271   |
| 0.4659        | 12.0  | 3744  | 0.4285          | 0.4729   |
| 0.4383        | 13.0  | 4056  | 0.3805          | 0.4729   |
| 0.4383        | 14.0  | 4368  | 0.3634          | 0.5271   |
| 0.4394        | 15.0  | 4680  | 0.3485          | 0.5271   |
| 0.4394        | 16.0  | 4992  | 0.4393          | 0.4729   |
| 0.4432        | 17.0  | 5304  | 0.3694          | 0.5271   |
| 0.4138        | 18.0  | 5616  | 0.3503          | 0.4874   |
| 0.4138        | 19.0  | 5928  | 0.3916          | 0.4729   |
| 0.4213        | 20.0  | 6240  | 0.3495          | 0.4693   |
| 0.4042        | 21.0  | 6552  | 0.3493          | 0.5090   |
| 0.4042        | 22.0  | 6864  | 0.3556          | 0.5307   |
| 0.4177        | 23.0  | 7176  | 0.3697          | 0.4729   |
| 0.4177        | 24.0  | 7488  | 0.3484          | 0.4765   |
| 0.3925        | 25.0  | 7800  | 0.3665          | 0.5271   |
| 0.4006        | 26.0  | 8112  | 0.3669          | 0.5271   |
| 0.4006        | 27.0  | 8424  | 0.3556          | 0.4729   |
| 0.397         | 28.0  | 8736  | 0.3529          | 0.4729   |
| 0.3926        | 29.0  | 9048  | 0.3477          | 0.4729   |
| 0.3926        | 30.0  | 9360  | 0.5391          | 0.5271   |
| 0.39          | 31.0  | 9672  | 0.3504          | 0.4729   |
| 0.39          | 32.0  | 9984  | 0.3494          | 0.5271   |
| 0.3902        | 33.0  | 10296 | 0.3549          | 0.5271   |
| 0.3824        | 34.0  | 10608 | 0.3707          | 0.4729   |
| 0.3824        | 35.0  | 10920 | 0.3559          | 0.4729   |
| 0.3805        | 36.0  | 11232 | 0.3578          | 0.4729   |
| 0.38          | 37.0  | 11544 | 0.3612          | 0.5271   |
| 0.38          | 38.0  | 11856 | 0.3517          | 0.4729   |
| 0.3784        | 39.0  | 12168 | 0.3487          | 0.4910   |
| 0.3784        | 40.0  | 12480 | 0.3606          | 0.4729   |
| 0.3751        | 41.0  | 12792 | 0.3520          | 0.5271   |
| 0.3718        | 42.0  | 13104 | 0.3477          | 0.5199   |
| 0.3718        | 43.0  | 13416 | 0.3498          | 0.4729   |
| 0.371         | 44.0  | 13728 | 0.3729          | 0.4729   |
| 0.3723        | 45.0  | 14040 | 0.3592          | 0.5271   |
| 0.3723        | 46.0  | 14352 | 0.3502          | 0.4621   |
| 0.3688        | 47.0  | 14664 | 0.3516          | 0.4729   |
| 0.3688        | 48.0  | 14976 | 0.3505          | 0.4729   |
| 0.3641        | 49.0  | 15288 | 0.3526          | 0.4729   |
| 0.3645        | 50.0  | 15600 | 0.3488          | 0.4729   |
| 0.3645        | 51.0  | 15912 | 0.3482          | 0.4729   |
| 0.3636        | 52.0  | 16224 | 0.3557          | 0.4729   |
| 0.3621        | 53.0  | 16536 | 0.3484          | 0.4729   |
| 0.3621        | 54.0  | 16848 | 0.3509          | 0.5271   |
| 0.3581        | 55.0  | 17160 | 0.3519          | 0.4729   |
| 0.3581        | 56.0  | 17472 | 0.3479          | 0.5090   |
| 0.3573        | 57.0  | 17784 | 0.3480          | 0.4729   |
| 0.3553        | 58.0  | 18096 | 0.3489          | 0.4729   |
| 0.3553        | 59.0  | 18408 | 0.3479          | 0.4729   |
| 0.3545        | 60.0  | 18720 | 0.3487          | 0.4729   |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
