
20230822144501

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3480
  • Accuracy: 0.5271
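
The card does not state which SuperGLUE sub-task this checkpoint was trained on, nor its label mapping. As a minimal, hypothetical usage sketch (assuming a sentence-pair classification head; the example sentences and the softmax readout are invented here, not from the card):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumption: the checkpoint is a sequence-classification model on sentence pairs.
model_id = "dkqjrm/20230822144501"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Hypothetical premise/hypothesis pair for illustration only.
inputs = tokenizer("The cat sat on the mat.", "A cat is sitting.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # class probabilities; label meaning is not documented
```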

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
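
The card leaves this section blank, but the training log below shows 312 optimizer steps per epoch at batch size 8 (≈2,496 training examples), which is consistent with the size of SuperGLUE's RTE training split. The sketch below therefore loads super_glue/rte as an assumption, not a confirmed fact:

```python
from datasets import load_dataset

# Assumption: "rte" is inferred from the step count, not stated on the card.
dataset = load_dataset("super_glue", "rte")
print(dataset)  # train / validation / test splits with premise, hypothesis, label
```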

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
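
A minimal sketch of the corresponding TrainingArguments, assuming Transformers 4.26.1 as listed under Framework versions; output_dir and the per-epoch evaluation strategy (inferred from the per-epoch rows in the results table) are assumptions, and the model/data wiring is omitted:

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters reported above.
training_args = TrainingArguments(
    output_dir="20230822144501",     # assumption, not from the card
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",     # inferred from per-epoch validation rows
)
```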

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.3687          | 0.4729   |
| 0.3745        | 2.0   | 624   | 0.3523          | 0.5379   |
| 0.3745        | 3.0   | 936   | 0.3598          | 0.4729   |
| 0.3798        | 4.0   | 1248  | 0.3479          | 0.5271   |
| 0.3752        | 5.0   | 1560  | 0.3593          | 0.4729   |
| 0.3752        | 6.0   | 1872  | 0.3505          | 0.5271   |
| 0.373         | 7.0   | 2184  | 0.3480          | 0.5271   |
| 0.373         | 8.0   | 2496  | 0.3593          | 0.4729   |
| 0.3724        | 9.0   | 2808  | 0.3490          | 0.5271   |
| 0.3669        | 10.0  | 3120  | 0.3489          | 0.5271   |
| 0.3669        | 11.0  | 3432  | 0.3487          | 0.5271   |
| 0.3681        | 12.0  | 3744  | 0.3588          | 0.4729   |
| 0.3636        | 13.0  | 4056  | 0.3519          | 0.5271   |
| 0.3636        | 14.0  | 4368  | 0.3511          | 0.5271   |
| 0.3629        | 15.0  | 4680  | 0.3510          | 0.5271   |
| 0.3629        | 16.0  | 4992  | 0.3478          | 0.5271   |
| 0.3591        | 17.0  | 5304  | 0.3502          | 0.5271   |
| 0.3564        | 18.0  | 5616  | 0.3481          | 0.5271   |
| 0.3564        | 19.0  | 5928  | 0.3511          | 0.5271   |
| 0.3573        | 20.0  | 6240  | 0.3512          | 0.5271   |
| 0.3574        | 21.0  | 6552  | 0.3481          | 0.5271   |
| 0.3574        | 22.0  | 6864  | 0.3488          | 0.5271   |
| 0.3566        | 23.0  | 7176  | 0.3516          | 0.5271   |
| 0.3566        | 24.0  | 7488  | 0.3483          | 0.5271   |
| 0.3571        | 25.0  | 7800  | 0.3478          | 0.5271   |
| 0.3562        | 26.0  | 8112  | 0.3478          | 0.5271   |
| 0.3562        | 27.0  | 8424  | 0.3534          | 0.5271   |
| 0.356         | 28.0  | 8736  | 0.3482          | 0.5271   |
| 0.3564        | 29.0  | 9048  | 0.3479          | 0.5271   |
| 0.3564        | 30.0  | 9360  | 0.3506          | 0.5271   |
| 0.3566        | 31.0  | 9672  | 0.3481          | 0.5271   |
| 0.3566        | 32.0  | 9984  | 0.3480          | 0.5271   |
| 0.3552        | 33.0  | 10296 | 0.3479          | 0.5271   |
| 0.3558        | 34.0  | 10608 | 0.3483          | 0.5271   |
| 0.3558        | 35.0  | 10920 | 0.3482          | 0.5271   |
| 0.3553        | 36.0  | 11232 | 0.3494          | 0.5271   |
| 0.3546        | 37.0  | 11544 | 0.3478          | 0.5271   |
| 0.3546        | 38.0  | 11856 | 0.3491          | 0.5271   |
| 0.3558        | 39.0  | 12168 | 0.3479          | 0.5271   |
| 0.3558        | 40.0  | 12480 | 0.3486          | 0.5271   |
| 0.3558        | 41.0  | 12792 | 0.3480          | 0.5271   |
| 0.3551        | 42.0  | 13104 | 0.3495          | 0.5271   |
| 0.3551        | 43.0  | 13416 | 0.3479          | 0.5271   |
| 0.3563        | 44.0  | 13728 | 0.3480          | 0.5271   |
| 0.3549        | 45.0  | 14040 | 0.3503          | 0.5271   |
| 0.3549        | 46.0  | 14352 | 0.3490          | 0.5271   |
| 0.355         | 47.0  | 14664 | 0.3493          | 0.5271   |
| 0.355         | 48.0  | 14976 | 0.3479          | 0.5271   |
| 0.3551        | 49.0  | 15288 | 0.3484          | 0.5271   |
| 0.3558        | 50.0  | 15600 | 0.3479          | 0.5271   |
| 0.3558        | 51.0  | 15912 | 0.3480          | 0.5271   |
| 0.3542        | 52.0  | 16224 | 0.3488          | 0.5271   |
| 0.3553        | 53.0  | 16536 | 0.3483          | 0.5271   |
| 0.3553        | 54.0  | 16848 | 0.3485          | 0.5271   |
| 0.3544        | 55.0  | 17160 | 0.3481          | 0.5271   |
| 0.3544        | 56.0  | 17472 | 0.3480          | 0.5271   |
| 0.3549        | 57.0  | 17784 | 0.3483          | 0.5271   |
| 0.3544        | 58.0  | 18096 | 0.3481          | 0.5271   |
| 0.3544        | 59.0  | 18408 | 0.3481          | 0.5271   |
| 0.3537        | 60.0  | 18720 | 0.3480          | 0.5271   |

Framework versions

  • Transformers 4.26.1
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3