
20230822144236

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3486
  • Accuracy: 0.5235
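
The card does not yet include a usage snippet. Below is a minimal loading sketch, assuming the checkpoint carries a sequence-classification head; since the specific SuperGLUE subtask is not stated, the paired-sentence input is purely illustrative.

```python
# Minimal inference sketch. Assumptions: the checkpoint has a
# sequence-classification head and the task takes sentence pairs;
# the example strings below are illustrative only.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230822144236"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer("first sentence", "second sentence", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```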

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
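
These values map directly onto transformers.TrainingArguments. A minimal sketch of the corresponding configuration follows; output_dir is a hypothetical name and the per-epoch evaluation_strategy is inferred from the per-epoch rows in the results table below, while every other value mirrors the list above.

```python
# Sketch of a TrainingArguments configuration mirroring the listed
# hyperparameters. output_dir is hypothetical; evaluation_strategy is
# inferred from the per-epoch evaluation rows in the results table.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230822144236",
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",
)
```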

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 312 | 0.3705 | 0.4729 |
| 0.3743 | 2.0 | 624 | 0.3484 | 0.5162 |
| 0.3743 | 3.0 | 936 | 0.3504 | 0.5162 |
| 0.3726 | 4.0 | 1248 | 0.3527 | 0.5235 |
| 0.3712 | 5.0 | 1560 | 0.3552 | 0.4729 |
| 0.3712 | 6.0 | 1872 | 0.3480 | 0.5199 |
| 0.3669 | 7.0 | 2184 | 0.3501 | 0.4729 |
| 0.3669 | 8.0 | 2496 | 0.3503 | 0.4368 |
| 0.3658 | 9.0 | 2808 | 0.3503 | 0.5343 |
| 0.3656 | 10.0 | 3120 | 0.3483 | 0.5199 |
| 0.3656 | 11.0 | 3432 | 0.3510 | 0.4729 |
| 0.3634 | 12.0 | 3744 | 0.3557 | 0.4729 |
| 0.3613 | 13.0 | 4056 | 0.3537 | 0.4729 |
| 0.3613 | 14.0 | 4368 | 0.3505 | 0.5199 |
| 0.3609 | 15.0 | 4680 | 0.3493 | 0.5199 |
| 0.3609 | 16.0 | 4992 | 0.3488 | 0.5307 |
| 0.3591 | 17.0 | 5304 | 0.3568 | 0.5235 |
| 0.3574 | 18.0 | 5616 | 0.3486 | 0.5235 |
| 0.3574 | 19.0 | 5928 | 0.3552 | 0.4729 |
| 0.3599 | 20.0 | 6240 | 0.3553 | 0.5271 |
| 0.3556 | 21.0 | 6552 | 0.3502 | 0.5307 |
| 0.3556 | 22.0 | 6864 | 0.3525 | 0.5271 |
| 0.3573 | 23.0 | 7176 | 0.3553 | 0.5199 |
| 0.3573 | 24.0 | 7488 | 0.3492 | 0.5162 |
| 0.3574 | 25.0 | 7800 | 0.3492 | 0.5235 |
| 0.3559 | 26.0 | 8112 | 0.3531 | 0.4729 |
| 0.3559 | 27.0 | 8424 | 0.3602 | 0.4729 |
| 0.3544 | 28.0 | 8736 | 0.3501 | 0.5379 |
| 0.3539 | 29.0 | 9048 | 0.3490 | 0.5018 |
| 0.3539 | 30.0 | 9360 | 0.3491 | 0.5090 |
| 0.3529 | 31.0 | 9672 | 0.3518 | 0.5271 |
| 0.3529 | 32.0 | 9984 | 0.3489 | 0.5199 |
| 0.3531 | 33.0 | 10296 | 0.3484 | 0.5307 |
| 0.3527 | 34.0 | 10608 | 0.3487 | 0.5271 |
| 0.3527 | 35.0 | 10920 | 0.3491 | 0.5307 |
| 0.3521 | 36.0 | 11232 | 0.3498 | 0.5343 |
| 0.3513 | 37.0 | 11544 | 0.3500 | 0.5235 |
| 0.3513 | 38.0 | 11856 | 0.3487 | 0.5235 |
| 0.3526 | 39.0 | 12168 | 0.3494 | 0.5415 |
| 0.3526 | 40.0 | 12480 | 0.3495 | 0.5451 |
| 0.3520 | 41.0 | 12792 | 0.3489 | 0.5343 |
| 0.3530 | 42.0 | 13104 | 0.3530 | 0.4729 |
| 0.3530 | 43.0 | 13416 | 0.3492 | 0.5271 |
| 0.3509 | 44.0 | 13728 | 0.3501 | 0.4693 |
| 0.3523 | 45.0 | 14040 | 0.3525 | 0.4729 |
| 0.3523 | 46.0 | 14352 | 0.3491 | 0.5054 |
| 0.3506 | 47.0 | 14664 | 0.3515 | 0.4729 |
| 0.3506 | 48.0 | 14976 | 0.3494 | 0.5379 |
| 0.3518 | 49.0 | 15288 | 0.3483 | 0.5235 |
| 0.3507 | 50.0 | 15600 | 0.3490 | 0.5271 |
| 0.3507 | 51.0 | 15912 | 0.3489 | 0.5379 |
| 0.3514 | 52.0 | 16224 | 0.3490 | 0.5090 |
| 0.3509 | 53.0 | 16536 | 0.3484 | 0.5235 |
| 0.3509 | 54.0 | 16848 | 0.3486 | 0.5199 |
| 0.3499 | 55.0 | 17160 | 0.3485 | 0.5199 |
| 0.3499 | 56.0 | 17472 | 0.3486 | 0.5199 |
| 0.3504 | 57.0 | 17784 | 0.3493 | 0.5415 |
| 0.3495 | 58.0 | 18096 | 0.3486 | 0.5307 |
| 0.3495 | 59.0 | 18408 | 0.3485 | 0.5271 |
| 0.3505 | 60.0 | 18720 | 0.3486 | 0.5235 |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
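
To reproduce this environment, the versions above can be pinned directly. A minimal requirements sketch follows; note that the torch +cu118 build is distributed via PyTorch's CUDA 11.8 wheel index rather than PyPI, which is an assumption about how it was installed here.

```
# requirements pin matching the listed versions; the torch +cu118 build
# is published on PyTorch's CUDA 11.8 wheel index, not on PyPI.
transformers==4.26.1
torch==2.0.1+cu118
datasets==2.12.0
tokenizers==0.13.3
```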