20230822105333

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a usage sketch follows these results):

  • Loss: 0.3480
  • Accuracy: 0.5271
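As a quick way to try the checkpoint, here is a minimal inference sketch. It assumes the repository exposes a standard sequence-classification head loadable via AutoModelForSequenceClassification; the card does not state which SuperGLUE task the model was fine-tuned on, so the sentence-pair input below is illustrative only.

```python
# Minimal inference sketch (assumption: the checkpoint carries a standard
# sequence-classification head; the example input is illustrative only).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230822105333"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# SuperGLUE tasks are typically sentence-pair classification problems.
inputs = tokenizer(
    "The cat sat on the mat.",
    "There is a cat on the mat.",
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```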

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
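For reference, a hedged sketch of loading the data with the datasets library. The card does not name the SuperGLUE task; the step counts below (312 optimizer steps per epoch at batch size 8, roughly 2,490 training examples) are consistent with RTE, but "rte" here is an assumption.

```python
# Loading SuperGLUE with the datasets library (assumption: the "rte" task;
# the card does not say which SuperGLUE task was actually used).
from datasets import load_dataset

dataset = load_dataset("super_glue", "rte")
print(dataset)                   # train/validation/test splits
print(dataset["validation"][0])  # one example with its label
```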

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 0.01
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
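As a rough guide to reproducing this configuration, here is how the values above map onto transformers TrainingArguments; output_dir and the per-epoch evaluation strategy are assumptions, not taken from this card.

```python
# Sketch of the listed hyperparameters as transformers TrainingArguments.
# output_dir and evaluation_strategy are assumptions (not stated in the card).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230822105333",    # assumed output directory
    learning_rate=0.01,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,                 # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",    # assumed from the per-epoch results below
)
```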

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 2.0240          | 0.5271   |
| 1.1081        | 2.0   | 624   | 0.8435          | 0.5271   |
| 1.1081        | 3.0   | 936   | 0.4636          | 0.4729   |
| 1.109         | 4.0   | 1248  | 0.3964          | 0.4729   |
| 0.9629        | 5.0   | 1560  | 0.3803          | 0.5271   |
| 0.9629        | 6.0   | 1872  | 0.3630          | 0.5271   |
| 0.8211        | 7.0   | 2184  | 0.5683          | 0.5271   |
| 0.8211        | 8.0   | 2496  | 0.3645          | 0.4729   |
| 0.8143        | 9.0   | 2808  | 0.4972          | 0.5271   |
| 0.8375        | 10.0  | 3120  | 0.4557          | 0.4729   |
| 0.8375        | 11.0  | 3432  | 0.4497          | 0.5271   |
| 0.7522        | 12.0  | 3744  | 0.4278          | 0.4729   |
| 0.7584        | 13.0  | 4056  | 0.5233          | 0.5271   |
| 0.7584        | 14.0  | 4368  | 0.4097          | 0.5271   |
| 0.6684        | 15.0  | 4680  | 0.4749          | 0.4729   |
| 0.6684        | 16.0  | 4992  | 0.7626          | 0.5271   |
| 0.6637        | 17.0  | 5304  | 0.6379          | 0.5271   |
| 0.5907        | 18.0  | 5616  | 0.3496          | 0.5271   |
| 0.5907        | 19.0  | 5928  | 0.4018          | 0.5271   |
| 0.5618        | 20.0  | 6240  | 0.3606          | 0.5271   |
| 0.5539        | 21.0  | 6552  | 0.3596          | 0.4729   |
| 0.5539        | 22.0  | 6864  | 0.4662          | 0.5271   |
| 0.537         | 23.0  | 7176  | 0.3488          | 0.5271   |
| 0.537         | 24.0  | 7488  | 0.8345          | 0.4729   |
| 0.5337        | 25.0  | 7800  | 0.3486          | 0.5271   |
| 0.5058        | 26.0  | 8112  | 0.3496          | 0.5271   |
| 0.5058        | 27.0  | 8424  | 0.5283          | 0.4729   |
| 0.5239        | 28.0  | 8736  | 0.3566          | 0.5271   |
| 0.4835        | 29.0  | 9048  | 0.3810          | 0.4729   |
| 0.4835        | 30.0  | 9360  | 0.4577          | 0.5271   |
| 0.4672        | 31.0  | 9672  | 0.4612          | 0.4729   |
| 0.4672        | 32.0  | 9984  | 0.4667          | 0.5271   |
| 0.4699        | 33.0  | 10296 | 0.3585          | 0.5271   |
| 0.4637        | 34.0  | 10608 | 0.3518          | 0.5271   |
| 0.4637        | 35.0  | 10920 | 0.4995          | 0.4729   |
| 0.4539        | 36.0  | 11232 | 0.3777          | 0.4729   |
| 0.4465        | 37.0  | 11544 | 0.3492          | 0.5271   |
| 0.4465        | 38.0  | 11856 | 0.3486          | 0.5271   |
| 0.4446        | 39.0  | 12168 | 0.3482          | 0.5271   |
| 0.4446        | 40.0  | 12480 | 0.3776          | 0.4729   |
| 0.437         | 41.0  | 12792 | 0.3485          | 0.5271   |
| 0.4309        | 42.0  | 13104 | 0.3481          | 0.5271   |
| 0.4309        | 43.0  | 13416 | 0.3657          | 0.5271   |
| 0.424         | 44.0  | 13728 | 0.3484          | 0.5271   |
| 0.4165        | 45.0  | 14040 | 0.3492          | 0.5271   |
| 0.4165        | 46.0  | 14352 | 0.3706          | 0.4729   |
| 0.4206        | 47.0  | 14664 | 0.3490          | 0.5271   |
| 0.4206        | 48.0  | 14976 | 0.3510          | 0.5271   |
| 0.4202        | 49.0  | 15288 | 0.3478          | 0.5271   |
| 0.4038        | 50.0  | 15600 | 0.3621          | 0.5271   |
| 0.4038        | 51.0  | 15912 | 0.3480          | 0.5271   |
| 0.3916        | 52.0  | 16224 | 0.4587          | 0.4729   |
| 0.3901        | 53.0  | 16536 | 0.3506          | 0.5271   |
| 0.3901        | 54.0  | 16848 | 0.3545          | 0.5271   |
| 0.3805        | 55.0  | 17160 | 0.3540          | 0.4729   |
| 0.3805        | 56.0  | 17472 | 0.3626          | 0.5271   |
| 0.3781        | 57.0  | 17784 | 0.3504          | 0.5271   |
| 0.3688        | 58.0  | 18096 | 0.3478          | 0.5271   |
| 0.3688        | 59.0  | 18408 | 0.3527          | 0.5271   |
| 0.3657        | 60.0  | 18720 | 0.3480          | 0.5271   |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
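To reproduce the training environment, the installed versions can be checked against the list above; a small sanity-check sketch:

```python
# Sanity-check that installed versions match the ones listed above.
import datasets
import tokenizers
import torch
import transformers

print(transformers.__version__)  # expected: 4.26.1
print(torch.__version__)         # expected: 2.0.1+cu118
print(datasets.__version__)      # expected: 2.12.0
print(tokenizers.__version__)    # expected: 0.13.3
```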
