20230823053830

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a usage sketch is given below the list):

  • Loss: 0.0703
  • Accuracy: 0.4729
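
This card does not state which SuperGLUE task the model was fine-tuned on, so the following is only a minimal inference sketch. It assumes the checkpoint at dkqjrm/20230823053830 exposes a standard sequence-classification head; the two-sentence input shown is a placeholder and should be adapted to the actual task's format.

```python
# Minimal inference sketch (assumptions: sequence-classification head,
# sentence-pair input; the exact SuperGLUE task is not documented here).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "dkqjrm/20230823053830"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Many SuperGLUE tasks are sentence-pair problems, so a paired input
# is shown; replace with the task's real input format.
inputs = tokenizer("The cat sat on the mat.", "A cat is on a mat.",
                   return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))
```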

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
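
For reproducibility, here is a minimal sketch of Hugging Face TrainingArguments matching the values above. The output_dir is a placeholder, and the actual training script, dataset preprocessing, and Trainer wiring are not documented on this card.

```python
# Sketch of TrainingArguments mirroring the listed hyperparameters
# (transformers 4.26.x API). output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230823053830",   # placeholder path
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,                # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",   # matches the per-epoch results table below
)
```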

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.0743          | 0.4729   |
| 0.0882        | 2.0   | 624   | 0.0731          | 0.4729   |
| 0.0882        | 3.0   | 936   | 0.0718          | 0.4729   |
| 0.0871        | 4.0   | 1248  | 0.0712          | 0.4838   |
| 0.0857        | 5.0   | 1560  | 0.0709          | 0.4765   |
| 0.0857        | 6.0   | 1872  | 0.0718          | 0.4729   |
| 0.084         | 7.0   | 2184  | 0.0709          | 0.4765   |
| 0.084         | 8.0   | 2496  | 0.0705          | 0.4729   |
| 0.0831        | 9.0   | 2808  | 0.0710          | 0.4729   |
| 0.0826        | 10.0  | 3120  | 0.0705          | 0.4729   |
| 0.0826        | 11.0  | 3432  | 0.0726          | 0.4729   |
| 0.0823        | 12.0  | 3744  | 0.0722          | 0.4729   |
| 0.0814        | 13.0  | 4056  | 0.0710          | 0.4729   |
| 0.0814        | 14.0  | 4368  | 0.0710          | 0.4585   |
| 0.0807        | 15.0  | 4680  | 0.0706          | 0.4729   |
| 0.0807        | 16.0  | 4992  | 0.0709          | 0.4729   |
| 0.0803        | 17.0  | 5304  | 0.0709          | 0.4693   |
| 0.0798        | 18.0  | 5616  | 0.0711          | 0.5307   |
| 0.0798        | 19.0  | 5928  | 0.0708          | 0.4729   |
| 0.0798        | 20.0  | 6240  | 0.0710          | 0.4801   |
| 0.0792        | 21.0  | 6552  | 0.0710          | 0.5307   |
| 0.0792        | 22.0  | 6864  | 0.0728          | 0.5379   |
| 0.0797        | 23.0  | 7176  | 0.0707          | 0.4657   |
| 0.0797        | 24.0  | 7488  | 0.0711          | 0.4729   |
| 0.0793        | 25.0  | 7800  | 0.0706          | 0.4729   |
| 0.0783        | 26.0  | 8112  | 0.0704          | 0.4729   |
| 0.0783        | 27.0  | 8424  | 0.0706          | 0.4729   |
| 0.0783        | 28.0  | 8736  | 0.0709          | 0.4729   |
| 0.0782        | 29.0  | 9048  | 0.0703          | 0.4729   |
| 0.0782        | 30.0  | 9360  | 0.0705          | 0.4765   |
| 0.0782        | 31.0  | 9672  | 0.0709          | 0.5054   |
| 0.0782        | 32.0  | 9984  | 0.0705          | 0.4729   |
| 0.0786        | 33.0  | 10296 | 0.0704          | 0.4729   |
| 0.0779        | 34.0  | 10608 | 0.0705          | 0.4729   |
| 0.0779        | 35.0  | 10920 | 0.0715          | 0.4729   |
| 0.0779        | 36.0  | 11232 | 0.0707          | 0.4765   |
| 0.0779        | 37.0  | 11544 | 0.0703          | 0.4729   |
| 0.0779        | 38.0  | 11856 | 0.0704          | 0.4765   |
| 0.0778        | 39.0  | 12168 | 0.0704          | 0.4729   |
| 0.0778        | 40.0  | 12480 | 0.0704          | 0.4693   |
| 0.0776        | 41.0  | 12792 | 0.0704          | 0.4729   |
| 0.0777        | 42.0  | 13104 | 0.0703          | 0.4729   |
| 0.0777        | 43.0  | 13416 | 0.0707          | 0.4585   |
| 0.0775        | 44.0  | 13728 | 0.0703          | 0.4729   |
| 0.0777        | 45.0  | 14040 | 0.0705          | 0.4729   |
| 0.0777        | 46.0  | 14352 | 0.0704          | 0.4729   |
| 0.0772        | 47.0  | 14664 | 0.0730          | 0.4729   |
| 0.0772        | 48.0  | 14976 | 0.0703          | 0.4729   |
| 0.0774        | 49.0  | 15288 | 0.0706          | 0.4549   |
| 0.0774        | 50.0  | 15600 | 0.0704          | 0.4729   |
| 0.0774        | 51.0  | 15912 | 0.0706          | 0.4729   |
| 0.0778        | 52.0  | 16224 | 0.0705          | 0.4729   |
| 0.0775        | 53.0  | 16536 | 0.0704          | 0.4729   |
| 0.0775        | 54.0  | 16848 | 0.0704          | 0.4765   |
| 0.0772        | 55.0  | 17160 | 0.0704          | 0.4729   |
| 0.0772        | 56.0  | 17472 | 0.0703          | 0.4729   |
| 0.077         | 57.0  | 17784 | 0.0703          | 0.4729   |
| 0.0774        | 58.0  | 18096 | 0.0706          | 0.4729   |
| 0.0774        | 59.0  | 18408 | 0.0704          | 0.4729   |
| 0.0776        | 60.0  | 18720 | 0.0703          | 0.4729   |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
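
To verify a local environment against the versions listed above, a minimal check, assuming the standard `__version__` attributes on each package:

```python
# Quick environment check against the framework versions listed above.
import transformers, torch, datasets, tokenizers

expected = {
    "transformers": "4.26.1",
    "torch": "2.0.1+cu118",
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}
for name, module in [("transformers", transformers), ("torch", torch),
                     ("datasets", datasets), ("tokenizers", tokenizers)]:
    status = "OK" if module.__version__ == expected[name] else "MISMATCH"
    print(f"{name}: found {module.__version__}, expected {expected[name]} [{status}]")
```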