
20230823073139

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0707
  • Accuracy: 0.4729

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
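
As a sanity check, these hyperparameters are consistent with the results table below: 312 optimizer steps per epoch at a train batch size of 8 imply roughly 2,496 training examples, and 60 epochs give the 18,720 total steps in the final row. A minimal sketch of that arithmetic, assuming a no-warmup linear schedule (the card does not state warmup settings):

```python
# Schedule arithmetic implied by this card's hyperparameters.
# Assumed values are taken from the card itself, not from the training script.

STEPS_PER_EPOCH = 312   # from the "Step" column: 312 steps at epoch 1.0
NUM_EPOCHS = 60
TRAIN_BATCH_SIZE = 8
BASE_LR = 1e-5

total_steps = STEPS_PER_EPOCH * NUM_EPOCHS                   # 18720, matching the last row
approx_train_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE   # ~2496 examples

def linear_lr(step: int, base_lr: float = BASE_LR, total: int = total_steps) -> float:
    """Linear decay from base_lr to 0, assuming no warmup."""
    return base_lr * max(0.0, 1.0 - step / total)

print(total_steps)            # 18720
print(approx_train_examples)  # 2496
print(linear_lr(9360))        # halfway through training: 5e-06
```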

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.0712          | 0.4729   |
| 0.0874        | 2.0   | 624   | 0.0707          | 0.4729   |
| 0.0874        | 3.0   | 936   | 0.0707          | 0.4729   |
| 0.0879        | 4.0   | 1248  | 0.0707          | 0.4729   |
| 0.0869        | 5.0   | 1560  | 0.0710          | 0.4729   |
| 0.0869        | 6.0   | 1872  | 0.0707          | 0.4729   |
| 0.0872        | 7.0   | 2184  | 0.0708          | 0.4729   |
| 0.0872        | 8.0   | 2496  | 0.0708          | 0.4729   |
| 0.0865        | 9.0   | 2808  | 0.0712          | 0.4729   |
| 0.0864        | 10.0  | 3120  | 0.0707          | 0.4729   |
| 0.0864        | 11.0  | 3432  | 0.0707          | 0.4729   |
| 0.0866        | 12.0  | 3744  | 0.0709          | 0.4729   |
| 0.0862        | 13.0  | 4056  | 0.0710          | 0.4729   |
| 0.0862        | 14.0  | 4368  | 0.0709          | 0.4729   |
| 0.0865        | 15.0  | 4680  | 0.0712          | 0.4729   |
| 0.0865        | 16.0  | 4992  | 0.0708          | 0.4729   |
| 0.0859        | 17.0  | 5304  | 0.0706          | 0.4729   |
| 0.0855        | 18.0  | 5616  | 0.0708          | 0.4729   |
| 0.0855        | 19.0  | 5928  | 0.0706          | 0.4729   |
| 0.086         | 20.0  | 6240  | 0.0706          | 0.4729   |
| 0.0852        | 21.0  | 6552  | 0.0707          | 0.4729   |
| 0.0852        | 22.0  | 6864  | 0.0708          | 0.4729   |
| 0.0865        | 23.0  | 7176  | 0.0706          | 0.4729   |
| 0.0865        | 24.0  | 7488  | 0.0706          | 0.4729   |
| 0.0862        | 25.0  | 7800  | 0.0706          | 0.4729   |
| 0.0849        | 26.0  | 8112  | 0.0706          | 0.4729   |
| 0.0849        | 27.0  | 8424  | 0.0711          | 0.4729   |
| 0.0847        | 28.0  | 8736  | 0.0706          | 0.4729   |
| 0.0846        | 29.0  | 9048  | 0.0708          | 0.4729   |
| 0.0846        | 30.0  | 9360  | 0.0705          | 0.4729   |
| 0.0847        | 31.0  | 9672  | 0.0708          | 0.4729   |
| 0.0847        | 32.0  | 9984  | 0.0706          | 0.4729   |
| 0.0854        | 33.0  | 10296 | 0.0706          | 0.4729   |
| 0.084         | 34.0  | 10608 | 0.0706          | 0.4729   |
| 0.084         | 35.0  | 10920 | 0.0709          | 0.4729   |
| 0.0845        | 36.0  | 11232 | 0.0707          | 0.4729   |
| 0.0842        | 37.0  | 11544 | 0.0706          | 0.4729   |
| 0.0842        | 38.0  | 11856 | 0.0706          | 0.4729   |
| 0.0847        | 39.0  | 12168 | 0.0706          | 0.4729   |
| 0.0847        | 40.0  | 12480 | 0.0706          | 0.4729   |
| 0.0839        | 41.0  | 12792 | 0.0705          | 0.4729   |
| 0.0848        | 42.0  | 13104 | 0.0706          | 0.4729   |
| 0.0848        | 43.0  | 13416 | 0.0706          | 0.4729   |
| 0.0841        | 44.0  | 13728 | 0.0706          | 0.4729   |
| 0.0845        | 45.0  | 14040 | 0.0709          | 0.4729   |
| 0.0845        | 46.0  | 14352 | 0.0706          | 0.4729   |
| 0.0842        | 47.0  | 14664 | 0.0707          | 0.4729   |
| 0.0842        | 48.0  | 14976 | 0.0707          | 0.4729   |
| 0.0842        | 49.0  | 15288 | 0.0707          | 0.4729   |
| 0.0837        | 50.0  | 15600 | 0.0706          | 0.4729   |
| 0.0837        | 51.0  | 15912 | 0.0706          | 0.4729   |
| 0.0845        | 52.0  | 16224 | 0.0707          | 0.4729   |
| 0.0844        | 53.0  | 16536 | 0.0707          | 0.4729   |
| 0.0844        | 54.0  | 16848 | 0.0706          | 0.4729   |
| 0.0846        | 55.0  | 17160 | 0.0706          | 0.4729   |
| 0.0846        | 56.0  | 17472 | 0.0706          | 0.4729   |
| 0.0836        | 57.0  | 17784 | 0.0706          | 0.4729   |
| 0.0847        | 58.0  | 18096 | 0.0707          | 0.4729   |
| 0.0847        | 59.0  | 18408 | 0.0707          | 0.4729   |
| 0.0849        | 60.0  | 18720 | 0.0707          | 0.4729   |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3