20230823015121

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0704
  • Accuracy: 0.4729
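
As a minimal usage sketch (not part of the original card): the checkpoint can be loaded with the standard Transformers auto classes. The repo id dkqjrm/20230823015121 is the one shown on the Hub; the SuperGLUE subset, input format, and label names are not documented here, so the sentence pair below is purely illustrative.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Repo id as listed on the Hub page; the SuperGLUE subset this model was
# fine-tuned on is undocumented, so the inputs below are placeholders.
model_id = "dkqjrm/20230823015121"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Many SuperGLUE tasks are sentence-pair classification; this pairing is illustrative.
inputs = tokenizer(
    "The cat sat on the mat.",
    "There is a cat on the mat.",
    return_tensors="pt",
)

with torch.no_grad():
    logits = model(**inputs).logits

print(logits.argmax(dim=-1).item())  # predicted class index
```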

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.002
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
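
These values map one-to-one onto Transformers TrainingArguments fields. A minimal sketch, assuming the standard Trainer API was used; the actual training script is not published with this card, and the output_dir is hypothetical.

```python
from transformers import TrainingArguments

# Reconstruction of the hyperparameters listed above; a sketch only,
# since the original training script is not part of the card.
training_args = TrainingArguments(
    output_dir="20230823015121",  # hypothetical output directory
    learning_rate=2e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
)
```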

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.0750          | 0.4657   |
| 0.1248        | 2.0   | 624   | 0.0716          | 0.5343   |
| 0.1248        | 3.0   | 936   | 0.1400          | 0.4729   |
| 0.0956        | 4.0   | 1248  | 0.1243          | 0.5271   |
| 0.0927        | 5.0   | 1560  | 0.0735          | 0.5668   |
| 0.0927        | 6.0   | 1872  | 0.1167          | 0.5271   |
| 0.0924        | 7.0   | 2184  | 0.0803          | 0.4729   |
| 0.0924        | 8.0   | 2496  | 0.0714          | 0.4982   |
| 0.0913        | 9.0   | 2808  | 0.0730          | 0.4729   |
| 0.0885        | 10.0  | 3120  | 0.0708          | 0.5343   |
| 0.0885        | 11.0  | 3432  | 0.0702          | 0.4910   |
| 0.0817        | 12.0  | 3744  | 0.0703          | 0.5307   |
| 0.0789        | 13.0  | 4056  | 0.0723          | 0.4729   |
| 0.0789        | 14.0  | 4368  | 0.0700          | 0.4874   |
| 0.0785        | 15.0  | 4680  | 0.0700          | 0.4765   |
| 0.0785        | 16.0  | 4992  | 0.0701          | 0.4801   |
| 0.0788        | 17.0  | 5304  | 0.0733          | 0.4549   |
| 0.0926        | 18.0  | 5616  | 0.0858          | 0.5307   |
| 0.0926        | 19.0  | 5928  | 0.0739          | 0.4982   |
| 0.0845        | 20.0  | 6240  | 0.0944          | 0.5235   |
| 0.0826        | 21.0  | 6552  | 0.0717          | 0.4621   |
| 0.0826        | 22.0  | 6864  | 0.0710          | 0.4729   |
| 0.0818        | 23.0  | 7176  | 0.0712          | 0.4910   |
| 0.0818        | 24.0  | 7488  | 0.0714          | 0.4838   |
| 0.0809        | 25.0  | 7800  | 0.0745          | 0.5126   |
| 0.0805        | 26.0  | 8112  | 0.0714          | 0.4729   |
| 0.0805        | 27.0  | 8424  | 0.0738          | 0.4729   |
| 0.0805        | 28.0  | 8736  | 0.0709          | 0.4765   |
| 0.0809        | 29.0  | 9048  | 0.0737          | 0.4729   |
| 0.0809        | 30.0  | 9360  | 0.0729          | 0.5596   |
| 0.0797        | 31.0  | 9672  | 0.0736          | 0.5596   |
| 0.0797        | 32.0  | 9984  | 0.0705          | 0.4693   |
| 0.08          | 33.0  | 10296 | 0.0711          | 0.4657   |
| 0.0798        | 34.0  | 10608 | 0.0731          | 0.5199   |
| 0.0798        | 35.0  | 10920 | 0.0744          | 0.4729   |
| 0.0795        | 36.0  | 11232 | 0.0721          | 0.4729   |
| 0.0796        | 37.0  | 11544 | 0.0708          | 0.4765   |
| 0.0796        | 38.0  | 11856 | 0.0714          | 0.4729   |
| 0.0792        | 39.0  | 12168 | 0.0707          | 0.4729   |
| 0.0792        | 40.0  | 12480 | 0.0705          | 0.4693   |
| 0.0785        | 41.0  | 12792 | 0.0706          | 0.4729   |
| 0.0782        | 42.0  | 13104 | 0.0708          | 0.4765   |
| 0.0782        | 43.0  | 13416 | 0.0709          | 0.4765   |
| 0.0779        | 44.0  | 13728 | 0.0705          | 0.4729   |
| 0.078         | 45.0  | 14040 | 0.0705          | 0.4729   |
| 0.078         | 46.0  | 14352 | 0.0704          | 0.4729   |
| 0.0776        | 47.0  | 14664 | 0.0708          | 0.4729   |
| 0.0776        | 48.0  | 14976 | 0.0704          | 0.4729   |
| 0.0778        | 49.0  | 15288 | 0.0704          | 0.4729   |
| 0.0778        | 50.0  | 15600 | 0.0709          | 0.4729   |
| 0.0778        | 51.0  | 15912 | 0.0709          | 0.4729   |
| 0.078         | 52.0  | 16224 | 0.0712          | 0.4729   |
| 0.0776        | 53.0  | 16536 | 0.0704          | 0.4729   |
| 0.0776        | 54.0  | 16848 | 0.0708          | 0.4729   |
| 0.0772        | 55.0  | 17160 | 0.0717          | 0.4729   |
| 0.0772        | 56.0  | 17472 | 0.0705          | 0.4729   |
| 0.0772        | 57.0  | 17784 | 0.0703          | 0.4729   |
| 0.0774        | 58.0  | 18096 | 0.0710          | 0.4729   |
| 0.0774        | 59.0  | 18408 | 0.0704          | 0.4729   |
| 0.0774        | 60.0  | 18720 | 0.0704          | 0.4729   |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
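
The environment can be approximated by pinning the versions above in a requirements.txt; a sketch, noting that the +cu118 PyTorch build is served from the PyTorch wheel index rather than PyPI, hence the extra index line.

```text
# Pinned to the framework versions listed above (sketch, not from the original card).
--extra-index-url https://download.pytorch.org/whl/cu118
transformers==4.26.1
torch==2.0.1+cu118
datasets==2.12.0
tokenizers==0.13.3
```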