20230825183835

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. After the final training epoch (epoch 80, step 12480), it achieves the following results on the evaluation set:

  • Loss: 0.3648
  • Accuracy: 0.7473

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
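To make the `linear` scheduler setting concrete, here is a minimal sketch of how the learning rate would decay over the run. It assumes zero warmup steps (the card lists no warmup setting) and takes the total step count, 12480, from the last row of the training log. The function name `linear_lr` is hypothetical, and this is plain Python rather than the training library's own scheduler.

```python
BASE_LR = 0.005       # learning_rate from the card
TOTAL_STEPS = 12480   # final step in the training log (80 epochs x 156 steps)
WARMUP_STEPS = 0      # assumption: the card does not list a warmup setting


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 3120, 6240, 9360, 12480):
        print(f"step {step:>5}: lr = {linear_lr(step):.6f}")
```

Under these assumptions the rate starts at 0.005, is halved by the midpoint of training (step 6240), and reaches zero at the final step.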

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| No log | 1.0 | 156 | 0.8052 | 0.5307 |
| No log | 2.0 | 312 | 0.6957 | 0.4801 |
| No log | 3.0 | 468 | 0.9722 | 0.4801 |
| 0.8916 | 4.0 | 624 | 0.7219 | 0.5560 |
| 0.8916 | 5.0 | 780 | 0.5572 | 0.5921 |
| 0.8916 | 6.0 | 936 | 0.4803 | 0.6534 |
| 0.8141 | 7.0 | 1092 | 0.6885 | 0.6318 |
| 0.8141 | 8.0 | 1248 | 0.4588 | 0.6895 |
| 0.8141 | 9.0 | 1404 | 1.0159 | 0.4729 |
| 0.7176 | 10.0 | 1560 | 0.4835 | 0.6823 |
| 0.7176 | 11.0 | 1716 | 0.5513 | 0.6823 |
| 0.7176 | 12.0 | 1872 | 0.4150 | 0.7184 |
| 0.6445 | 13.0 | 2028 | 0.4789 | 0.7148 |
| 0.6445 | 14.0 | 2184 | 0.4414 | 0.7220 |
| 0.6445 | 15.0 | 2340 | 0.3778 | 0.6968 |
| 0.6445 | 16.0 | 2496 | 0.5422 | 0.6823 |
| 0.6267 | 17.0 | 2652 | 0.3654 | 0.7220 |
| 0.6267 | 18.0 | 2808 | 0.7434 | 0.6390 |
| 0.6267 | 19.0 | 2964 | 0.3713 | 0.7112 |
| 0.5715 | 20.0 | 3120 | 0.3942 | 0.6931 |
| 0.5715 | 21.0 | 3276 | 0.3785 | 0.7112 |
| 0.5715 | 22.0 | 3432 | 0.5429 | 0.6570 |
| 0.5015 | 23.0 | 3588 | 0.3600 | 0.7365 |
| 0.5015 | 24.0 | 3744 | 0.4567 | 0.7473 |
| 0.5015 | 25.0 | 3900 | 0.3680 | 0.7148 |
| 0.4739 | 26.0 | 4056 | 0.3348 | 0.7292 |
| 0.4739 | 27.0 | 4212 | 0.4191 | 0.7437 |
| 0.4739 | 28.0 | 4368 | 0.4034 | 0.7401 |
| 0.4139 | 29.0 | 4524 | 0.3887 | 0.7112 |
| 0.4139 | 30.0 | 4680 | 0.4222 | 0.7004 |
| 0.4139 | 31.0 | 4836 | 0.3804 | 0.7220 |
| 0.4139 | 32.0 | 4992 | 0.3842 | 0.7256 |
| 0.3958 | 33.0 | 5148 | 0.3851 | 0.7365 |
| 0.3958 | 34.0 | 5304 | 0.4758 | 0.7040 |
| 0.3958 | 35.0 | 5460 | 0.3569 | 0.7473 |
| 0.3561 | 36.0 | 5616 | 0.3971 | 0.7256 |
| 0.3561 | 37.0 | 5772 | 0.4006 | 0.7545 |
| 0.3561 | 38.0 | 5928 | 0.5292 | 0.7220 |
| 0.3349 | 39.0 | 6084 | 0.4014 | 0.7329 |
| 0.3349 | 40.0 | 6240 | 0.3285 | 0.7473 |
| 0.3349 | 41.0 | 6396 | 0.3665 | 0.7581 |
| 0.2946 | 42.0 | 6552 | 0.3843 | 0.7690 |
| 0.2946 | 43.0 | 6708 | 0.3634 | 0.7509 |
| 0.2946 | 44.0 | 6864 | 0.3518 | 0.7437 |
| 0.2813 | 45.0 | 7020 | 0.4009 | 0.7473 |
| 0.2813 | 46.0 | 7176 | 0.4073 | 0.7653 |
| 0.2813 | 47.0 | 7332 | 0.3974 | 0.7473 |
| 0.2813 | 48.0 | 7488 | 0.4134 | 0.7437 |
| 0.2601 | 49.0 | 7644 | 0.3661 | 0.7437 |
| 0.2601 | 50.0 | 7800 | 0.3733 | 0.7437 |
| 0.2601 | 51.0 | 7956 | 0.3425 | 0.7509 |
| 0.242 | 52.0 | 8112 | 0.4186 | 0.7473 |
| 0.242 | 53.0 | 8268 | 0.4262 | 0.7401 |
| 0.242 | 54.0 | 8424 | 0.3627 | 0.7437 |
| 0.2356 | 55.0 | 8580 | 0.3966 | 0.7473 |
| 0.2356 | 56.0 | 8736 | 0.3819 | 0.7509 |
| 0.2356 | 57.0 | 8892 | 0.4087 | 0.7473 |
| 0.2198 | 58.0 | 9048 | 0.3691 | 0.7365 |
| 0.2198 | 59.0 | 9204 | 0.4938 | 0.7437 |
| 0.2198 | 60.0 | 9360 | 0.4097 | 0.7581 |
| 0.1995 | 61.0 | 9516 | 0.3870 | 0.7509 |
| 0.1995 | 62.0 | 9672 | 0.4417 | 0.7473 |
| 0.1995 | 63.0 | 9828 | 0.3596 | 0.7509 |
| 0.1995 | 64.0 | 9984 | 0.3483 | 0.7473 |
| 0.1933 | 65.0 | 10140 | 0.4424 | 0.7545 |
| 0.1933 | 66.0 | 10296 | 0.3443 | 0.7437 |
| 0.1933 | 67.0 | 10452 | 0.3820 | 0.7437 |
| 0.1898 | 68.0 | 10608 | 0.3889 | 0.7473 |
| 0.1898 | 69.0 | 10764 | 0.3841 | 0.7437 |
| 0.1898 | 70.0 | 10920 | 0.4081 | 0.7581 |
| 0.1813 | 71.0 | 11076 | 0.3680 | 0.7473 |
| 0.1813 | 72.0 | 11232 | 0.3775 | 0.7473 |
| 0.1813 | 73.0 | 11388 | 0.3713 | 0.7473 |
| 0.1688 | 74.0 | 11544 | 0.3765 | 0.7473 |
| 0.1688 | 75.0 | 11700 | 0.3580 | 0.7509 |
| 0.1688 | 76.0 | 11856 | 0.3485 | 0.7437 |
| 0.1663 | 77.0 | 12012 | 0.3601 | 0.7509 |
| 0.1663 | 78.0 | 12168 | 0.3721 | 0.7509 |
| 0.1663 | 79.0 | 12324 | 0.3633 | 0.7473 |
| 0.1663 | 80.0 | 12480 | 0.3648 | 0.7473 |
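The log is easier to interpret with a few derived quantities. The sketch below uses a handful of rows copied from the table to find the checkpoint with the highest accuracy, the one with the lowest validation loss, and the number of optimizer steps per epoch. The helper names are hypothetical. The bound on training-set size assumes a single device and no gradient accumulation, neither of which the card states.

```python
# (epoch, step, validation_loss, accuracy) rows copied from the training log.
# Only a subset is listed to keep the sketch short.
LOG_ROWS = [
    (1, 156, 0.8052, 0.5307),
    (26, 4056, 0.3348, 0.7292),
    (40, 6240, 0.3285, 0.7473),
    (42, 6552, 0.3843, 0.7690),
    (46, 7176, 0.4073, 0.7653),
    (80, 12480, 0.3648, 0.7473),
]

TRAIN_BATCH_SIZE = 16  # train_batch_size from the card


def best_by_accuracy(rows):
    """Row with the highest evaluation accuracy."""
    return max(rows, key=lambda row: row[3])


def best_by_loss(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def steps_per_epoch(rows):
    """Optimizer steps per epoch, derived from the first logged row."""
    epoch, step, _, _ = rows[0]
    return step // epoch


# Upper bound on training examples per epoch: the last batch may be partial.
# Assumes one device and no gradient accumulation (not stated in the card).
max_train_examples = steps_per_epoch(LOG_ROWS) * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print("best accuracy:", best_by_accuracy(LOG_ROWS))
    print("lowest validation loss:", best_by_loss(LOG_ROWS))
    print("steps per epoch:", steps_per_epoch(LOG_ROWS))
    print("training examples per epoch (at most):", max_train_examples)
```

Run against the full table, the same selection picks epoch 42 for accuracy (0.7690) and epoch 40 for validation loss (0.3285), so the final-epoch checkpoint reported at the top of the card is not the best one by either measure.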

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
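
When trying to reproduce the run, it can help to compare the local environment against the versions listed above. This is a minimal sketch using only the standard library. The distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumed to be the usual PyPI package names, and `compare_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as listed in the card, keyed by assumed PyPI distribution name.
CARD_VERSIONS = {
    "transformers": "4.26.1",
    "torch": "2.0.1+cu118",
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}


def compare_versions(expected=CARD_VERSIONS):
    """Map each package to (card_version, installed_version_or_None, matches)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, installed, matches) in compare_versions().items():
        status = "ok" if matches else "differs"
        print(f"{package}: card={wanted} installed={installed} [{status}]")
```

A mismatch does not necessarily prevent loading the model, but matching versions removes one variable when results differ.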

Dataset used to train dkqjrm/20230825183835