
20230825071702

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2804
  • Accuracy: 0.7617

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent `TrainingArguments` sketch follows the list):

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
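
These values map one-to-one onto `transformers.TrainingArguments` fields. Below is a minimal sketch of that mapping, assuming the standard `Trainer` API from Transformers 4.26; the `output_dir` and the per-epoch evaluation strategy are assumptions (the latter inferred from the once-per-epoch rows in the results table below), not details stated in the card.

```python
# Sketch only: reproduces the listed hyperparameters as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230825071702",     # assumed output path, not stated in the card
    learning_rate=5e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,                  # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",     # assumption: matches the per-epoch eval rows below
)
```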

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 156   | 0.6793          | 0.5307   |
| No log        | 2.0   | 312   | 0.9039          | 0.4765   |
| No log        | 3.0   | 468   | 0.7107          | 0.4729   |
| 0.8982        | 4.0   | 624   | 0.6969          | 0.5199   |
| 0.8982        | 5.0   | 780   | 0.5729          | 0.5560   |
| 0.8982        | 6.0   | 936   | 0.6447          | 0.5596   |
| 0.8495        | 7.0   | 1092  | 0.6093          | 0.5921   |
| 0.8495        | 8.0   | 1248  | 0.4289          | 0.6679   |
| 0.8495        | 9.0   | 1404  | 0.4954          | 0.6282   |
| 0.751         | 10.0  | 1560  | 0.3952          | 0.6715   |
| 0.751         | 11.0  | 1716  | 0.6147          | 0.6462   |
| 0.751         | 12.0  | 1872  | 0.4183          | 0.7004   |
| 0.6407        | 13.0  | 2028  | 0.3743          | 0.6968   |
| 0.6407        | 14.0  | 2184  | 0.3907          | 0.7292   |
| 0.6407        | 15.0  | 2340  | 0.3409          | 0.7148   |
| 0.6407        | 16.0  | 2496  | 0.5288          | 0.6426   |
| 0.6476        | 17.0  | 2652  | 0.4492          | 0.7220   |
| 0.6476        | 18.0  | 2808  | 0.3312          | 0.7220   |
| 0.6476        | 19.0  | 2964  | 0.4062          | 0.6606   |
| 0.6425        | 20.0  | 3120  | 0.3715          | 0.6859   |
| 0.6425        | 21.0  | 3276  | 0.3305          | 0.7256   |
| 0.6425        | 22.0  | 3432  | 0.6557          | 0.6245   |
| 0.5658        | 23.0  | 3588  | 0.3943          | 0.6859   |
| 0.5658        | 24.0  | 3744  | 0.3394          | 0.7040   |
| 0.5658        | 25.0  | 3900  | 0.4640          | 0.6823   |
| 0.5333        | 26.0  | 4056  | 0.3419          | 0.7220   |
| 0.5333        | 27.0  | 4212  | 0.3646          | 0.7112   |
| 0.5333        | 28.0  | 4368  | 0.3626          | 0.7184   |
| 0.5164        | 29.0  | 4524  | 0.3215          | 0.7473   |
| 0.5164        | 30.0  | 4680  | 0.2941          | 0.7581   |
| 0.5164        | 31.0  | 4836  | 0.4957          | 0.6173   |
| 0.5164        | 32.0  | 4992  | 0.3362          | 0.7329   |
| 0.4676        | 33.0  | 5148  | 0.3116          | 0.7437   |
| 0.4676        | 34.0  | 5304  | 0.3344          | 0.7401   |
| 0.4676        | 35.0  | 5460  | 0.4769          | 0.7220   |
| 0.4443        | 36.0  | 5616  | 0.2822          | 0.7509   |
| 0.4443        | 37.0  | 5772  | 0.3748          | 0.6859   |
| 0.4443        | 38.0  | 5928  | 0.2989          | 0.7509   |
| 0.4179        | 39.0  | 6084  | 0.3193          | 0.7292   |
| 0.4179        | 40.0  | 6240  | 0.3725          | 0.6715   |
| 0.4179        | 41.0  | 6396  | 0.3336          | 0.7509   |
| 0.3974        | 42.0  | 6552  | 0.2967          | 0.7365   |
| 0.3974        | 43.0  | 6708  | 0.2908          | 0.7545   |
| 0.3974        | 44.0  | 6864  | 0.2887          | 0.7473   |
| 0.3774        | 45.0  | 7020  | 0.3012          | 0.7401   |
| 0.3774        | 46.0  | 7176  | 0.3437          | 0.7509   |
| 0.3774        | 47.0  | 7332  | 0.3390          | 0.7292   |
| 0.3774        | 48.0  | 7488  | 0.2952          | 0.7473   |
| 0.3419        | 49.0  | 7644  | 0.3116          | 0.7401   |
| 0.3419        | 50.0  | 7800  | 0.2856          | 0.7473   |
| 0.3419        | 51.0  | 7956  | 0.3227          | 0.7256   |
| 0.3275        | 52.0  | 8112  | 0.2861          | 0.7509   |
| 0.3275        | 53.0  | 8268  | 0.3534          | 0.7401   |
| 0.3275        | 54.0  | 8424  | 0.3395          | 0.7256   |
| 0.3225        | 55.0  | 8580  | 0.3113          | 0.7401   |
| 0.3225        | 56.0  | 8736  | 0.2932          | 0.7473   |
| 0.3225        | 57.0  | 8892  | 0.4312          | 0.7112   |
| 0.3104        | 58.0  | 9048  | 0.3085          | 0.7509   |
| 0.3104        | 59.0  | 9204  | 0.3164          | 0.7545   |
| 0.3104        | 60.0  | 9360  | 0.2758          | 0.7473   |
| 0.3164        | 61.0  | 9516  | 0.3183          | 0.7220   |
| 0.3164        | 62.0  | 9672  | 0.3571          | 0.7220   |
| 0.3164        | 63.0  | 9828  | 0.3156          | 0.7365   |
| 0.3164        | 64.0  | 9984  | 0.2756          | 0.7653   |
| 0.2939        | 65.0  | 10140 | 0.2859          | 0.7437   |
| 0.2939        | 66.0  | 10296 | 0.2934          | 0.7545   |
| 0.2939        | 67.0  | 10452 | 0.2977          | 0.7690   |
| 0.2826        | 68.0  | 10608 | 0.2871          | 0.7653   |
| 0.2826        | 69.0  | 10764 | 0.2903          | 0.7653   |
| 0.2826        | 70.0  | 10920 | 0.2974          | 0.7581   |
| 0.2663        | 71.0  | 11076 | 0.2778          | 0.7509   |
| 0.2663        | 72.0  | 11232 | 0.2849          | 0.7365   |
| 0.2663        | 73.0  | 11388 | 0.2970          | 0.7653   |
| 0.2637        | 74.0  | 11544 | 0.3025          | 0.7545   |
| 0.2637        | 75.0  | 11700 | 0.2793          | 0.7617   |
| 0.2637        | 76.0  | 11856 | 0.2778          | 0.7545   |
| 0.2699        | 77.0  | 12012 | 0.2861          | 0.7617   |
| 0.2699        | 78.0  | 12168 | 0.2857          | 0.7690   |
| 0.2699        | 79.0  | 12324 | 0.2774          | 0.7617   |
| 0.2699        | 80.0  | 12480 | 0.2804          | 0.7617   |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
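
A hedged usage sketch, not part of the original card: it assumes the checkpoint is published on the Hub as `dkqjrm/20230825071702` and carries a sequence-classification head (the card reports accuracy on super_glue); the exact SuperGLUE task, label names, and example inputs below are placeholders, since the card does not state them.

```python
# Sketch only: loading the fine-tuned checkpoint for classification.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230825071702"  # assumed Hub id, from the card's title
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# SuperGLUE tasks are typically sentence-pair problems, so two text inputs;
# these example sentences are placeholders.
inputs = tokenizer("A premise sentence.", "A hypothesis sentence.",
                   return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```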