20230824043537

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3141
  • Accuracy: 0.7401
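
The card does not include a usage snippet; as a hedged sketch, the checkpoint can be loaded like any sequence-classification model (the repo id `dkqjrm/20230824043537` is taken from this page; the two-sentence input is illustrative, since the card does not say which SuperGLUE task was used, so adjust the input format to the actual task):

```python
# Sketch: load the fine-tuned checkpoint for sequence classification.
# Assumptions: the repo id matches this card and the model exposes a
# standard classification head; the example sentence pair is made up.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "dkqjrm/20230824043537"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer(
    "The cat sat on the mat.",
    "A cat is on a mat.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
pred = logits.argmax(dim=-1).item()
print(model.config.id2label.get(pred, pred))
```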

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
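
The settings above can be expressed as a `transformers.TrainingArguments` configuration fragment (argument names follow the Transformers 4.26 API listed under "Framework versions"; `output_dir` is a placeholder, and the dataset/task wiring is not part of this card):

```python
from transformers import TrainingArguments

# Reconstruction of the listed hyperparameters; "out" is a placeholder path.
training_args = TrainingArguments(
    output_dir="out",
    learning_rate=3e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
)
```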

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 0.7925        | 1.0   | 623   | 0.8673          | 0.4729   |
| 0.6122        | 2.0   | 1246  | 0.4006          | 0.5415   |
| 0.5656        | 3.0   | 1869  | 1.2100          | 0.4729   |
| 0.5981        | 4.0   | 2492  | 0.4232          | 0.5632   |
| 0.5284        | 5.0   | 3115  | 0.6388          | 0.5523   |
| 0.6128        | 6.0   | 3738  | 0.4463          | 0.5307   |
| 0.4769        | 7.0   | 4361  | 0.4020          | 0.6065   |
| 0.4415        | 8.0   | 4984  | 0.3773          | 0.6029   |
| 0.4284        | 9.0   | 5607  | 0.3718          | 0.6679   |
| 0.3893        | 10.0  | 6230  | 0.3479          | 0.6606   |
| 0.3707        | 11.0  | 6853  | 0.3415          | 0.6751   |
| 0.3845        | 12.0  | 7476  | 0.3645          | 0.6787   |
| 0.3667        | 13.0  | 8099  | 0.3591          | 0.6895   |
| 0.3674        | 14.0  | 8722  | 0.3526          | 0.6931   |
| 0.3561        | 15.0  | 9345  | 0.3187          | 0.7292   |
| 0.342         | 16.0  | 9968  | 0.3318          | 0.7004   |
| 0.3305        | 17.0  | 10591 | 0.3185          | 0.7004   |
| 0.3269        | 18.0  | 11214 | 0.3733          | 0.6534   |
| 0.3341        | 19.0  | 11837 | 0.3197          | 0.7040   |
| 0.3214        | 20.0  | 12460 | 0.3166          | 0.7148   |
| 0.3109        | 21.0  | 13083 | 0.3257          | 0.7148   |
| 0.3125        | 22.0  | 13706 | 0.3299          | 0.7292   |
| 0.3097        | 23.0  | 14329 | 0.4120          | 0.6895   |
| 0.2918        | 24.0  | 14952 | 0.3158          | 0.7148   |
| 0.2792        | 25.0  | 15575 | 0.3077          | 0.7256   |
| 0.2766        | 26.0  | 16198 | 0.3078          | 0.7292   |
| 0.2811        | 27.0  | 16821 | 0.3033          | 0.7256   |
| 0.2719        | 28.0  | 17444 | 0.3017          | 0.7148   |
| 0.2661        | 29.0  | 18067 | 0.2947          | 0.7184   |
| 0.263         | 30.0  | 18690 | 0.3416          | 0.7329   |
| 0.2633        | 31.0  | 19313 | 0.3170          | 0.7256   |
| 0.2517        | 32.0  | 19936 | 0.3063          | 0.7220   |
| 0.2486        | 33.0  | 20559 | 0.3137          | 0.7256   |
| 0.252         | 34.0  | 21182 | 0.3118          | 0.7256   |
| 0.2396        | 35.0  | 21805 | 0.2980          | 0.7220   |
| 0.2471        | 36.0  | 22428 | 0.3050          | 0.7329   |
| 0.2361        | 37.0  | 23051 | 0.3366          | 0.7220   |
| 0.2358        | 38.0  | 23674 | 0.3080          | 0.7473   |
| 0.2231        | 39.0  | 24297 | 0.3191          | 0.7437   |
| 0.2298        | 40.0  | 24920 | 0.3018          | 0.7148   |
| 0.2241        | 41.0  | 25543 | 0.3090          | 0.7401   |
| 0.2243        | 42.0  | 26166 | 0.3137          | 0.7401   |
| 0.2237        | 43.0  | 26789 | 0.3277          | 0.7365   |
| 0.2147        | 44.0  | 27412 | 0.3116          | 0.7437   |
| 0.2149        | 45.0  | 28035 | 0.3289          | 0.7365   |
| 0.2087        | 46.0  | 28658 | 0.3241          | 0.7292   |
| 0.21          | 47.0  | 29281 | 0.3060          | 0.7365   |
| 0.214         | 48.0  | 29904 | 0.3311          | 0.7329   |
| 0.2108        | 49.0  | 30527 | 0.3144          | 0.7437   |
| 0.2029        | 50.0  | 31150 | 0.3094          | 0.7401   |
| 0.2028        | 51.0  | 31773 | 0.3141          | 0.7473   |
| 0.2018        | 52.0  | 32396 | 0.3188          | 0.7437   |
| 0.2079        | 53.0  | 33019 | 0.3138          | 0.7365   |
| 0.1982        | 54.0  | 33642 | 0.3109          | 0.7401   |
| 0.1926        | 55.0  | 34265 | 0.3118          | 0.7437   |
| 0.1972        | 56.0  | 34888 | 0.3270          | 0.7401   |
| 0.1986        | 57.0  | 35511 | 0.3098          | 0.7365   |
| 0.1928        | 58.0  | 36134 | 0.3131          | 0.7401   |
| 0.1974        | 59.0  | 36757 | 0.3132          | 0.7401   |
| 0.1927        | 60.0  | 37380 | 0.3141          | 0.7401   |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3