20230824002436

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3260
  • Accuracy: 0.7292

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
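With a linear scheduler, the learning rate decays from the base value of 0.003 to zero over the full run. A minimal sketch of that decay (assuming no warmup, since no warmup steps are listed above), using the 37380 total optimizer steps logged in the results table (623 steps per epoch × 60 epochs):

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 0.003) -> float:
    """Learning rate under a linear decay schedule with no warmup (assumed)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

# 623 steps per epoch x 60 epochs = 37380 total optimizer steps
halfway = linear_lr(18690, 37380)  # step at the end of epoch 30
```

By the end of epoch 30 (step 18690) the rate has halved to 0.0015, and it reaches zero at the final step.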

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|---------------|-------|-------|-----------------|----------|
| 0.7465        | 1.0   | 623   | 0.7703          | 0.4729   |
| 0.5877        | 2.0   | 1246  | 0.4425          | 0.5090   |
| 0.601         | 3.0   | 1869  | 0.7166          | 0.4729   |
| 0.6044        | 4.0   | 2492  | 0.3819          | 0.6354   |
| 0.5254        | 5.0   | 3115  | 0.3658          | 0.6715   |
| 0.5948        | 6.0   | 3738  | 0.3669          | 0.6390   |
| 0.5077        | 7.0   | 4361  | 0.5755          | 0.5993   |
| 0.5077        | 8.0   | 4984  | 0.3389          | 0.7112   |
| 0.4643        | 9.0   | 5607  | 0.3560          | 0.6823   |
| 0.4111        | 10.0  | 6230  | 0.3375          | 0.7076   |
| 0.375         | 11.0  | 6853  | 0.3139          | 0.7256   |
| 0.3742        | 12.0  | 7476  | 0.3819          | 0.6787   |
| 0.3684        | 13.0  | 8099  | 0.3748          | 0.6751   |
| 0.3583        | 14.0  | 8722  | 0.3890          | 0.7076   |
| 0.3353        | 15.0  | 9345  | 0.3064          | 0.6968   |
| 0.3161        | 16.0  | 9968  | 0.3082          | 0.7184   |
| 0.3045        | 17.0  | 10591 | 0.3120          | 0.7040   |
| 0.295         | 18.0  | 11214 | 0.2949          | 0.7292   |
| 0.3133        | 19.0  | 11837 | 0.3082          | 0.7365   |
| 0.2938        | 20.0  | 12460 | 0.3041          | 0.7473   |
| 0.2857        | 21.0  | 13083 | 0.3251          | 0.7401   |
| 0.2837        | 22.0  | 13706 | 0.3717          | 0.7256   |
| 0.2788        | 23.0  | 14329 | 0.4261          | 0.7112   |
| 0.2677        | 24.0  | 14952 | 0.3189          | 0.7148   |
| 0.2494        | 25.0  | 15575 | 0.3107          | 0.7365   |
| 0.2404        | 26.0  | 16198 | 0.3337          | 0.7473   |
| 0.245         | 27.0  | 16821 | 0.3148          | 0.7329   |
| 0.2475        | 28.0  | 17444 | 0.3240          | 0.7401   |
| 0.2377        | 29.0  | 18067 | 0.3512          | 0.7329   |
| 0.2354        | 30.0  | 18690 | 0.3480          | 0.7365   |
| 0.2335        | 31.0  | 19313 | 0.3320          | 0.7256   |
| 0.2265        | 32.0  | 19936 | 0.3071          | 0.7184   |
| 0.2184        | 33.0  | 20559 | 0.3501          | 0.7509   |
| 0.2189        | 34.0  | 21182 | 0.3220          | 0.7112   |
| 0.2157        | 35.0  | 21805 | 0.3174          | 0.7256   |
| 0.2238        | 36.0  | 22428 | 0.3203          | 0.7292   |
| 0.2099        | 37.0  | 23051 | 0.3346          | 0.7365   |
| 0.2084        | 38.0  | 23674 | 0.3103          | 0.7365   |
| 0.195         | 39.0  | 24297 | 0.3193          | 0.7292   |
| 0.201         | 40.0  | 24920 | 0.3131          | 0.7112   |
| 0.1936        | 41.0  | 25543 | 0.3101          | 0.7220   |
| 0.1944        | 42.0  | 26166 | 0.3092          | 0.7256   |
| 0.1975        | 43.0  | 26789 | 0.3314          | 0.7329   |
| 0.1864        | 44.0  | 27412 | 0.3140          | 0.7437   |
| 0.189         | 45.0  | 28035 | 0.3402          | 0.7256   |
| 0.1855        | 46.0  | 28658 | 0.3229          | 0.7220   |
| 0.1813        | 47.0  | 29281 | 0.3156          | 0.7256   |
| 0.1877        | 48.0  | 29904 | 0.3352          | 0.7292   |
| 0.1852        | 49.0  | 30527 | 0.3230          | 0.7365   |
| 0.1813        | 50.0  | 31150 | 0.3210          | 0.7329   |
| 0.1789        | 51.0  | 31773 | 0.3391          | 0.7365   |
| 0.1764        | 52.0  | 32396 | 0.3290          | 0.7292   |
| 0.1837        | 53.0  | 33019 | 0.3237          | 0.7365   |
| 0.1759        | 54.0  | 33642 | 0.3219          | 0.7292   |
| 0.1681        | 55.0  | 34265 | 0.3169          | 0.7401   |
| 0.1769        | 56.0  | 34888 | 0.3361          | 0.7329   |
| 0.1725        | 57.0  | 35511 | 0.3282          | 0.7401   |
| 0.1681        | 58.0  | 36134 | 0.3257          | 0.7365   |
| 0.1729        | 59.0  | 36757 | 0.3269          | 0.7292   |
| 0.1694        | 60.0  | 37380 | 0.3260          | 0.7292   |
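Note that the final epoch is not the best checkpoint by either metric: validation loss bottoms out at epoch 18 (0.2949) and accuracy peaks at epoch 33 (0.7509). A small sketch that recovers this from a few logged rows (values copied from the table above; the subset is chosen for illustration):

```python
# (epoch, validation_loss, accuracy) -- selected rows from the table above
rows = [
    (18, 0.2949, 0.7292),
    (20, 0.3041, 0.7473),
    (33, 0.3501, 0.7509),
    (60, 0.3260, 0.7292),
]

best_by_loss = min(rows, key=lambda r: r[1])      # lowest validation loss
best_by_accuracy = max(rows, key=lambda r: r[2])  # highest accuracy
```

This gap between the loss-optimal and accuracy-optimal epochs suggests early stopping or checkpoint selection would matter for downstream use of this model.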

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3