20230822185221

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a usage sketch follows the results):

  • Loss: 0.3289
  • Accuracy: 0.7329
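
Below is a minimal inference sketch. The model id comes from this card; the SuperGLUE subset is not stated, so the sentence-pair input and the class-id reading are illustrative assumptions, not confirmed behavior.

```python
# Minimal inference sketch. Assumptions: the checkpoint carries a
# sequence-classification head and the task is sentence-pair
# classification; the card does not name the SuperGLUE subset.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "dkqjrm/20230822185221"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

premise = "The cat sat on the mat."     # illustrative input
hypothesis = "A cat is on the mat."     # illustrative input
inputs = tokenizer(premise, hypothesis, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
print("predicted class id:", logits.argmax(dim=-1).item())
```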

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
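
The card leaves this section blank. As a hedged note: at batch size 8, the 312 optimizer steps per epoch in the results table imply roughly 2,500 training examples, which is consistent with the SuperGLUE RTE subset, though the card does not confirm this. Under that assumption, the data could be loaded as follows:

```python
# Hedged sketch: "rte" is an inference from the step count
# (312 steps/epoch x batch size 8 ~ 2,500 examples); the card
# itself does not name the SuperGLUE subset.
from datasets import load_dataset

dataset = load_dataset("super_glue", "rte")
print(dataset)               # train/validation/test splits
print(dataset["train"][0])   # fields: premise, hypothesis, idx, label
```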

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 0.002
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
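
A hedged sketch of mirroring these settings with transformers.TrainingArguments is shown below; output_dir is a placeholder, and the Adam betas and epsilon listed above are the TrainingArguments defaults, so they are left implicit.

```python
# Hedged reproduction sketch of the hyperparameters listed above.
# output_dir is a placeholder; adam_beta1=0.9, adam_beta2=0.999 and
# adam_epsilon=1e-8 are the defaults, matching the optimizer line.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./20230822185221",   # placeholder
    learning_rate=0.002,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",     # assumption: table shows per-epoch eval
)
```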

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.5077          | 0.5307   |
| 0.4439        | 2.0   | 624   | 0.3971          | 0.4874   |
| 0.4439        | 3.0   | 936   | 0.3574          | 0.5379   |
| 0.4231        | 4.0   | 1248  | 0.3625          | 0.5776   |
| 0.4071        | 5.0   | 1560  | 0.4937          | 0.5343   |
| 0.4071        | 6.0   | 1872  | 0.3738          | 0.5668   |
| 0.3956        | 7.0   | 2184  | 0.4081          | 0.4729   |
| 0.3956        | 8.0   | 2496  | 0.3386          | 0.6209   |
| 0.3905        | 9.0   | 2808  | 0.4147          | 0.4729   |
| 0.3888        | 10.0  | 3120  | 0.3353          | 0.6354   |
| 0.3888        | 11.0  | 3432  | 0.3540          | 0.6282   |
| 0.3992        | 12.0  | 3744  | 0.3453          | 0.5848   |
| 0.372         | 13.0  | 4056  | 0.3265          | 0.6895   |
| 0.372         | 14.0  | 4368  | 0.3575          | 0.6426   |
| 0.3643        | 15.0  | 4680  | 0.3304          | 0.6498   |
| 0.3643        | 16.0  | 4992  | 0.3633          | 0.6715   |
| 0.3666        | 17.0  | 5304  | 0.5230          | 0.5343   |
| 0.3517        | 18.0  | 5616  | 0.3384          | 0.6462   |
| 0.3517        | 19.0  | 5928  | 0.3293          | 0.6823   |
| 0.3519        | 20.0  | 6240  | 0.3613          | 0.6823   |
| 0.338         | 21.0  | 6552  | 0.3242          | 0.7256   |
| 0.338         | 22.0  | 6864  | 0.3399          | 0.7184   |
| 0.3316        | 23.0  | 7176  | 0.3392          | 0.7004   |
| 0.3316        | 24.0  | 7488  | 0.3343          | 0.6534   |
| 0.3266        | 25.0  | 7800  | 0.3467          | 0.7112   |
| 0.3213        | 26.0  | 8112  | 0.3419          | 0.7040   |
| 0.3213        | 27.0  | 8424  | 0.3190          | 0.7112   |
| 0.3177        | 28.0  | 8736  | 0.3205          | 0.6931   |
| 0.3187        | 29.0  | 9048  | 0.3303          | 0.7076   |
| 0.3187        | 30.0  | 9360  | 0.3268          | 0.7148   |
| 0.3162        | 31.0  | 9672  | 0.3274          | 0.7148   |
| 0.3162        | 32.0  | 9984  | 0.3311          | 0.7112   |
| 0.3132        | 33.0  | 10296 | 0.3454          | 0.7148   |
| 0.3087        | 34.0  | 10608 | 0.3250          | 0.7076   |
| 0.3087        | 35.0  | 10920 | 0.3266          | 0.7076   |
| 0.3076        | 36.0  | 11232 | 0.3347          | 0.7292   |
| 0.3071        | 37.0  | 11544 | 0.3308          | 0.7112   |
| 0.3071        | 38.0  | 11856 | 0.3272          | 0.7220   |
| 0.3061        | 39.0  | 12168 | 0.3301          | 0.7148   |
| 0.3061        | 40.0  | 12480 | 0.3226          | 0.7256   |
| 0.3006        | 41.0  | 12792 | 0.3285          | 0.7365   |
| 0.3016        | 42.0  | 13104 | 0.3226          | 0.7148   |
| 0.3016        | 43.0  | 13416 | 0.3291          | 0.7220   |
| 0.2984        | 44.0  | 13728 | 0.3377          | 0.7112   |
| 0.2976        | 45.0  | 14040 | 0.3326          | 0.7220   |
| 0.2976        | 46.0  | 14352 | 0.3341          | 0.7292   |
| 0.2967        | 47.0  | 14664 | 0.3187          | 0.7184   |
| 0.2967        | 48.0  | 14976 | 0.3322          | 0.7148   |
| 0.2953        | 49.0  | 15288 | 0.3269          | 0.7365   |
| 0.2911        | 50.0  | 15600 | 0.3256          | 0.7365   |
| 0.2911        | 51.0  | 15912 | 0.3252          | 0.7256   |
| 0.2929        | 52.0  | 16224 | 0.3251          | 0.7292   |
| 0.2904        | 53.0  | 16536 | 0.3258          | 0.7256   |
| 0.2904        | 54.0  | 16848 | 0.3358          | 0.7220   |
| 0.2895        | 55.0  | 17160 | 0.3219          | 0.7329   |
| 0.2895        | 56.0  | 17472 | 0.3322          | 0.7329   |
| 0.2887        | 57.0  | 17784 | 0.3259          | 0.7365   |
| 0.2883        | 58.0  | 18096 | 0.3260          | 0.7292   |
| 0.2883        | 59.0  | 18408 | 0.3276          | 0.7365   |
| 0.2874        | 60.0  | 18720 | 0.3289          | 0.7329   |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3