20230817181727

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a loading sketch follows the list):

  • Loss: 0.3316
  • Accuracy: 0.7365
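
The card does not name which SuperGLUE subset was used. The per-epoch step counts reported further down (156 steps at train batch size 16, i.e. roughly 2,490 training examples) are consistent with the RTE entailment task, but treat that pairing as an inference, not a documented fact. Below is a minimal inference sketch, assuming the checkpoint is a standard sequence-classification model published on the Hub as Onutoa/20230817181727:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Model id taken from this card; the premise/hypothesis pairing below follows
# the SuperGLUE RTE format, which is an assumption -- the card does not name
# the subset the model was trained on.
model_id = "Onutoa/20230817181727"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

premise = "A dog is running through a field."
hypothesis = "An animal is outdoors."
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index; label names
                                     # depend on the fine-tuning setup
```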

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 0.004
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
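
The same settings can be expressed as a Transformers `TrainingArguments` object, as in the sketch below. Adam with betas=(0.9,0.999) and epsilon=1e-08 matches the library defaults, so it needs no extra flags; `output_dir` and the per-epoch evaluation cadence are assumptions, not stated on the card.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./20230817181727",  # assumed; not stated on the card
    learning_rate=0.004,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",    # assumed from the per-epoch rows below
)
```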

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 156 | 0.4741 | 0.5307 |
| No log | 2.0 | 312 | 0.3849 | 0.5090 |
| No log | 3.0 | 468 | 0.4345 | 0.4729 |
| 0.5496 | 4.0 | 624 | 0.4749 | 0.5235 |
| 0.5496 | 5.0 | 780 | 0.4138 | 0.5343 |
| 0.5496 | 6.0 | 936 | 0.3599 | 0.5632 |
| 0.4365 | 7.0 | 1092 | 0.3954 | 0.5632 |
| 0.4365 | 8.0 | 1248 | 0.3455 | 0.5018 |
| 0.4365 | 9.0 | 1404 | 0.3985 | 0.5776 |
| 0.4109 | 10.0 | 1560 | 0.3828 | 0.5993 |
| 0.4109 | 11.0 | 1716 | 0.4339 | 0.4729 |
| 0.4109 | 12.0 | 1872 | 0.3432 | 0.5379 |
| 0.3611 | 13.0 | 2028 | 0.3395 | 0.6137 |
| 0.3611 | 14.0 | 2184 | 0.3404 | 0.6715 |
| 0.3611 | 15.0 | 2340 | 0.3396 | 0.6570 |
| 0.3611 | 16.0 | 2496 | 0.3857 | 0.6354 |
| 0.3456 | 17.0 | 2652 | 0.3480 | 0.6895 |
| 0.3456 | 18.0 | 2808 | 0.3348 | 0.7040 |
| 0.3456 | 19.0 | 2964 | 0.3323 | 0.6426 |
| 0.3391 | 20.0 | 3120 | 0.3591 | 0.6715 |
| 0.3391 | 21.0 | 3276 | 0.3378 | 0.7148 |
| 0.3391 | 22.0 | 3432 | 0.3453 | 0.7004 |
| 0.3319 | 23.0 | 3588 | 0.3405 | 0.6679 |
| 0.3319 | 24.0 | 3744 | 0.3451 | 0.6390 |
| 0.3319 | 25.0 | 3900 | 0.3665 | 0.6895 |
| 0.3274 | 26.0 | 4056 | 0.3290 | 0.7112 |
| 0.3274 | 27.0 | 4212 | 0.3252 | 0.7040 |
| 0.3274 | 28.0 | 4368 | 0.3265 | 0.7184 |
| 0.3214 | 29.0 | 4524 | 0.3284 | 0.7365 |
| 0.3214 | 30.0 | 4680 | 0.3290 | 0.7437 |
| 0.3214 | 31.0 | 4836 | 0.3328 | 0.7256 |
| 0.3214 | 32.0 | 4992 | 0.3268 | 0.7220 |
| 0.3167 | 33.0 | 5148 | 0.3372 | 0.7220 |
| 0.3167 | 34.0 | 5304 | 0.3263 | 0.7256 |
| 0.3167 | 35.0 | 5460 | 0.3231 | 0.7365 |
| 0.312 | 36.0 | 5616 | 0.3255 | 0.7256 |
| 0.312 | 37.0 | 5772 | 0.3325 | 0.7148 |
| 0.312 | 38.0 | 5928 | 0.3351 | 0.7365 |
| 0.3083 | 39.0 | 6084 | 0.3362 | 0.7148 |
| 0.3083 | 40.0 | 6240 | 0.3326 | 0.7292 |
| 0.3083 | 41.0 | 6396 | 0.3366 | 0.7220 |
| 0.3081 | 42.0 | 6552 | 0.3265 | 0.7292 |
| 0.3081 | 43.0 | 6708 | 0.3351 | 0.7365 |
| 0.3081 | 44.0 | 6864 | 0.3384 | 0.7329 |
| 0.3032 | 45.0 | 7020 | 0.3298 | 0.7220 |
| 0.3032 | 46.0 | 7176 | 0.3309 | 0.7329 |
| 0.3032 | 47.0 | 7332 | 0.3319 | 0.7256 |
| 0.3032 | 48.0 | 7488 | 0.3452 | 0.7401 |
| 0.2998 | 49.0 | 7644 | 0.3365 | 0.7365 |
| 0.2998 | 50.0 | 7800 | 0.3290 | 0.7256 |
| 0.2998 | 51.0 | 7956 | 0.3251 | 0.7509 |
| 0.2989 | 52.0 | 8112 | 0.3254 | 0.7401 |
| 0.2989 | 53.0 | 8268 | 0.3372 | 0.7365 |
| 0.2989 | 54.0 | 8424 | 0.3401 | 0.7437 |
| 0.2951 | 55.0 | 8580 | 0.3315 | 0.7365 |
| 0.2951 | 56.0 | 8736 | 0.3345 | 0.7292 |
| 0.2951 | 57.0 | 8892 | 0.3301 | 0.7292 |
| 0.2945 | 58.0 | 9048 | 0.3322 | 0.7292 |
| 0.2945 | 59.0 | 9204 | 0.3305 | 0.7329 |
| 0.2945 | 60.0 | 9360 | 0.3316 | 0.7365 |
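
The "No log" entries simply predate the first logged training loss (the Trainer logs training loss every 500 steps by default). To re-run the evaluation, the validation split can be loaded with the `datasets` library; the "rte" config below is the same inference as discussed above, not something the card states:

```python
from datasets import load_dataset

# "rte" is inferred from the step counts above; the card only names super_glue.
dataset = load_dataset("super_glue", "rte")
print(dataset["validation"].num_rows)
```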

Framework versions

  • Transformers 4.30.0
  • PyTorch 2.0.1
  • Datasets 2.14.4
  • Tokenizers 0.13.3