20230822185017

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3476
  • Accuracy: 0.7076

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
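
As a rough illustration of what the linear scheduler implies for these settings, the sketch below computes the learning rate at a given optimizer step. The peak rate, the 312 steps per epoch, and the 60 epochs come from this card. The warmup length is an assumption: the card lists no warmup, so `WARMUP_STEPS` is set to 0 here, and the helper `linear_lr` is a hypothetical name, not part of any library.

```python
# Values taken from the card: learning_rate, steps per epoch (Step column), num_epochs.
PEAK_LR = 0.003
STEPS_PER_EPOCH = 312
NUM_EPOCHS = 60
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 18720, the last Step value in the results

# Assumption: no warmup, since the card does not list any warmup setting.
WARMUP_STEPS = 0


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup (if any), then linear decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    # Print the scheduled learning rate at a few epoch boundaries.
    for epoch in (0, 15, 30, 45, 60):
        print(f"epoch {epoch:>2}: lr = {linear_lr(epoch * STEPS_PER_EPOCH):.6f}")
```

Under these assumptions the rate starts at 0.003, is halved by the end of epoch 30, and reaches zero at step 18720.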

Training results

Training Loss  Epoch  Step   Validation Loss  Accuracy
No log         1.0    312    0.3644           0.5271
0.5253         2.0    624    0.3757           0.5632
0.5253         3.0    936    0.3595           0.4874
0.4289         4.0    1248   0.4613           0.5415
0.4182         5.0    1560   0.3427           0.6137
0.4182         6.0    1872   0.3880           0.4874
0.4027         7.0    2184   0.4778           0.5487
0.4027         8.0    2496   0.3335           0.6715
0.4009         9.0    2808   0.4011           0.5523
0.3781         10.0   3120   0.3286           0.7040
0.3781         11.0   3432   0.4135           0.6101
0.3679         12.0   3744   0.3368           0.6787
0.3774         13.0   4056   0.3311           0.6787
0.3774         14.0   4368   0.3223           0.6859
0.3457         15.0   4680   0.3293           0.7076
0.3457         16.0   4992   0.4108           0.5812
0.3607         17.0   5304   0.3682           0.6534
0.3436         18.0   5616   0.3374           0.6498
0.3436         19.0   5928   0.3248           0.7148
0.3236         20.0   6240   0.3447           0.7184
0.3022         21.0   6552   0.3444           0.7148
0.3022         22.0   6864   0.3790           0.6643
0.2938         23.0   7176   0.3575           0.6968
0.2938         24.0   7488   0.3321           0.7112
0.2837         25.0   7800   0.3570           0.7076
0.2783         26.0   8112   0.3716           0.6426
0.2783         27.0   8424   0.3534           0.7040
0.2693         28.0   8736   0.3435           0.7004
0.2654         29.0   9048   0.3371           0.6968
0.2654         30.0   9360   0.3610           0.6787
0.2598         31.0   9672   0.3277           0.7220
0.2598         32.0   9984   0.3412           0.7076
0.257          33.0   10296  0.3389           0.7040
0.2484         34.0   10608  0.3424           0.6968
0.2484         35.0   10920  0.3671           0.7112
0.2446         36.0   11232  0.3492           0.7148
0.2449         37.0   11544  0.3485           0.7148
0.2449         38.0   11856  0.3413           0.7148
0.2414         39.0   12168  0.3373           0.7004
0.2414         40.0   12480  0.3415           0.7220
0.2377         41.0   12792  0.3434           0.6931
0.2353         42.0   13104  0.3612           0.7040
0.2353         43.0   13416  0.3516           0.7112
0.2347         44.0   13728  0.3430           0.7112
0.2357         45.0   14040  0.3455           0.7004
0.2357         46.0   14352  0.3480           0.7040
0.2306         47.0   14664  0.3580           0.7112
0.2306         48.0   14976  0.3636           0.7040
0.2304         49.0   15288  0.3483           0.7112
0.2295         50.0   15600  0.3529           0.7004
0.2295         51.0   15912  0.3498           0.7040
0.2296         52.0   16224  0.3501           0.7220
0.2285         53.0   16536  0.3474           0.7076
0.2285         54.0   16848  0.3444           0.7076
0.2276         55.0   17160  0.3404           0.7004
0.2276         56.0   17472  0.3500           0.6895
0.2278         57.0   17784  0.3507           0.7040
0.2264         58.0   18096  0.3468           0.7040
0.2264         59.0   18408  0.3522           0.7040
0.2265         60.0   18720  0.3476           0.7076
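
The headline loss and accuracy at the top of this card match the final row (epoch 60), which is not necessarily the strongest epoch in the log. The sketch below scans the per-epoch values transcribed from the table above to find the epochs with the highest accuracy and the lowest validation loss. The helper names are hypothetical, and whether checkpoints from those epochs were saved is not stated in the card.

```python
# (validation_loss, accuracy) per epoch, transcribed from the Training results table.
# Index 0 is epoch 1; index 59 is epoch 60.
RESULTS = [
    (0.3644, 0.5271), (0.3757, 0.5632), (0.3595, 0.4874), (0.4613, 0.5415),
    (0.3427, 0.6137), (0.3880, 0.4874), (0.4778, 0.5487), (0.3335, 0.6715),
    (0.4011, 0.5523), (0.3286, 0.7040), (0.4135, 0.6101), (0.3368, 0.6787),
    (0.3311, 0.6787), (0.3223, 0.6859), (0.3293, 0.7076), (0.4108, 0.5812),
    (0.3682, 0.6534), (0.3374, 0.6498), (0.3248, 0.7148), (0.3447, 0.7184),
    (0.3444, 0.7148), (0.3790, 0.6643), (0.3575, 0.6968), (0.3321, 0.7112),
    (0.3570, 0.7076), (0.3716, 0.6426), (0.3534, 0.7040), (0.3435, 0.7004),
    (0.3371, 0.6968), (0.3610, 0.6787), (0.3277, 0.7220), (0.3412, 0.7076),
    (0.3389, 0.7040), (0.3424, 0.6968), (0.3671, 0.7112), (0.3492, 0.7148),
    (0.3485, 0.7148), (0.3413, 0.7148), (0.3373, 0.7004), (0.3415, 0.7220),
    (0.3434, 0.6931), (0.3612, 0.7040), (0.3516, 0.7112), (0.3430, 0.7112),
    (0.3455, 0.7004), (0.3480, 0.7040), (0.3580, 0.7112), (0.3636, 0.7040),
    (0.3483, 0.7112), (0.3529, 0.7004), (0.3498, 0.7040), (0.3501, 0.7220),
    (0.3474, 0.7076), (0.3444, 0.7076), (0.3404, 0.7004), (0.3500, 0.6895),
    (0.3507, 0.7040), (0.3468, 0.7040), (0.3522, 0.7040), (0.3476, 0.7076),
]


def epochs_with_best_accuracy(results):
    """Return the highest accuracy and every 1-based epoch that reached it."""
    best = max(acc for _, acc in results)
    return best, [i + 1 for i, (_, acc) in enumerate(results) if acc == best]


def epoch_with_lowest_loss(results):
    """Return the lowest validation loss and the 1-based epoch where it occurred."""
    idx = min(range(len(results)), key=lambda i: results[i][0])
    return results[idx][0], idx + 1


if __name__ == "__main__":
    best_acc, acc_epochs = epochs_with_best_accuracy(RESULTS)
    low_loss, loss_epoch = epoch_with_lowest_loss(RESULTS)
    final_loss, final_acc = RESULTS[-1]
    print("best accuracy:", best_acc, "at epochs", acc_epochs)
    print("lowest validation loss:", low_loss, "at epoch", loss_epoch)
    print("final epoch:", final_loss, final_acc)
```

On the logged values, accuracy peaks at 0.7220 (epochs 31, 40, and 52) and validation loss is lowest at 0.3223 (epoch 14), while the final epoch gives the reported 0.3476 and 0.7076.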

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3

Dataset used to train dkqjrm/20230822185017