
20230823213528

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3485
  • Accuracy: 0.7040
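The accuracy reported above is the standard classification metric: the fraction of evaluation examples whose predicted label matches the gold label. A minimal sketch of the metric (function and variable names are illustrative, not from the original training script):

```python
def accuracy(predictions, labels):
    """Fraction of positions where the prediction equals the gold label."""
    assert len(predictions) == len(labels), "inputs must be the same length"
    correct = sum(p == l for p, l in zip(predictions, labels))
    return correct / len(labels)

# Example: 2 of 4 predictions match the labels.
print(accuracy([1, 0, 1, 1], [1, 1, 1, 0]))  # → 0.5
```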

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
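The hyperparameters listed above map directly onto Hugging Face `TrainingArguments`. A sketch of an equivalent configuration follows; the `output_dir` value is illustrative, and the exact training script used for this card is not known:

```python
from transformers import TrainingArguments

# Sketch only: reproduces the hyperparameters listed in this card.
# output_dir is an assumed, illustrative path.
training_args = TrainingArguments(
    output_dir="bert-large-cased-superglue",  # illustrative
    learning_rate=3e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
)
```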

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.4891          | 0.5343   |
| 0.5805        | 2.0   | 624   | 0.4199          | 0.5307   |
| 0.5805        | 3.0   | 936   | 0.4131          | 0.4946   |
| 0.4984        | 4.0   | 1248  | 0.3933          | 0.5812   |
| 0.4838        | 5.0   | 1560  | 0.4843          | 0.4801   |
| 0.4838        | 6.0   | 1872  | 0.3661          | 0.6354   |
| 0.4855        | 7.0   | 2184  | 0.5478          | 0.5487   |
| 0.4855        | 8.0   | 2496  | 0.3429          | 0.6534   |
| 0.4609        | 9.0   | 2808  | 0.4357          | 0.5451   |
| 0.4554        | 10.0  | 3120  | 0.3549          | 0.6570   |
| 0.4554        | 11.0  | 3432  | 0.5188          | 0.6354   |
| 0.4273        | 12.0  | 3744  | 0.3284          | 0.6859   |
| 0.4039        | 13.0  | 4056  | 0.3282          | 0.7148   |
| 0.4039        | 14.0  | 4368  | 0.3409          | 0.6968   |
| 0.3708        | 15.0  | 4680  | 0.3288          | 0.6859   |
| 0.3708        | 16.0  | 4992  | 0.3508          | 0.6859   |
| 0.3474        | 17.0  | 5304  | 0.3127          | 0.7220   |
| 0.3321        | 18.0  | 5616  | 0.3528          | 0.6462   |
| 0.3321        | 19.0  | 5928  | 0.3202          | 0.7256   |
| 0.3264        | 20.0  | 6240  | 0.3531          | 0.6787   |
| 0.3029        | 21.0  | 6552  | 0.3314          | 0.7220   |
| 0.3029        | 22.0  | 6864  | 0.4123          | 0.6606   |
| 0.3003        | 23.0  | 7176  | 0.3465          | 0.7148   |
| 0.3003        | 24.0  | 7488  | 0.3219          | 0.7256   |
| 0.29          | 25.0  | 7800  | 0.3582          | 0.7220   |
| 0.2752        | 26.0  | 8112  | 0.3376          | 0.6968   |
| 0.2752        | 27.0  | 8424  | 0.3076          | 0.7509   |
| 0.2765        | 28.0  | 8736  | 0.3248          | 0.7292   |
| 0.2703        | 29.0  | 9048  | 0.3493          | 0.7256   |
| 0.2703        | 30.0  | 9360  | 0.3761          | 0.7112   |
| 0.2587        | 31.0  | 9672  | 0.3380          | 0.7256   |
| 0.2587        | 32.0  | 9984  | 0.3229          | 0.7220   |
| 0.2473        | 33.0  | 10296 | 0.3595          | 0.7112   |
| 0.2386        | 34.0  | 10608 | 0.3214          | 0.7184   |
| 0.2386        | 35.0  | 10920 | 0.3223          | 0.7365   |
| 0.2404        | 36.0  | 11232 | 0.3340          | 0.7329   |
| 0.2324        | 37.0  | 11544 | 0.3969          | 0.6931   |
| 0.2324        | 38.0  | 11856 | 0.3440          | 0.7365   |
| 0.2322        | 39.0  | 12168 | 0.3877          | 0.7076   |
| 0.2322        | 40.0  | 12480 | 0.3323          | 0.7148   |
| 0.2221        | 41.0  | 12792 | 0.3317          | 0.7112   |
| 0.2219        | 42.0  | 13104 | 0.3266          | 0.7076   |
| 0.2219        | 43.0  | 13416 | 0.3580          | 0.7184   |
| 0.2132        | 44.0  | 13728 | 0.3492          | 0.7148   |
| 0.2124        | 45.0  | 14040 | 0.3434          | 0.7184   |
| 0.2124        | 46.0  | 14352 | 0.3437          | 0.7112   |
| 0.2063        | 47.0  | 14664 | 0.3438          | 0.7004   |
| 0.2063        | 48.0  | 14976 | 0.3499          | 0.7184   |
| 0.2044        | 49.0  | 15288 | 0.3562          | 0.7148   |
| 0.1997        | 50.0  | 15600 | 0.3468          | 0.7076   |
| 0.1997        | 51.0  | 15912 | 0.3461          | 0.7112   |
| 0.1976        | 52.0  | 16224 | 0.3338          | 0.7076   |
| 0.2001        | 53.0  | 16536 | 0.3390          | 0.7112   |
| 0.2001        | 54.0  | 16848 | 0.3453          | 0.7040   |
| 0.1956        | 55.0  | 17160 | 0.3300          | 0.7076   |
| 0.1956        | 56.0  | 17472 | 0.3610          | 0.7004   |
| 0.1916        | 57.0  | 17784 | 0.3434          | 0.7040   |
| 0.1892        | 58.0  | 18096 | 0.3402          | 0.7076   |
| 0.1892        | 59.0  | 18408 | 0.3489          | 0.7112   |
| 0.1921        | 60.0  | 18720 | 0.3485          | 0.7040   |
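Note that the final-epoch accuracy (0.7040) is not the best in the table: validation accuracy peaks at 0.7509 at epoch 27. If the training run kept per-epoch checkpoints, the best one can be selected with a simple scan over the logged metrics. A minimal sketch, using only a handful of (epoch, accuracy) rows from the table above:

```python
# Subset of (epoch, validation accuracy) rows from the results table above.
rows = [
    (17, 0.7220),
    (27, 0.7509),
    (28, 0.7292),
    (35, 0.7365),
    (60, 0.7040),
]

# Pick the epoch with the highest validation accuracy.
best_epoch, best_acc = max(rows, key=lambda r: r[1])
print(best_epoch, best_acc)  # → 27 0.7509
```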

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
