20230822163753

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. At the end of training (epoch 60), it achieves the following results on the evaluation set:

  • Loss: 0.3363
  • Accuracy: 0.7256

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
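
As a rough illustration of what the linear scheduler does with these settings, the sketch below computes the learning rate after a given number of optimizer steps. It assumes zero warmup steps (no warmup value is listed in this card) and takes the total step count, 18720, from the last row of the training results table. The function name linear_lr is hypothetical and is not part of any library.

```python
# Values taken from this card: learning_rate, num_epochs, and the step
# counts reported in the training results (312 steps per epoch).
BASE_LR = 0.003
NUM_EPOCHS = 60
STEPS_PER_EPOCH = 312
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 18720, matches the final logged step

# Assumption: no warmup, since the card does not list a warmup setting.
WARMUP_STEPS = 0


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate after `step` optimizer updates: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the learning rate at the start, at the midpoint, and at the end of training.
    for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 0.003, is about half that at step 9360 (epoch 30), and reaches 0 at step 18720.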

Training results

Training Loss  Epoch  Step   Validation Loss  Accuracy
No log         1.0    312    0.6253           0.5307
0.4958         2.0    624    0.3817           0.5415
0.4958         3.0    936    0.5426           0.4729
0.4406         4.0    1248   0.7363           0.5379
0.4205         5.0    1560   0.3395           0.6498
0.4205         6.0    1872   0.3422           0.6354
0.4134         7.0    2184   0.4093           0.5487
0.4134         8.0    2496   0.4435           0.5487
0.4124         9.0    2808   0.3364           0.6065
0.3904         10.0   3120   0.3570           0.6029
0.3904         11.0   3432   0.3988           0.5596
0.376          12.0   3744   0.3339           0.6751
0.3501         13.0   4056   0.3348           0.6606
0.3501         14.0   4368   0.3288           0.6715
0.3336         15.0   4680   0.3261           0.6823
0.3336         16.0   4992   0.3326           0.7040
0.333          17.0   5304   0.3264           0.7112
0.3259         18.0   5616   0.3259           0.6968
0.3259         19.0   5928   0.3253           0.6643
0.3281         20.0   6240   0.3261           0.7184
0.3191         21.0   6552   0.3227           0.7220
0.3191         22.0   6864   0.3371           0.6931
0.3164         23.0   7176   0.3522           0.6895
0.3164         24.0   7488   0.3275           0.7040
0.3133         25.0   7800   0.3234           0.7329
0.308          26.0   8112   0.3352           0.6931
0.308          27.0   8424   0.3167           0.7184
0.3075         28.0   8736   0.3378           0.6968
0.3064         29.0   9048   0.3370           0.7112
0.3064         30.0   9360   0.3432           0.7004
0.3021         31.0   9672   0.3305           0.7148
0.3021         32.0   9984   0.3218           0.7220
0.2983         33.0   10296  0.3349           0.7112
0.2933         34.0   10608  0.3208           0.7256
0.2933         35.0   10920  0.3243           0.7220
0.2931         36.0   11232  0.3206           0.7292
0.2903         37.0   11544  0.3643           0.6895
0.2903         38.0   11856  0.3254           0.7473
0.2895         39.0   12168  0.3350           0.7148
0.2895         40.0   12480  0.3325           0.7076
0.2852         41.0   12792  0.3289           0.7256
0.2857         42.0   13104  0.3281           0.7256
0.2857         43.0   13416  0.3373           0.7184
0.2805         44.0   13728  0.3414           0.7040
0.2806         45.0   14040  0.3346           0.7292
0.2806         46.0   14352  0.3383           0.7220
0.2777         47.0   14664  0.3285           0.7220
0.2777         48.0   14976  0.3385           0.7148
0.2768         49.0   15288  0.3403           0.7148
0.2732         50.0   15600  0.3336           0.7256
0.2732         51.0   15912  0.3306           0.7184
0.274          52.0   16224  0.3300           0.7292
0.272          53.0   16536  0.3318           0.7220
0.272          54.0   16848  0.3403           0.7220
0.2701         55.0   17160  0.3252           0.7292
0.2701         56.0   17472  0.3391           0.7220
0.2695         57.0   17784  0.3304           0.7292
0.2694         58.0   18096  0.3300           0.7220
0.2694         59.0   18408  0.3347           0.7292
0.2689         60.0   18720  0.3363           0.7256
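
The reported results correspond to the final epoch, which is not the best row in the table: the highest logged accuracy is 0.7473 at epoch 38, and the lowest validation loss is 0.3167 at epoch 27. A minimal sketch of how such rows could be picked out is shown below. It uses only a four-row subset copied from the table, and the names rows, best_by_accuracy and best_by_loss are illustrative.

```python
# Subset of rows copied from the training results table:
# (epoch, step, validation_loss, accuracy)
rows = [
    (25, 7800, 0.3234, 0.7329),
    (27, 8424, 0.3167, 0.7184),
    (38, 11856, 0.3254, 0.7473),
    (60, 18720, 0.3363, 0.7256),  # final epoch, the one reported at the top of the card
]

# Row with the highest evaluation accuracy.
best_by_accuracy = max(rows, key=lambda r: r[3])

# Row with the lowest validation loss.
best_by_loss = min(rows, key=lambda r: r[2])

if __name__ == "__main__":
    print("best accuracy:", best_by_accuracy)
    print("lowest validation loss:", best_by_loss)
```

Whether an earlier checkpoint was saved or is available is not stated in this card; the sketch only compares the logged metrics.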

Framework versions

  • Transformers 4.26.1
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3

Dataset used to train dkqjrm/20230822163753