
20230819211604

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3362
  • Accuracy: 0.7473
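
Since the card does not yet include a usage snippet, a minimal loading sketch is given below. It assumes the repository id `Onutoa/20230819211604` and a sequence-classification head (suggested by the accuracy metric); the card does not state which SuperGLUE task was used, so the sentence-pair input shown is only illustrative.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumed repository id; the classification head is an assumption based on
# the accuracy metric reported above.
model_name = "Onutoa/20230819211604"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Illustrative sentence-pair input; the actual SuperGLUE task is not stated.
inputs = tokenizer("Example premise.", "Example hypothesis.", return_tensors="pt")
outputs = model(**inputs)
predicted_label = outputs.logits.argmax(dim=-1).item()
print(predicted_label)
```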

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.004
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
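
As a sanity check, the schedule implied by these hyperparameters can be reconciled with the results table: 312 optimizer steps per epoch over 60 epochs gives the 18,720 total steps seen in the final row. The dataset-size estimate below is an inference from batch size and steps, not a figure stated on the card.

```python
# Sanity-check the training schedule implied by the hyperparameters above.
# steps_per_epoch is read off the results table (step 312 at epoch 1.0).
train_batch_size = 8
num_epochs = 60
steps_per_epoch = 312

total_steps = steps_per_epoch * num_epochs
train_examples_approx = steps_per_epoch * train_batch_size  # inferred, not stated

print(total_steps)            # 18720, the final step in the results table
print(train_examples_approx)  # roughly 2496 training examples
```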

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.4002          | 0.5307   |
| 0.545         | 2.0   | 624   | 0.4058          | 0.5379   |
| 0.545         | 3.0   | 936   | 0.3972          | 0.5379   |
| 0.4698        | 4.0   | 1248  | 0.4360          | 0.4729   |
| 0.4785        | 5.0   | 1560  | 0.3494          | 0.5090   |
| 0.4785        | 6.0   | 1872  | 0.4100          | 0.4729   |
| 0.4322        | 7.0   | 2184  | 0.5717          | 0.5307   |
| 0.4322        | 8.0   | 2496  | 0.4078          | 0.5379   |
| 0.3946        | 9.0   | 2808  | 0.3304          | 0.6570   |
| 0.36          | 10.0  | 3120  | 0.3318          | 0.6426   |
| 0.36          | 11.0  | 3432  | 0.3275          | 0.6931   |
| 0.3478        | 12.0  | 3744  | 0.3314          | 0.7148   |
| 0.3359        | 13.0  | 4056  | 0.3277          | 0.7112   |
| 0.3359        | 14.0  | 4368  | 0.3307          | 0.7148   |
| 0.3249        | 15.0  | 4680  | 0.3245          | 0.6968   |
| 0.3249        | 16.0  | 4992  | 0.3626          | 0.6498   |
| 0.3253        | 17.0  | 5304  | 0.3567          | 0.6859   |
| 0.3155        | 18.0  | 5616  | 0.3279          | 0.7112   |
| 0.3155        | 19.0  | 5928  | 0.3257          | 0.7256   |
| 0.3145        | 20.0  | 6240  | 0.3337          | 0.7112   |
| 0.3051        | 21.0  | 6552  | 0.3289          | 0.7365   |
| 0.3051        | 22.0  | 6864  | 0.3523          | 0.6931   |
| 0.3015        | 23.0  | 7176  | 0.3459          | 0.7040   |
| 0.3015        | 24.0  | 7488  | 0.3323          | 0.7076   |
| 0.2952        | 25.0  | 7800  | 0.3445          | 0.7329   |
| 0.289         | 26.0  | 8112  | 0.3554          | 0.7329   |
| 0.289         | 27.0  | 8424  | 0.3210          | 0.7292   |
| 0.2876        | 28.0  | 8736  | 0.3204          | 0.7365   |
| 0.2862        | 29.0  | 9048  | 0.3374          | 0.7509   |
| 0.2862        | 30.0  | 9360  | 0.3778          | 0.7112   |
| 0.2814        | 31.0  | 9672  | 0.3352          | 0.7401   |
| 0.2814        | 32.0  | 9984  | 0.3251          | 0.7256   |
| 0.2777        | 33.0  | 10296 | 0.3574          | 0.7617   |
| 0.2698        | 34.0  | 10608 | 0.3330          | 0.7292   |
| 0.2698        | 35.0  | 10920 | 0.3388          | 0.7220   |
| 0.2714        | 36.0  | 11232 | 0.3222          | 0.7329   |
| 0.2695        | 37.0  | 11544 | 0.3482          | 0.7473   |
| 0.2695        | 38.0  | 11856 | 0.3447          | 0.7437   |
| 0.2637        | 39.0  | 12168 | 0.3394          | 0.7401   |
| 0.2637        | 40.0  | 12480 | 0.3264          | 0.7401   |
| 0.2646        | 41.0  | 12792 | 0.3311          | 0.7401   |
| 0.2613        | 42.0  | 13104 | 0.3322          | 0.7365   |
| 0.2613        | 43.0  | 13416 | 0.3411          | 0.7473   |
| 0.2539        | 44.0  | 13728 | 0.3298          | 0.7581   |
| 0.2543        | 45.0  | 14040 | 0.3442          | 0.7437   |
| 0.2543        | 46.0  | 14352 | 0.3399          | 0.7545   |
| 0.2516        | 47.0  | 14664 | 0.3330          | 0.7473   |
| 0.2516        | 48.0  | 14976 | 0.3299          | 0.7473   |
| 0.2509        | 49.0  | 15288 | 0.3407          | 0.7401   |
| 0.2484        | 50.0  | 15600 | 0.3268          | 0.7581   |
| 0.2484        | 51.0  | 15912 | 0.3386          | 0.7509   |
| 0.2491        | 52.0  | 16224 | 0.3323          | 0.7581   |
| 0.2483        | 53.0  | 16536 | 0.3448          | 0.7473   |
| 0.2483        | 54.0  | 16848 | 0.3339          | 0.7545   |
| 0.2452        | 55.0  | 17160 | 0.3343          | 0.7473   |
| 0.2452        | 56.0  | 17472 | 0.3408          | 0.7509   |
| 0.2456        | 57.0  | 17784 | 0.3374          | 0.7545   |
| 0.2429        | 58.0  | 18096 | 0.3360          | 0.7473   |
| 0.2429        | 59.0  | 18408 | 0.3345          | 0.7545   |
| 0.2436        | 60.0  | 18720 | 0.3362          | 0.7473   |

Framework versions

  • Transformers 4.30.0
  • PyTorch 2.0.1
  • Datasets 2.14.4
  • Tokenizers 0.13.3

Model repository: Onutoa/20230819211604, trained on the super_glue dataset.