20230817153600

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3379
  • Accuracy: 0.7726
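
The card does not include a usage example. Below is a minimal inference sketch: the repository id Onutoa/20230817153600 is taken from this page, but the two-label sequence-classification head and the sentence-pair input format are assumptions, since the card does not name the SuperGLUE subtask.

```python
# Minimal inference sketch (assumptions: the checkpoint is a two-label
# sequence classifier over sentence pairs; the card does not confirm this).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/20230817153600"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

premise = "The cat sat on the mat."
hypothesis = "There is a cat on the mat."

# Encode the pair and run a forward pass without tracking gradients.
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```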

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
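
The training script itself is not part of the card. The sketch below shows how the hyperparameters above would map onto a Hugging Face Trainer run. The SuperGLUE subtask is not stated; "rte" is used purely as an illustrative placeholder (the 312 steps per epoch at batch size 8 imply roughly 2,500 training examples), and the preprocessing and metric code are assumptions, not the author's script.

```python
# Reproduction sketch, not the author's script. Assumptions: SuperGLUE
# subtask "rte" (chosen only for illustration) and standard sentence-pair
# preprocessing with dynamic padding via the Trainer's default collator.
import numpy as np
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

raw = load_dataset("super_glue", "rte")
tokenizer = AutoTokenizer.from_pretrained("bert-large-cased")

def preprocess(batch):
    # RTE pairs a premise with a hypothesis; other subtasks use other fields.
    return tokenizer(batch["premise"], batch["hypothesis"], truncation=True)

tokenized = raw.map(preprocess, batched=True)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": (preds == labels).mean()}

args = TrainingArguments(
    output_dir="out",
    learning_rate=5e-3,           # 0.005, as listed above
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",  # matches the per-epoch results table below
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the TrainingArguments
    # defaults, matching the optimizer settings listed above.
)

trainer = Trainer(
    model=AutoModelForSequenceClassification.from_pretrained("bert-large-cased"),
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
trainer.train()
```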

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.5320          | 0.5199   |
| 0.6084        | 2.0   | 624   | 0.5060          | 0.5307   |
| 0.6084        | 3.0   | 936   | 0.4765          | 0.4729   |
| 0.4786        | 4.0   | 1248  | 0.3862          | 0.4729   |
| 0.5253        | 5.0   | 1560  | 0.5091          | 0.5343   |
| 0.5253        | 6.0   | 1872  | 0.3768          | 0.4982   |
| 0.5144        | 7.0   | 2184  | 0.4406          | 0.5271   |
| 0.5144        | 8.0   | 2496  | 0.3461          | 0.6318   |
| 0.4407        | 9.0   | 2808  | 0.3480          | 0.6534   |
| 0.4002        | 10.0  | 3120  | 0.3629          | 0.6643   |
| 0.4002        | 11.0  | 3432  | 0.3949          | 0.5560   |
| 0.3576        | 12.0  | 3744  | 0.3366          | 0.7076   |
| 0.346         | 13.0  | 4056  | 0.3302          | 0.7040   |
| 0.346         | 14.0  | 4368  | 0.3293          | 0.7184   |
| 0.337         | 15.0  | 4680  | 0.3301          | 0.7292   |
| 0.337         | 16.0  | 4992  | 0.3398          | 0.7329   |
| 0.3323        | 17.0  | 5304  | 0.3555          | 0.7256   |
| 0.3245        | 18.0  | 5616  | 0.3257          | 0.7040   |
| 0.3245        | 19.0  | 5928  | 0.3257          | 0.7292   |
| 0.3243        | 20.0  | 6240  | 0.3507          | 0.7220   |
| 0.3144        | 21.0  | 6552  | 0.4047          | 0.7184   |
| 0.3144        | 22.0  | 6864  | 0.3620          | 0.7220   |
| 0.3135        | 23.0  | 7176  | 0.3740          | 0.7148   |
| 0.3135        | 24.0  | 7488  | 0.3315          | 0.7437   |
| 0.3063        | 25.0  | 7800  | 0.3291          | 0.7437   |
| 0.2986        | 26.0  | 8112  | 0.3626          | 0.7292   |
| 0.2986        | 27.0  | 8424  | 0.3281          | 0.7401   |
| 0.2956        | 28.0  | 8736  | 0.3376          | 0.7401   |
| 0.2927        | 29.0  | 9048  | 0.3310          | 0.7545   |
| 0.2927        | 30.0  | 9360  | 0.3471          | 0.7437   |
| 0.2853        | 31.0  | 9672  | 0.3205          | 0.7581   |
| 0.2853        | 32.0  | 9984  | 0.3271          | 0.7509   |
| 0.2861        | 33.0  | 10296 | 0.3423          | 0.7509   |
| 0.2782        | 34.0  | 10608 | 0.3328          | 0.7473   |
| 0.2782        | 35.0  | 10920 | 0.3289          | 0.7617   |
| 0.2756        | 36.0  | 11232 | 0.3309          | 0.7581   |
| 0.2758        | 37.0  | 11544 | 0.3741          | 0.7365   |
| 0.2758        | 38.0  | 11856 | 0.3326          | 0.7473   |
| 0.2714        | 39.0  | 12168 | 0.3611          | 0.7184   |
| 0.2714        | 40.0  | 12480 | 0.3352          | 0.7473   |
| 0.2687        | 41.0  | 12792 | 0.3405          | 0.7437   |
| 0.2685        | 42.0  | 13104 | 0.3408          | 0.7365   |
| 0.2685        | 43.0  | 13416 | 0.3414          | 0.7473   |
| 0.2649        | 44.0  | 13728 | 0.3369          | 0.7545   |
| 0.2615        | 45.0  | 14040 | 0.3371          | 0.7545   |
| 0.2615        | 46.0  | 14352 | 0.3428          | 0.7509   |
| 0.2602        | 47.0  | 14664 | 0.3286          | 0.7545   |
| 0.2602        | 48.0  | 14976 | 0.3316          | 0.7581   |
| 0.2595        | 49.0  | 15288 | 0.3401          | 0.7545   |
| 0.2551        | 50.0  | 15600 | 0.3362          | 0.7653   |
| 0.2551        | 51.0  | 15912 | 0.3434          | 0.7653   |
| 0.2574        | 52.0  | 16224 | 0.3302          | 0.7726   |
| 0.2515        | 53.0  | 16536 | 0.3464          | 0.7473   |
| 0.2515        | 54.0  | 16848 | 0.3337          | 0.7690   |
| 0.252         | 55.0  | 17160 | 0.3364          | 0.7690   |
| 0.252         | 56.0  | 17472 | 0.3418          | 0.7509   |
| 0.2497        | 57.0  | 17784 | 0.3407          | 0.7581   |
| 0.2503        | 58.0  | 18096 | 0.3419          | 0.7545   |
| 0.2503        | 59.0  | 18408 | 0.3376          | 0.7762   |
| 0.2504        | 60.0  | 18720 | 0.3379          | 0.7726   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1
  • Datasets 2.14.4
  • Tokenizers 0.13.3