20230816190102

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. At the end of training (epoch 60), it achieves the following results on the evaluation set:

  • Loss: 0.3431
  • Accuracy: 0.7004

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
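
As a rough illustration of how the linear scheduler interacts with the learning rate listed above, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (the card lists none) and decay to zero at the final step; the total of 18720 steps and the 312 steps per epoch are read from the training results table. The function name `linear_lr` is hypothetical and not part of any library.

```python
# Values from the card; the no-warmup, decay-to-zero behaviour is an assumption.
BASE_LR = 0.005        # learning_rate
STEPS_PER_EPOCH = 312  # step count after epoch 1 in the results table
TOTAL_STEPS = 18720    # step count after epoch 60 (60 * 312)


def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate after `step` optimizer updates under linear decay to zero."""
    if not 0 <= step <= total_steps:
        raise ValueError(f"step must be in [0, {total_steps}], got {step}")
    return base_lr * (total_steps - step) / total_steps


if __name__ == "__main__":
    # Print the scheduled learning rate at a few epoch boundaries.
    for epoch in (0, 15, 30, 45, 60):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.6f}")
```

Under these assumptions the learning rate falls to half its initial value at epoch 30 and reaches zero at the last step.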

Training results

Training Loss | Epoch | Step | Validation Loss | Accuracy
No log | 1.0 | 312 | 0.5884 | 0.5235
0.6001 | 2.0 | 624 | 0.4145 | 0.4729
0.6001 | 3.0 | 936 | 0.6337 | 0.4729
0.5343 | 4.0 | 1248 | 0.3934 | 0.4838
0.5255 | 5.0 | 1560 | 0.5662 | 0.4729
0.5255 | 6.0 | 1872 | 0.5158 | 0.5271
0.504 | 7.0 | 2184 | 0.3480 | 0.5343
0.504 | 8.0 | 2496 | 0.3846 | 0.5379
0.4941 | 9.0 | 2808 | 0.5111 | 0.5307
0.5022 | 10.0 | 3120 | 0.4621 | 0.5271
0.5022 | 11.0 | 3432 | 0.3418 | 0.6426
0.453 | 12.0 | 3744 | 0.3652 | 0.5632
0.3879 | 13.0 | 4056 | 0.3451 | 0.5596
0.3879 | 14.0 | 4368 | 0.3312 | 0.6426
0.3698 | 15.0 | 4680 | 0.3599 | 0.6462
0.3698 | 16.0 | 4992 | 0.3947 | 0.5993
0.3705 | 17.0 | 5304 | 0.3833 | 0.6173
0.3598 | 18.0 | 5616 | 0.3354 | 0.6462
0.3598 | 19.0 | 5928 | 0.3395 | 0.6715
0.3631 | 20.0 | 6240 | 0.3664 | 0.6390
0.3515 | 21.0 | 6552 | 0.3420 | 0.6787
0.3515 | 22.0 | 6864 | 0.3483 | 0.6137
0.3486 | 23.0 | 7176 | 0.3820 | 0.6498
0.3486 | 24.0 | 7488 | 0.3240 | 0.7004
0.3437 | 25.0 | 7800 | 0.3300 | 0.7148
0.3389 | 26.0 | 8112 | 0.3405 | 0.6787
0.3389 | 27.0 | 8424 | 0.3291 | 0.6968
0.3363 | 28.0 | 8736 | 0.3338 | 0.6895
0.3381 | 29.0 | 9048 | 0.3366 | 0.7220
0.3381 | 30.0 | 9360 | 0.3831 | 0.6606
0.3302 | 31.0 | 9672 | 0.3300 | 0.7040
0.3302 | 32.0 | 9984 | 0.3224 | 0.7040
0.33 | 33.0 | 10296 | 0.3332 | 0.6787
0.3271 | 34.0 | 10608 | 0.3412 | 0.7256
0.3271 | 35.0 | 10920 | 0.3197 | 0.7076
0.3266 | 36.0 | 11232 | 0.3236 | 0.7148
0.3248 | 37.0 | 11544 | 0.3621 | 0.6751
0.3248 | 38.0 | 11856 | 0.3330 | 0.7040
0.3223 | 39.0 | 12168 | 0.3636 | 0.6823
0.3223 | 40.0 | 12480 | 0.3298 | 0.7076
0.3205 | 41.0 | 12792 | 0.3224 | 0.7148
0.3177 | 42.0 | 13104 | 0.3288 | 0.7256
0.3177 | 43.0 | 13416 | 0.3464 | 0.6823
0.3167 | 44.0 | 13728 | 0.3567 | 0.6787
0.3159 | 45.0 | 14040 | 0.3551 | 0.6895
0.3159 | 46.0 | 14352 | 0.3313 | 0.7112
0.3131 | 47.0 | 14664 | 0.3233 | 0.7292
0.3131 | 48.0 | 14976 | 0.3508 | 0.6751
0.3118 | 49.0 | 15288 | 0.3420 | 0.7040
0.3088 | 50.0 | 15600 | 0.3410 | 0.6968
0.3088 | 51.0 | 15912 | 0.3421 | 0.7040
0.3082 | 52.0 | 16224 | 0.3411 | 0.7040
0.3068 | 53.0 | 16536 | 0.3616 | 0.6823
0.3068 | 54.0 | 16848 | 0.3555 | 0.6715
0.3031 | 55.0 | 17160 | 0.3418 | 0.7004
0.3031 | 56.0 | 17472 | 0.3460 | 0.6859
0.3039 | 57.0 | 17784 | 0.3353 | 0.7148
0.3025 | 58.0 | 18096 | 0.3450 | 0.7004
0.3025 | 59.0 | 18408 | 0.3427 | 0.7040
0.3034 | 60.0 | 18720 | 0.3431 | 0.7004
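
The log above shows validation accuracy fluctuating from epoch to epoch, so the final epoch is not necessarily the strongest checkpoint. The sketch below shows one way to pick a checkpoint from such a log, using a hand-copied excerpt of rows from the table. The card does not state which checkpoint was saved; the `EvalRow` type and the helper names are hypothetical, introduced only for illustration.

```python
from typing import NamedTuple, Sequence


class EvalRow(NamedTuple):
    epoch: int
    val_loss: float
    accuracy: float


# Excerpt of the training results table (not the full log).
ROWS = [
    EvalRow(24, 0.3240, 0.7004),
    EvalRow(29, 0.3366, 0.7220),
    EvalRow(34, 0.3412, 0.7256),
    EvalRow(35, 0.3197, 0.7076),
    EvalRow(42, 0.3288, 0.7256),
    EvalRow(47, 0.3233, 0.7292),
    EvalRow(60, 0.3431, 0.7004),
]


def best_by_accuracy(rows: Sequence[EvalRow]) -> EvalRow:
    """Row with the highest validation accuracy (first one wins on ties)."""
    return max(rows, key=lambda r: r.accuracy)


def best_by_loss(rows: Sequence[EvalRow]) -> EvalRow:
    """Row with the lowest validation loss (first one wins on ties)."""
    return min(rows, key=lambda r: r.val_loss)


if __name__ == "__main__":
    print("best accuracy:", best_by_accuracy(ROWS))
    print("lowest val loss:", best_by_loss(ROWS))
    print("final epoch:", ROWS[-1])
```

On this excerpt, the two criteria select different epochs (47 by accuracy, 35 by validation loss), and both differ from the final epoch whose metrics are reported at the top of the card.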

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1
  • Datasets 2.14.4
  • Tokenizers 0.13.3

Dataset used to train Onutoa/20230816190102