20230817010018

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3346
  • Accuracy: 0.6931
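
The card does not document which SuperGLUE task the model was fine-tuned on, so the snippet below is only a minimal loading sketch: the sequence-classification head and the sentence-pair input format are assumptions that may not match the actual fine-tuning setup.

```python
# Hedged sketch: assumes a sequence-classification fine-tune; the
# specific SuperGLUE task (and hence the true input format and label
# meanings) is not documented in this card.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/20230817010018"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Example sentence pair; the real preprocessing depends on the task.
inputs = tokenizer("First sentence.", "Second sentence.",
                   return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```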

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged configuration sketch follows the list):

  • learning_rate: 0.003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
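
For reference, these values map onto a transformers TrainingArguments roughly as sketched below; the output directory and anything not listed above are placeholders, not taken from the original run. The listed Adam betas and epsilon match the transformers defaults, so they are left as comments rather than set explicitly.

```python
# Hedged sketch: reconstructs the listed hyperparameters as a
# transformers TrainingArguments (transformers 4.30-era API).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="out",              # placeholder, not from the original run
    learning_rate=3e-3,            # learning_rate: 0.003
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    # Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08,
    # which matches the library defaults (adam_beta1/adam_beta2/adam_epsilon).
)
```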

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.4302          | 0.4585   |
| 0.5241        | 2.0   | 624   | 0.3721          | 0.5379   |
| 0.5241        | 3.0   | 936   | 0.4359          | 0.4693   |
| 0.4404        | 4.0   | 1248  | 0.4139          | 0.4729   |
| 0.444         | 5.0   | 1560  | 0.5513          | 0.5307   |
| 0.444         | 6.0   | 1872  | 0.3854          | 0.4729   |
| 0.4526        | 7.0   | 2184  | 0.3593          | 0.4729   |
| 0.4526        | 8.0   | 2496  | 0.3700          | 0.5271   |
| 0.4555        | 9.0   | 2808  | 0.4814          | 0.4693   |
| 0.4401        | 10.0  | 3120  | 0.4095          | 0.5271   |
| 0.4401        | 11.0  | 3432  | 0.5372          | 0.5415   |
| 0.438         | 12.0  | 3744  | 0.3496          | 0.5271   |
| 0.4381        | 13.0  | 4056  | 0.5447          | 0.5415   |
| 0.4381        | 14.0  | 4368  | 0.4662          | 0.5668   |
| 0.4127        | 15.0  | 4680  | 0.3524          | 0.6282   |
| 0.4127        | 16.0  | 4992  | 0.3402          | 0.6137   |
| 0.4123        | 17.0  | 5304  | 0.7254          | 0.5776   |
| 0.4017        | 18.0  | 5616  | 0.3577          | 0.5632   |
| 0.4017        | 19.0  | 5928  | 0.3274          | 0.6715   |
| 0.3919        | 20.0  | 6240  | 0.3557          | 0.6173   |
| 0.3628        | 21.0  | 6552  | 0.3646          | 0.4946   |
| 0.3628        | 22.0  | 6864  | 0.3489          | 0.5993   |
| 0.3556        | 23.0  | 7176  | 0.4147          | 0.6354   |
| 0.3556        | 24.0  | 7488  | 0.3447          | 0.6931   |
| 0.3508        | 25.0  | 7800  | 0.3240          | 0.6931   |
| 0.3419        | 26.0  | 8112  | 0.3411          | 0.6751   |
| 0.3419        | 27.0  | 8424  | 0.3374          | 0.6931   |
| 0.3398        | 28.0  | 8736  | 0.3280          | 0.6751   |
| 0.3426        | 29.0  | 9048  | 0.3681          | 0.6968   |
| 0.3426        | 30.0  | 9360  | 0.3634          | 0.6823   |
| 0.337         | 31.0  | 9672  | 0.3663          | 0.6570   |
| 0.337         | 32.0  | 9984  | 0.3359          | 0.6931   |
| 0.3369        | 33.0  | 10296 | 0.3239          | 0.6823   |
| 0.3335        | 34.0  | 10608 | 0.3313          | 0.7076   |
| 0.3335        | 35.0  | 10920 | 0.3246          | 0.7040   |
| 0.3307        | 36.0  | 11232 | 0.3624          | 0.6859   |
| 0.329         | 37.0  | 11544 | 0.3669          | 0.6823   |
| 0.329         | 38.0  | 11856 | 0.3467          | 0.7040   |
| 0.3287        | 39.0  | 12168 | 0.3498          | 0.6968   |
| 0.3287        | 40.0  | 12480 | 0.3408          | 0.6931   |
| 0.3264        | 41.0  | 12792 | 0.3236          | 0.7004   |
| 0.324         | 42.0  | 13104 | 0.3363          | 0.7112   |
| 0.324         | 43.0  | 13416 | 0.3384          | 0.6859   |
| 0.3244        | 44.0  | 13728 | 0.3388          | 0.6895   |
| 0.3226        | 45.0  | 14040 | 0.3335          | 0.7004   |
| 0.3226        | 46.0  | 14352 | 0.3314          | 0.7040   |
| 0.3222        | 47.0  | 14664 | 0.3278          | 0.7148   |
| 0.3222        | 48.0  | 14976 | 0.3407          | 0.6931   |
| 0.3186        | 49.0  | 15288 | 0.3328          | 0.7112   |
| 0.3183        | 50.0  | 15600 | 0.3363          | 0.7076   |
| 0.3183        | 51.0  | 15912 | 0.3318          | 0.7040   |
| 0.3153        | 52.0  | 16224 | 0.3305          | 0.7004   |
| 0.3152        | 53.0  | 16536 | 0.3502          | 0.6751   |
| 0.3152        | 54.0  | 16848 | 0.3396          | 0.6823   |
| 0.3144        | 55.0  | 17160 | 0.3282          | 0.7112   |
| 0.3144        | 56.0  | 17472 | 0.3449          | 0.6823   |
| 0.3134        | 57.0  | 17784 | 0.3301          | 0.7148   |
| 0.312         | 58.0  | 18096 | 0.3348          | 0.6931   |
| 0.312         | 59.0  | 18408 | 0.3352          | 0.6931   |
| 0.3118        | 60.0  | 18720 | 0.3346          | 0.6931   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1
  • Datasets 2.14.4
  • Tokenizers 0.13.3