20230825045636

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. After the final training epoch (epoch 80, step 12480), it achieves the following results on the evaluation set:

  • Loss: 0.4379
  • Accuracy: 0.7690

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
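
The learning-rate schedule implied by these settings can be sketched in plain Python. This is a minimal sketch, not the training code itself: it assumes no warmup steps (the card lists none), a single device, no gradient accumulation, and that the last partial batch is kept. `linear_lr` and `dataset_size_bounds` are hypothetical helper names. The 156 steps per epoch and 12480 total steps are taken from the training results table.

```python
BASE_LR = 0.005          # learning_rate from the card
TRAIN_BATCH_SIZE = 16    # train_batch_size from the card
NUM_EPOCHS = 80          # num_epochs from the card
STEPS_PER_EPOCH = 156    # step count logged at epoch 1.0
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # matches the final logged step


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup, then linear decay to zero (warmup_steps=0 is an assumption)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def dataset_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest training-set size n with ceil(n / batch_size) steps."""
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


if __name__ == "__main__":
    for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(step, linear_lr(step))
    print(dataset_size_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE))
```

Under these assumptions the learning rate starts at 0.005, is halved by the midpoint of training, and reaches zero at the last step. The step count per epoch is consistent with a training set of between 2481 and 2496 examples.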

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| No log | 1.0 | 156 | 1.3576 | 0.5307 |
| No log | 2.0 | 312 | 0.9952 | 0.4693 |
| No log | 3.0 | 468 | 1.0581 | 0.4765 |
| 0.907 | 4.0 | 624 | 0.8017 | 0.5343 |
| 0.907 | 5.0 | 780 | 0.6566 | 0.5451 |
| 0.907 | 6.0 | 936 | 0.5420 | 0.6245 |
| 0.8287 | 7.0 | 1092 | 0.5092 | 0.6173 |
| 0.8287 | 8.0 | 1248 | 0.4948 | 0.6462 |
| 0.8287 | 9.0 | 1404 | 0.4754 | 0.6895 |
| 0.7327 | 10.0 | 1560 | 0.7416 | 0.6173 |
| 0.7327 | 11.0 | 1716 | 1.1722 | 0.4621 |
| 0.7327 | 12.0 | 1872 | 0.5543 | 0.6895 |
| 0.7276 | 13.0 | 2028 | 0.4895 | 0.6931 |
| 0.7276 | 14.0 | 2184 | 0.4304 | 0.7148 |
| 0.7276 | 15.0 | 2340 | 0.4261 | 0.7401 |
| 0.7276 | 16.0 | 2496 | 0.4467 | 0.6859 |
| 0.6207 | 17.0 | 2652 | 0.4700 | 0.7184 |
| 0.6207 | 18.0 | 2808 | 0.6254 | 0.6751 |
| 0.6207 | 19.0 | 2964 | 0.5108 | 0.7292 |
| 0.5699 | 20.0 | 3120 | 0.7519 | 0.6354 |
| 0.5699 | 21.0 | 3276 | 0.4584 | 0.7184 |
| 0.5699 | 22.0 | 3432 | 0.8289 | 0.6318 |
| 0.5829 | 23.0 | 3588 | 0.4071 | 0.7148 |
| 0.5829 | 24.0 | 3744 | 0.4575 | 0.7365 |
| 0.5829 | 25.0 | 3900 | 0.5062 | 0.6895 |
| 0.4913 | 26.0 | 4056 | 0.5308 | 0.7220 |
| 0.4913 | 27.0 | 4212 | 0.4907 | 0.7473 |
| 0.4913 | 28.0 | 4368 | 0.4703 | 0.7365 |
| 0.4679 | 29.0 | 4524 | 0.4244 | 0.7148 |
| 0.4679 | 30.0 | 4680 | 0.4450 | 0.7365 |
| 0.4679 | 31.0 | 4836 | 0.6184 | 0.6968 |
| 0.4679 | 32.0 | 4992 | 0.4378 | 0.7437 |
| 0.4377 | 33.0 | 5148 | 0.4118 | 0.7437 |
| 0.4377 | 34.0 | 5304 | 0.4272 | 0.7437 |
| 0.4377 | 35.0 | 5460 | 0.3998 | 0.7473 |
| 0.4076 | 36.0 | 5616 | 0.5180 | 0.7581 |
| 0.4076 | 37.0 | 5772 | 0.4967 | 0.7581 |
| 0.4076 | 38.0 | 5928 | 0.4595 | 0.7437 |
| 0.372 | 39.0 | 6084 | 0.5050 | 0.7329 |
| 0.372 | 40.0 | 6240 | 0.3900 | 0.7401 |
| 0.372 | 41.0 | 6396 | 0.4596 | 0.7545 |
| 0.3201 | 42.0 | 6552 | 0.4917 | 0.7690 |
| 0.3201 | 43.0 | 6708 | 0.4171 | 0.7870 |
| 0.3201 | 44.0 | 6864 | 0.4851 | 0.7256 |
| 0.3284 | 45.0 | 7020 | 0.4763 | 0.7401 |
| 0.3284 | 46.0 | 7176 | 0.4541 | 0.7581 |
| 0.3284 | 47.0 | 7332 | 0.4909 | 0.7509 |
| 0.3284 | 48.0 | 7488 | 0.5488 | 0.7329 |
| 0.2809 | 49.0 | 7644 | 0.5422 | 0.7473 |
| 0.2809 | 50.0 | 7800 | 0.4695 | 0.7653 |
| 0.2809 | 51.0 | 7956 | 0.5016 | 0.7581 |
| 0.275 | 52.0 | 8112 | 0.4627 | 0.7690 |
| 0.275 | 53.0 | 8268 | 0.4886 | 0.7401 |
| 0.275 | 54.0 | 8424 | 0.4425 | 0.7690 |
| 0.2456 | 55.0 | 8580 | 0.4289 | 0.7653 |
| 0.2456 | 56.0 | 8736 | 0.4891 | 0.7545 |
| 0.2456 | 57.0 | 8892 | 0.4477 | 0.7437 |
| 0.2328 | 58.0 | 9048 | 0.4510 | 0.7581 |
| 0.2328 | 59.0 | 9204 | 0.5283 | 0.7581 |
| 0.2328 | 60.0 | 9360 | 0.4405 | 0.7653 |
| 0.222 | 61.0 | 9516 | 0.5418 | 0.7509 |
| 0.222 | 62.0 | 9672 | 0.4933 | 0.7617 |
| 0.222 | 63.0 | 9828 | 0.4399 | 0.7653 |
| 0.222 | 64.0 | 9984 | 0.4490 | 0.7726 |
| 0.2174 | 65.0 | 10140 | 0.4820 | 0.7581 |
| 0.2174 | 66.0 | 10296 | 0.4732 | 0.7726 |
| 0.2174 | 67.0 | 10452 | 0.4712 | 0.7690 |
| 0.2075 | 68.0 | 10608 | 0.4847 | 0.7545 |
| 0.2075 | 69.0 | 10764 | 0.4704 | 0.7509 |
| 0.2075 | 70.0 | 10920 | 0.4855 | 0.7581 |
| 0.1987 | 71.0 | 11076 | 0.4845 | 0.7617 |
| 0.1987 | 72.0 | 11232 | 0.4724 | 0.7617 |
| 0.1987 | 73.0 | 11388 | 0.4272 | 0.7690 |
| 0.1845 | 74.0 | 11544 | 0.4324 | 0.7653 |
| 0.1845 | 75.0 | 11700 | 0.4343 | 0.7726 |
| 0.1845 | 76.0 | 11856 | 0.4407 | 0.7762 |
| 0.1835 | 77.0 | 12012 | 0.4185 | 0.7726 |
| 0.1835 | 78.0 | 12168 | 0.4363 | 0.7762 |
| 0.1835 | 79.0 | 12324 | 0.4328 | 0.7762 |
| 0.1835 | 80.0 | 12480 | 0.4379 | 0.7690 |
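
Validation accuracy fluctuates from epoch to epoch, so the final checkpoint is not necessarily the strongest one. The sketch below shows one way to pick a checkpoint from the log. It is illustrative only: `ROWS` holds a handful of rows copied from the table above rather than the full log, and `best_by` is a hypothetical helper, not part of the training script.

```python
# (epoch, step, validation_loss, accuracy) rows copied from the training log.
ROWS = [
    (15, 2340, 0.4261, 0.7401),
    (35, 5460, 0.3998, 0.7473),
    (40, 6240, 0.3900, 0.7401),
    (43, 6708, 0.4171, 0.7870),
    (76, 11856, 0.4407, 0.7762),
    (80, 12480, 0.4379, 0.7690),
]

VAL_LOSS, ACCURACY = 2, 3  # column indices within each row


def best_by(rows, column, highest=True):
    """Return the row with the highest (or lowest) value in the given column."""
    chooser = max if highest else min
    return chooser(rows, key=lambda row: row[column])


if __name__ == "__main__":
    print("best accuracy:", best_by(ROWS, ACCURACY))
    print("lowest validation loss:", best_by(ROWS, VAL_LOSS, highest=False))
    print("final epoch:", ROWS[-1])
```

In the full table, the highest accuracy (0.7870) is logged at epoch 43 and the lowest validation loss (0.3900) at epoch 40, while the reported final-epoch accuracy is 0.7690.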

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
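
To reproduce the environment, it can help to compare installed packages against the versions listed above. The following is a minimal sketch using only the standard library. The PyPI distribution names in `EXPECTED` (for example `torch` for Pytorch) are assumptions, and `find_mismatches` is a hypothetical helper.

```python
from importlib import metadata

# Versions listed in the card; the distribution names are assumptions.
EXPECTED = {
    "transformers": "4.26.1",
    "torch": "2.0.1",  # the card lists "2.0.1+cu118" (a CUDA 11.8 build)
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}


def find_mismatches(expected=EXPECTED, lookup=metadata.version):
    """Return {package: installed version or None} for packages that differ."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore local build tags such as "+cu118" when comparing.
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[name] = installed
    return mismatches


if __name__ == "__main__":
    print(find_mismatches())
```

An empty result means every listed package is installed at the version given in the card; a value of `None` marks a package that is not installed.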

Dataset used to train dkqjrm/20230825045636