
dkqjrm/20230825183854

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a loading sketch follows the list):

  • Loss: 0.3677
  • Accuracy: 0.7329
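
The card does not record which SuperGLUE task the checkpoint was fine-tuned on, so any usage example is necessarily a sketch. The snippet below shows one plausible way to load the checkpoint as a sequence classifier with the standard Transformers classes; the example sentence pair and the meaning of the output classes are illustrative assumptions, not part of this card.

```python
# Minimal loading sketch. The model id comes from this card; the input pair
# and label interpretation are assumptions, since the card does not state
# which SuperGLUE task was used.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230825183854"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Hypothetical sentence-pair input in the style of a SuperGLUE task such as RTE.
inputs = tokenizer(
    "The cat sat on the mat.",
    "A cat is on a mat.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # per-class probabilities
```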

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reconstruction as TrainingArguments follows the list):

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
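
The original training script is not included in this card, so the following is a reconstruction rather than the authoritative configuration: a sketch of TrainingArguments (Transformers 4.26 Trainer API) matching the values listed above. Options not listed, such as output_dir and evaluation_strategy, are assumptions chosen to match the per-epoch evaluation table below.

```python
# Reconstruction of the hyperparameters above as Transformers TrainingArguments.
# Values marked "listed above" come from this card; everything else is assumed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230825183854",     # assumed; not stated in the card
    learning_rate=5e-3,              # listed above: 0.005
    per_device_train_batch_size=16,  # listed above: train_batch_size 16
    per_device_eval_batch_size=8,    # listed above: eval_batch_size 8
    seed=11,                         # listed above: seed 11
    adam_beta1=0.9,                  # listed above: Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,               # listed above: epsilon 1e-08
    lr_scheduler_type="linear",      # listed above
    num_train_epochs=80.0,           # listed above: num_epochs 80.0
    evaluation_strategy="epoch",     # assumed from the per-epoch results table
    logging_steps=500,               # Trainer default; would explain the early
                                     # "No log" training-loss rows below
)
```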

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 156 | 0.6538 | 0.5307 |
| No log | 2.0 | 312 | 0.6933 | 0.5162 |
| No log | 3.0 | 468 | 0.7141 | 0.4585 |
| 0.8733 | 4.0 | 624 | 0.6298 | 0.5343 |
| 0.8733 | 5.0 | 780 | 0.6732 | 0.5343 |
| 0.8733 | 6.0 | 936 | 0.5740 | 0.6137 |
| 0.8394 | 7.0 | 1092 | 0.7296 | 0.5632 |
| 0.8394 | 8.0 | 1248 | 0.8035 | 0.5668 |
| 0.8394 | 9.0 | 1404 | 0.6425 | 0.6209 |
| 0.7591 | 10.0 | 1560 | 0.4622 | 0.6643 |
| 0.7591 | 11.0 | 1716 | 0.4437 | 0.6859 |
| 0.7591 | 12.0 | 1872 | 0.4827 | 0.6787 |
| 0.6772 | 13.0 | 2028 | 0.5774 | 0.6715 |
| 0.6772 | 14.0 | 2184 | 0.4063 | 0.7112 |
| 0.6772 | 15.0 | 2340 | 0.5000 | 0.6498 |
| 0.6772 | 16.0 | 2496 | 0.4834 | 0.6570 |
| 0.6497 | 17.0 | 2652 | 0.5429 | 0.6931 |
| 0.6497 | 18.0 | 2808 | 0.4595 | 0.7148 |
| 0.6497 | 19.0 | 2964 | 0.3976 | 0.6787 |
| 0.6063 | 20.0 | 3120 | 0.3676 | 0.7004 |
| 0.6063 | 21.0 | 3276 | 0.4152 | 0.7329 |
| 0.6063 | 22.0 | 3432 | 0.4491 | 0.6643 |
| 0.5763 | 23.0 | 3588 | 0.4205 | 0.6968 |
| 0.5763 | 24.0 | 3744 | 0.3677 | 0.7112 |
| 0.5763 | 25.0 | 3900 | 0.4396 | 0.6606 |
| 0.5433 | 26.0 | 4056 | 0.3519 | 0.7292 |
| 0.5433 | 27.0 | 4212 | 0.4936 | 0.7329 |
| 0.5433 | 28.0 | 4368 | 0.5706 | 0.6209 |
| 0.5217 | 29.0 | 4524 | 0.5359 | 0.6643 |
| 0.5217 | 30.0 | 4680 | 0.3722 | 0.7256 |
| 0.5217 | 31.0 | 4836 | 0.4510 | 0.6498 |
| 0.5217 | 32.0 | 4992 | 0.4153 | 0.7076 |
| 0.4772 | 33.0 | 5148 | 0.4060 | 0.7292 |
| 0.4772 | 34.0 | 5304 | 0.4248 | 0.7112 |
| 0.4772 | 35.0 | 5460 | 0.3862 | 0.7184 |
| 0.46 | 36.0 | 5616 | 0.4376 | 0.6715 |
| 0.46 | 37.0 | 5772 | 0.4369 | 0.6751 |
| 0.46 | 38.0 | 5928 | 0.3735 | 0.7112 |
| 0.4145 | 39.0 | 6084 | 0.3600 | 0.7256 |
| 0.4145 | 40.0 | 6240 | 0.3753 | 0.7401 |
| 0.4145 | 41.0 | 6396 | 0.4377 | 0.7437 |
| 0.4086 | 42.0 | 6552 | 0.4095 | 0.7509 |
| 0.4086 | 43.0 | 6708 | 0.4555 | 0.7112 |
| 0.4086 | 44.0 | 6864 | 0.4092 | 0.7365 |
| 0.3716 | 45.0 | 7020 | 0.4073 | 0.6968 |
| 0.3716 | 46.0 | 7176 | 0.4190 | 0.7220 |
| 0.3716 | 47.0 | 7332 | 0.4445 | 0.7617 |
| 0.3716 | 48.0 | 7488 | 0.4113 | 0.7112 |
| 0.3526 | 49.0 | 7644 | 0.4075 | 0.7184 |
| 0.3526 | 50.0 | 7800 | 0.3924 | 0.7437 |
| 0.3526 | 51.0 | 7956 | 0.3993 | 0.7184 |
| 0.3175 | 52.0 | 8112 | 0.4196 | 0.7292 |
| 0.3175 | 53.0 | 8268 | 0.4894 | 0.6931 |
| 0.3175 | 54.0 | 8424 | 0.4043 | 0.7256 |
| 0.3204 | 55.0 | 8580 | 0.4841 | 0.6895 |
| 0.3204 | 56.0 | 8736 | 0.3880 | 0.7220 |
| 0.3204 | 57.0 | 8892 | 0.5248 | 0.7040 |
| 0.3093 | 58.0 | 9048 | 0.3957 | 0.7220 |
| 0.3093 | 59.0 | 9204 | 0.4407 | 0.7292 |
| 0.3093 | 60.0 | 9360 | 0.3696 | 0.7292 |
| 0.3068 | 61.0 | 9516 | 0.3891 | 0.7148 |
| 0.3068 | 62.0 | 9672 | 0.4251 | 0.7220 |
| 0.3068 | 63.0 | 9828 | 0.4027 | 0.7509 |
| 0.3068 | 64.0 | 9984 | 0.3926 | 0.7329 |
| 0.2853 | 65.0 | 10140 | 0.3853 | 0.7329 |
| 0.2853 | 66.0 | 10296 | 0.3718 | 0.7329 |
| 0.2853 | 67.0 | 10452 | 0.3739 | 0.7401 |
| 0.2705 | 68.0 | 10608 | 0.3705 | 0.7653 |
| 0.2705 | 69.0 | 10764 | 0.3788 | 0.7365 |
| 0.2705 | 70.0 | 10920 | 0.3832 | 0.7329 |
| 0.2643 | 71.0 | 11076 | 0.3846 | 0.7509 |
| 0.2643 | 72.0 | 11232 | 0.3731 | 0.7545 |
| 0.2643 | 73.0 | 11388 | 0.3909 | 0.7329 |
| 0.2604 | 74.0 | 11544 | 0.3711 | 0.7437 |
| 0.2604 | 75.0 | 11700 | 0.3693 | 0.7437 |
| 0.2604 | 76.0 | 11856 | 0.3797 | 0.7292 |
| 0.2573 | 77.0 | 12012 | 0.3761 | 0.7329 |
| 0.2573 | 78.0 | 12168 | 0.3799 | 0.7220 |
| 0.2573 | 79.0 | 12324 | 0.3657 | 0.7473 |
| 0.2573 | 80.0 | 12480 | 0.3677 | 0.7329 |

Framework versions

  • Transformers 4.26.1
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
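
To reproduce the environment, the versions above can be pinned at install time; the short check below simply prints the locally installed versions against the ones listed. The +cu118 suffix on PyTorch denotes the CUDA 11.8 build, which depends on your platform.

```python
# Compare locally installed versions against the versions listed above.
# Pinned installs, e.g. `pip install transformers==4.26.1 datasets==2.12.0
# tokenizers==0.13.3`, are one way to match them; the exact PyTorch build
# (2.0.1+cu118) depends on your CUDA setup.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.26.1",
    "torch": "2.0.1+cu118",
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}
for module in (transformers, torch, datasets, tokenizers):
    name = module.__name__
    print(f"{name}: installed {module.__version__}, expected {expected[name]}")
```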