20230824063515

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. After the final training epoch (epoch 60), it achieves the following results on the evaluation set:

  • Loss: 0.2971
  • Accuracy: 0.7437

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
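
To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists no warmup setting) and takes the step counts from the training results in this card: 312 steps per epoch and 18720 steps in total. The function name `linear_lr` and its parameters are illustrative, not part of any library.

```python
def linear_lr(step: int, base_lr: float = 0.003, total_steps: int = 18720,
              warmup_steps: int = 0) -> float:
    """Learning rate after `step` optimizer updates: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        # Warmup phase (unused here, since warmup_steps is assumed to be 0).
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in (0, 15, 30, 45, 60):
        step = epoch * 312  # 312 optimizer steps per epoch, per the results table
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.6f}")
```

Under these assumptions the rate starts at 0.003, is half that value midway through training (step 9360), and reaches zero at the last step.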

Training results

Training Loss  Epoch  Step   Validation Loss  Accuracy
No log         1.0    312    0.5375           0.5307
0.6046         2.0    624    0.6540           0.4729
0.6046         3.0    936    0.4055           0.5415
0.5378         4.0    1248   0.3920           0.5957
0.5028         5.0    1560   0.4366           0.5921
0.5028         6.0    1872   0.3927           0.6498
0.4686         7.0    2184   0.4005           0.6715
0.4686         8.0    2496   0.3381           0.6643
0.434          9.0    2808   0.3351           0.6679
0.4165         10.0   3120   0.4170           0.6282
0.4165         11.0   3432   0.4045           0.6462
0.4099         12.0   3744   0.4218           0.6895
0.3978         13.0   4056   0.3215           0.7184
0.3978         14.0   4368   0.3361           0.7256
0.3771         15.0   4680   0.4252           0.6426
0.3771         16.0   4992   0.3370           0.7148
0.3682         17.0   5304   0.7211           0.6498
0.3718         18.0   5616   0.3221           0.7004
0.3718         19.0   5928   0.3008           0.7220
0.3568         20.0   6240   0.3129           0.7256
0.325          21.0   6552   0.5513           0.6895
0.325          22.0   6864   0.3316           0.7040
0.3157         23.0   7176   0.4315           0.6968
0.3157         24.0   7488   0.3027           0.7545
0.2914         25.0   7800   0.3060           0.7545
0.2811         26.0   8112   0.3481           0.7365
0.2811         27.0   8424   0.3148           0.7401
0.2657         28.0   8736   0.3024           0.7401
0.265          29.0   9048   0.3254           0.7509
0.265          30.0   9360   0.3451           0.7437
0.2535         31.0   9672   0.3132           0.7545
0.2535         32.0   9984   0.2981           0.7365
0.2507         33.0   10296  0.3338           0.7617
0.2397         34.0   10608  0.3275           0.7365
0.2397         35.0   10920  0.3021           0.7401
0.2379         36.0   11232  0.3322           0.7401
0.2247         37.0   11544  0.3617           0.7329
0.2247         38.0   11856  0.3050           0.7437
0.2291         39.0   12168  0.3189           0.7401
0.2291         40.0   12480  0.2946           0.7473
0.2187         41.0   12792  0.2927           0.7365
0.2175         42.0   13104  0.3130           0.7401
0.2175         43.0   13416  0.2942           0.7365
0.2161         44.0   13728  0.3026           0.7437
0.2072         45.0   14040  0.3566           0.7329
0.2072         46.0   14352  0.2972           0.7437
0.2086         47.0   14664  0.2904           0.7365
0.2086         48.0   14976  0.2961           0.7473
0.2037         49.0   15288  0.3246           0.7473
0.1989         50.0   15600  0.2906           0.7473
0.1989         51.0   15912  0.2876           0.7401
0.2034         52.0   16224  0.3103           0.7437
0.2003         53.0   16536  0.3022           0.7617
0.2003         54.0   16848  0.3022           0.7437
0.1962         55.0   17160  0.2962           0.7365
0.1962         56.0   17472  0.2996           0.7473
0.195          57.0   17784  0.3006           0.7437
0.191          58.0   18096  0.2879           0.7401
0.191          59.0   18408  0.2972           0.7473
0.1946         60.0   18720  0.2971           0.7437
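
Because the reported results come from the last epoch rather than the best one, it can help to scan the log for the strongest checkpoints. The sketch below parses whitespace-separated rows in the layout of the table above and picks the rows with the highest accuracy and the lowest validation loss. `EvalRow`, `parse_rows`, and `LOG_EXCERPT` are illustrative names, and the excerpt holds only four rows copied from the table; paste in the full table to analyze the whole run.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    epoch: float
    step: int
    val_loss: float
    accuracy: float


# Four rows copied verbatim from the training results table (an excerpt, not the full log).
LOG_EXCERPT = """\
0.2507 33.0 10296 0.3338 0.7617
0.1989 51.0 15912 0.2876 0.7401
0.2003 53.0 16536 0.3022 0.7617
0.1946 60.0 18720 0.2971 0.7437
"""


def parse_rows(text: str) -> List[EvalRow]:
    """Parse log lines; header or malformed lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        # The last four fields are epoch, step, validation loss, accuracy.
        # The training-loss column is ignored because it may be "No log".
        try:
            epoch, step, val_loss, acc = fields[-4:]
            rows.append(EvalRow(float(epoch), int(step), float(val_loss), float(acc)))
        except ValueError:
            continue
    return rows


if __name__ == "__main__":
    rows = parse_rows(LOG_EXCERPT)
    best_acc = max(rows, key=lambda r: r.accuracy)
    best_loss = min(rows, key=lambda r: r.val_loss)
    print("highest accuracy:", best_acc)
    print("lowest validation loss:", best_loss)
```

In the full table, the highest accuracy (0.7617) occurs at epochs 33 and 53 and the lowest validation loss (0.2876) at epoch 51, while the final checkpoint reports 0.7437 accuracy and 0.2971 loss.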

Framework versions

  • Transformers 4.26.1
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
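
When trying to reproduce the run, a quick check of the local environment against these versions can save time. The sketch below uses only the standard library. The dictionary keys are assumed to be the PyPI distribution names (notably `torch` for PyTorch), and `check_versions` is an illustrative helper, not part of any of these libraries.

```python
from importlib import metadata

# Versions listed under "Framework versions"; keys are assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.26.1",
    "torch": "2.0.1+cu118",
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}


def check_versions(expected=EXPECTED):
    """Map each package to (installed version or None, whether it matches exactly)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions().items():
        status = "ok" if ok else "differs"
        print(f"{name}: installed={installed} expected={EXPECTED[name]} ({status})")
```

An exact string match is deliberately strict; a different CUDA build suffix on PyTorch, for example, will be reported as a difference even if results would be equivalent.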

Dataset used to train dkqjrm/20230824063515: super_glue