20230824104542

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a loading sketch follows the list):

  • Loss: 0.3421
  • Accuracy: 0.7256
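
The card does not say which SuperGLUE subtask these numbers come from. As a minimal sketch, the checkpoint can be loaded as below; the sequence-classification head and the sentence-pair input are assumptions, not confirmed by the card.

```python
# Minimal loading sketch (not from the original card). The SuperGLUE subtask
# is undocumented, so the sentence-pair input is purely illustrative.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

repo_id = "dkqjrm/20230824104542"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

inputs = tokenizer("The cat sat on the mat.", "A cat is on a mat.",
                   return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # class probabilities
```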

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 0.003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
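
A reproduction sketch under stated assumptions follows; it is not the author's script. The SuperGLUE subtask is undocumented, so "rte" and the premise/hypothesis preprocessing below are hypothetical choices made only to keep the example runnable end to end. The listed Adam betas and epsilon are the Trainer defaults and need no explicit flags.

```python
# Reproduction sketch; "rte" and the preprocessing are assumptions,
# chosen only so the example runs end to end.
import numpy as np
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-large-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-large-cased", num_labels=2)

raw = load_dataset("super_glue", "rte")  # hypothetical subtask choice

def tokenize(batch):
    # RTE is a premise/hypothesis pair task; other subtasks differ.
    return tokenizer(batch["premise"], batch["hypothesis"],
                     truncation=True, max_length=128)

data = raw.map(tokenize, batched=True)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": (np.argmax(logits, axis=-1) == labels).mean()}

args = TrainingArguments(
    output_dir="./out",
    learning_rate=3e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",  # inferred from the per-epoch rows below
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the Trainer defaults.
)

trainer = Trainer(model=model, args=args,
                  train_dataset=data["train"],
                  eval_dataset=data["validation"],
                  tokenizer=tokenizer,
                  compute_metrics=compute_metrics)
trainer.train()
```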

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 1.0891          | 0.5307   |
| 0.5902        | 2.0   | 624   | 0.6221          | 0.4765   |
| 0.5902        | 3.0   | 936   | 0.4801          | 0.5379   |
| 0.5511        | 4.0   | 1248  | 0.4461          | 0.5054   |
| 0.5299        | 5.0   | 1560  | 0.5922          | 0.5162   |
| 0.5299        | 6.0   | 1872  | 0.4113          | 0.5199   |
| 0.509         | 7.0   | 2184  | 0.4885          | 0.5451   |
| 0.509         | 8.0   | 2496  | 0.4106          | 0.4910   |
| 0.4976        | 9.0   | 2808  | 0.5019          | 0.4874   |
| 0.4898        | 10.0  | 3120  | 0.4132          | 0.5307   |
| 0.4898        | 11.0  | 3432  | 0.4564          | 0.4874   |
| 0.4739        | 12.0  | 3744  | 0.4919          | 0.5307   |
| 0.4594        | 13.0  | 4056  | 0.4235          | 0.4982   |
| 0.4594        | 14.0  | 4368  | 0.3937          | 0.5812   |
| 0.4444        | 15.0  | 4680  | 0.3871          | 0.5812   |
| 0.4444        | 16.0  | 4992  | 0.4123          | 0.6065   |
| 0.4334        | 17.0  | 5304  | 0.3986          | 0.6209   |
| 0.4045        | 18.0  | 5616  | 0.4088          | 0.6029   |
| 0.4045        | 19.0  | 5928  | 0.3935          | 0.6209   |
| 0.3999        | 20.0  | 6240  | 0.3645          | 0.6715   |
| 0.376         | 21.0  | 6552  | 0.4230          | 0.5740   |
| 0.376         | 22.0  | 6864  | 0.3911          | 0.6823   |
| 0.3683        | 23.0  | 7176  | 0.5057          | 0.6534   |
| 0.3683        | 24.0  | 7488  | 0.3273          | 0.7040   |
| 0.3501        | 25.0  | 7800  | 0.3663          | 0.7004   |
| 0.344         | 26.0  | 8112  | 0.3755          | 0.6931   |
| 0.344         | 27.0  | 8424  | 0.3648          | 0.7112   |
| 0.3354        | 28.0  | 8736  | 0.3359          | 0.7148   |
| 0.3288        | 29.0  | 9048  | 0.3362          | 0.7112   |
| 0.3288        | 30.0  | 9360  | 0.5539          | 0.6787   |
| 0.3199        | 31.0  | 9672  | 0.3617          | 0.7112   |
| 0.3199        | 32.0  | 9984  | 0.3601          | 0.7184   |
| 0.3166        | 33.0  | 10296 | 0.3325          | 0.7292   |
| 0.3037        | 34.0  | 10608 | 0.3274          | 0.7256   |
| 0.3037        | 35.0  | 10920 | 0.3412          | 0.7076   |
| 0.2987        | 36.0  | 11232 | 0.3509          | 0.7256   |
| 0.2842        | 37.0  | 11544 | 0.3945          | 0.7076   |
| 0.2842        | 38.0  | 11856 | 0.3224          | 0.7365   |
| 0.2894        | 39.0  | 12168 | 0.4010          | 0.7148   |
| 0.2894        | 40.0  | 12480 | 0.3472          | 0.7220   |
| 0.2764        | 41.0  | 12792 | 0.3364          | 0.7112   |
| 0.2708        | 42.0  | 13104 | 0.3379          | 0.7040   |
| 0.2708        | 43.0  | 13416 | 0.3625          | 0.7148   |
| 0.2665        | 44.0  | 13728 | 0.3435          | 0.7220   |
| 0.265         | 45.0  | 14040 | 0.3762          | 0.7292   |
| 0.265         | 46.0  | 14352 | 0.3322          | 0.7220   |
| 0.2618        | 47.0  | 14664 | 0.3265          | 0.7329   |
| 0.2618        | 48.0  | 14976 | 0.3752          | 0.7292   |
| 0.2513        | 49.0  | 15288 | 0.3415          | 0.7292   |
| 0.2487        | 50.0  | 15600 | 0.3604          | 0.7220   |
| 0.2487        | 51.0  | 15912 | 0.3484          | 0.7292   |
| 0.2488        | 52.0  | 16224 | 0.3598          | 0.7329   |
| 0.2404        | 53.0  | 16536 | 0.3719          | 0.7184   |
| 0.2404        | 54.0  | 16848 | 0.3329          | 0.7220   |
| 0.2359        | 55.0  | 17160 | 0.3535          | 0.7220   |
| 0.2359        | 56.0  | 17472 | 0.3606          | 0.7256   |
| 0.2364        | 57.0  | 17784 | 0.3407          | 0.7292   |
| 0.2343        | 58.0  | 18096 | 0.3342          | 0.7292   |
| 0.2343        | 59.0  | 18408 | 0.3451          | 0.7220   |
| 0.2348        | 60.0  | 18720 | 0.3421          | 0.7256   |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
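
As an added convenience (not part of the original card), a quick check that the local environment matches these versions:

```python
# Optional sanity check against the versions listed above.
import datasets, tokenizers, torch, transformers

expected = {
    "transformers": (transformers, "4.26.1"),
    "torch": (torch, "2.0.1+cu118"),
    "datasets": (datasets, "2.12.0"),
    "tokenizers": (tokenizers, "0.13.3"),
}
for name, (mod, want) in expected.items():
    status = "OK" if mod.__version__ == want else "MISMATCH"
    print(f"{name}: found {mod.__version__}, card lists {want} -> {status}")
```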