20230823213605

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. After the final training epoch (epoch 60, step 18720), it achieves the following results on the evaluation set:

  • Loss: 0.6579
  • Accuracy: 0.7365

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
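As a rough illustration of what the linear scheduler implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card does not list a warmup value) and takes 312 steps per epoch from the training results table; the function name `linear_lr` is hypothetical and not part of any library.

```python
# Values copied from the hyperparameter list in this card.
LEARNING_RATE = 0.003
NUM_EPOCHS = 60
# Taken from the training results table (step 312 is reached at epoch 1.0).
STEPS_PER_EPOCH = 312
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not report a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the learning rate at the start, midpoint, and end of training.
    for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 0.003 and reaches zero at the last step; a nonzero warmup would change the early part of the curve.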

Training results

Training Loss  Epoch  Step   Validation Loss  Accuracy
No log         1.0    312    1.6256           0.5307
0.8748         2.0    624    0.7617           0.5523
0.8748         3.0    936    0.6603           0.5271
0.7596         4.0    1248   0.6103           0.6101
0.7685         5.0    1560   0.9349           0.5668
0.7685         6.0    1872   0.8351           0.6101
0.6585         7.0    2184   0.5995           0.6823
0.6585         8.0    2496   0.5553           0.7076
0.651          9.0    2808   0.5718           0.7040
0.629          10.0   3120   0.5922           0.7040
0.629          11.0   3432   0.5775           0.7148
0.6145         12.0   3744   0.5886           0.7292
0.595          13.0   4056   0.5959           0.7076
0.595          14.0   4368   0.5683           0.7040
0.5501         15.0   4680   0.5633           0.7329
0.5501         16.0   4992   0.6229           0.7184
0.5382         17.0   5304   0.8960           0.6643
0.4987         18.0   5616   0.5098           0.7076
0.4987         19.0   5928   0.6151           0.7184
0.5146         20.0   6240   0.6031           0.7329
0.4536         21.0   6552   0.7180           0.7329
0.4536         22.0   6864   0.7608           0.7184
0.45           23.0   7176   0.7551           0.7112
0.45           24.0   7488   0.7242           0.7148
0.4336         25.0   7800   0.7373           0.7292
0.396          26.0   8112   0.7001           0.7220
0.396          27.0   8424   0.6008           0.7365
0.3851         28.0   8736   0.5931           0.7148
0.3699         29.0   9048   0.6664           0.7329
0.3699         30.0   9360   0.6632           0.7473
0.3451         31.0   9672   0.6476           0.7437
0.3451         32.0   9984   0.5929           0.7292
0.3273         33.0   10296  0.7271           0.7292
0.3025         34.0   10608  0.6819           0.7292
0.3025         35.0   10920  0.5734           0.7329
0.2981         36.0   11232  0.7307           0.7256
0.2829         37.0   11544  0.8025           0.7329
0.2829         38.0   11856  0.5696           0.7545
0.2724         39.0   12168  0.6290           0.7401
0.2724         40.0   12480  0.6417           0.7292
0.2604         41.0   12792  0.5523           0.7401
0.253          42.0   13104  0.7210           0.7365
0.253          43.0   13416  0.6005           0.7365
0.2469         44.0   13728  0.6808           0.7473
0.2492         45.0   14040  0.6506           0.7509
0.2492         46.0   14352  0.6687           0.7437
0.2413         47.0   14664  0.6401           0.7329
0.2413         48.0   14976  0.6588           0.7329
0.2356         49.0   15288  0.6625           0.7401
0.2251         50.0   15600  0.6472           0.7292
0.2251         51.0   15912  0.6800           0.7401
0.2207         52.0   16224  0.6191           0.7473
0.2127         53.0   16536  0.6478           0.7365
0.2127         54.0   16848  0.6509           0.7329
0.2217         55.0   17160  0.6644           0.7365
0.2217         56.0   17472  0.6360           0.7365
0.2094         57.0   17784  0.6509           0.7365
0.2045         58.0   18096  0.6445           0.7365
0.2045         59.0   18408  0.6659           0.7365
0.2072         60.0   18720  0.6579           0.7365
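The reported results come from the last epoch, which is not the best row in the log. The sketch below shows one way to pick an epoch by validation accuracy or by validation loss, using a handful of rows copied from the table above. The helper names are hypothetical, and the card does not say whether per-epoch checkpoints were saved, so this only illustrates the selection logic.

```python
# (epoch, validation_loss, accuracy) rows copied from the results table.
# This is an excerpt for illustration, not the full 60-epoch log.
ROWS = [
    (8, 0.5553, 0.7076),
    (18, 0.5098, 0.7076),
    (30, 0.6632, 0.7473),
    (38, 0.5696, 0.7545),
    (41, 0.5523, 0.7401),
    (45, 0.6506, 0.7509),
    (60, 0.6579, 0.7365),
]


def best_by_accuracy(rows):
    """Return the row with the highest validation accuracy."""
    return max(rows, key=lambda row: row[2])


def best_by_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


if __name__ == "__main__":
    print("best accuracy:", best_by_accuracy(ROWS))
    print("lowest validation loss:", best_by_loss(ROWS))
```

In the full table the same two rows are the extremes: accuracy peaks at 0.7545 in epoch 38 and validation loss bottoms out at 0.5098 in epoch 18, while the final epoch sits at 0.7365 accuracy.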

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3

Dataset used to train dkqjrm/20230823213605