20230824023516

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. After the final training epoch (epoch 60), it achieves the following results on the evaluation set:

  • Loss: 0.7658
  • Accuracy: 0.7401

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
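
As a rough sketch, the hyperparameters above can be written as keyword arguments for `transformers.TrainingArguments`, and the linear schedule they imply can be reproduced in plain Python. Several things here are assumptions rather than facts from the card: a single device with no gradient accumulation (so the listed batch sizes are per-device values), zero warmup steps (the card lists no warmup setting), and 312 optimizer steps per epoch, taken from the Step column of the training results. The names `TRAINING_KWARGS`, `TOTAL_STEPS`, `linear_lr`, and `build_training_arguments` are illustrative, and `output_dir` is a placeholder.

```python
# Hyperparameters from the card, keyed by transformers.TrainingArguments names.
# Assumption: single device, so train_batch_size == per_device_train_batch_size.
TRAINING_KWARGS = {
    "output_dir": "out",  # placeholder, not from the card
    "learning_rate": 3e-3,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 11,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 60.0,
}

STEPS_PER_EPOCH = 312  # from the Step column of the training results
TOTAL_STEPS = int(TRAINING_KWARGS["num_train_epochs"]) * STEPS_PER_EPOCH


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    base_lr = TRAINING_KWARGS["learning_rate"]
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return base_lr * remaining / max(1, TOTAL_STEPS - warmup_steps)


def build_training_arguments():
    """Build TrainingArguments lazily so this sketch imports without transformers."""
    from transformers import TrainingArguments

    return TrainingArguments(**TRAINING_KWARGS)


if __name__ == "__main__":
    # Learning rate at the start, midpoint, and end of training.
    for epoch in (0, 30, 60):
        print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the learning rate starts at 0.003, is halved by the end of epoch 30, and reaches zero at step 18720.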

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| No log | 1.0 | 312 | 1.0164 | 0.5307 |
| 0.9117 | 2.0 | 624 | 0.7035 | 0.5090 |
| 0.9117 | 3.0 | 936 | 0.6456 | 0.5307 |
| 0.771 | 4.0 | 1248 | 0.6625 | 0.5487 |
| 0.7935 | 5.0 | 1560 | 0.9135 | 0.5487 |
| 0.7935 | 6.0 | 1872 | 0.7048 | 0.6426 |
| 0.7247 | 7.0 | 2184 | 0.7188 | 0.6570 |
| 0.7247 | 8.0 | 2496 | 0.7428 | 0.6570 |
| 0.6659 | 9.0 | 2808 | 0.5639 | 0.7076 |
| 0.6647 | 10.0 | 3120 | 0.8170 | 0.6426 |
| 0.6647 | 11.0 | 3432 | 0.5627 | 0.7076 |
| 0.6248 | 12.0 | 3744 | 0.7036 | 0.7040 |
| 0.5859 | 13.0 | 4056 | 0.5674 | 0.7112 |
| 0.5859 | 14.0 | 4368 | 0.6351 | 0.7112 |
| 0.599 | 15.0 | 4680 | 0.5921 | 0.7112 |
| 0.599 | 16.0 | 4992 | 0.9538 | 0.6643 |
| 0.5515 | 17.0 | 5304 | 0.6401 | 0.7004 |
| 0.5423 | 18.0 | 5616 | 0.5545 | 0.7256 |
| 0.5423 | 19.0 | 5928 | 0.5583 | 0.7365 |
| 0.5248 | 20.0 | 6240 | 0.8808 | 0.6534 |
| 0.4795 | 21.0 | 6552 | 0.5670 | 0.7292 |
| 0.4795 | 22.0 | 6864 | 0.6174 | 0.6968 |
| 0.4853 | 23.0 | 7176 | 0.8153 | 0.7112 |
| 0.4853 | 24.0 | 7488 | 0.6551 | 0.7256 |
| 0.4379 | 25.0 | 7800 | 0.7501 | 0.7292 |
| 0.4365 | 26.0 | 8112 | 0.8488 | 0.6895 |
| 0.4365 | 27.0 | 8424 | 0.7814 | 0.7112 |
| 0.4204 | 28.0 | 8736 | 0.7393 | 0.7220 |
| 0.434 | 29.0 | 9048 | 0.9116 | 0.6859 |
| 0.434 | 30.0 | 9360 | 0.8298 | 0.7076 |
| 0.4064 | 31.0 | 9672 | 0.7928 | 0.6968 |
| 0.4064 | 32.0 | 9984 | 0.6150 | 0.7329 |
| 0.3869 | 33.0 | 10296 | 0.8984 | 0.7256 |
| 0.3459 | 34.0 | 10608 | 0.6598 | 0.7401 |
| 0.3459 | 35.0 | 10920 | 0.6022 | 0.7401 |
| 0.352 | 36.0 | 11232 | 0.8833 | 0.7112 |
| 0.3268 | 37.0 | 11544 | 0.9331 | 0.7220 |
| 0.3268 | 38.0 | 11856 | 0.8233 | 0.7401 |
| 0.3108 | 39.0 | 12168 | 0.8361 | 0.7329 |
| 0.3108 | 40.0 | 12480 | 0.6123 | 0.7292 |
| 0.3038 | 41.0 | 12792 | 0.6187 | 0.7292 |
| 0.287 | 42.0 | 13104 | 0.7216 | 0.7401 |
| 0.287 | 43.0 | 13416 | 0.9118 | 0.7148 |
| 0.2802 | 44.0 | 13728 | 0.8249 | 0.7329 |
| 0.2756 | 45.0 | 14040 | 0.7843 | 0.7437 |
| 0.2756 | 46.0 | 14352 | 0.7272 | 0.7365 |
| 0.2735 | 47.0 | 14664 | 0.7253 | 0.7292 |
| 0.2735 | 48.0 | 14976 | 0.7766 | 0.7365 |
| 0.2552 | 49.0 | 15288 | 0.7906 | 0.7401 |
| 0.2449 | 50.0 | 15600 | 0.6664 | 0.7329 |
| 0.2449 | 51.0 | 15912 | 0.6854 | 0.7220 |
| 0.248 | 52.0 | 16224 | 0.7260 | 0.7256 |
| 0.2533 | 53.0 | 16536 | 0.7750 | 0.7329 |
| 0.2533 | 54.0 | 16848 | 0.7146 | 0.7401 |
| 0.238 | 55.0 | 17160 | 0.7802 | 0.7365 |
| 0.238 | 56.0 | 17472 | 0.7462 | 0.7365 |
| 0.2412 | 57.0 | 17784 | 0.7619 | 0.7473 |
| 0.2241 | 58.0 | 18096 | 0.6815 | 0.7437 |
| 0.2241 | 59.0 | 18408 | 0.7661 | 0.7401 |
| 0.2293 | 60.0 | 18720 | 0.7658 | 0.7401 |
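
The final epoch is not the best-scoring row in these results. A minimal sketch for picking the best epoch by each validation metric is shown here, using an excerpt of rows copied from the results above. The `EvalRow` type, the `ROWS` excerpt, and the helper names are illustrative and not part of the original training code; the full log would be handled the same way.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    epoch: int
    step: int
    val_loss: float
    accuracy: float


# Excerpt of rows copied from the training results (not the full log).
ROWS = [
    EvalRow(1, 312, 1.0164, 0.5307),
    EvalRow(18, 5616, 0.5545, 0.7256),
    EvalRow(19, 5928, 0.5583, 0.7365),
    EvalRow(45, 14040, 0.7843, 0.7437),
    EvalRow(57, 17784, 0.7619, 0.7473),
    EvalRow(58, 18096, 0.6815, 0.7437),
    EvalRow(60, 18720, 0.7658, 0.7401),
]


def best_by_accuracy(rows: List[EvalRow]) -> EvalRow:
    """Row with the highest evaluation accuracy."""
    return max(rows, key=lambda r: r.accuracy)


def lowest_val_loss(rows: List[EvalRow]) -> EvalRow:
    """Row with the lowest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


if __name__ == "__main__":
    print("best accuracy:", best_by_accuracy(ROWS))
    print("lowest validation loss:", lowest_val_loss(ROWS))
```

Across all 60 epochs, accuracy peaks at 0.7473 (epoch 57) and validation loss is lowest at 0.5545 (epoch 18), while the reported results come from epoch 60. Training loss keeps falling through epoch 60 without validation loss improving on its epoch-18 value, so if intermediate checkpoints are available, selecting one by a validation metric may be preferable to taking the last.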

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
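
To check whether a local environment matches the versions listed above, something along these lines could be used. It is a sketch only: the PyPI distribution names are assumed ("Pytorch" is taken to be the `torch` distribution), local build tags such as `+cu118` are ignored when comparing, and `EXPECTED`, `installed_version`, and `version_mismatches` are illustrative names.

```python
from importlib import metadata

# Versions listed in the card, keyed by assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.26.1",
    "torch": "2.0.1",
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}


def installed_version(dist_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(expected, lookup=installed_version):
    """Map each mismatching distribution to a (wanted, found) pair."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        # Drop local build tags such as "+cu118" before comparing.
        base = found.split("+", 1)[0] if found else None
        if base != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in version_mismatches(EXPECTED).items():
        print(f"{name}: card lists {wanted}, found {found}")
```

An exact match is not necessarily required to load the model; the check only reports where the environment differs from the one recorded in the card.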

Dataset used to train dkqjrm/20230824023516