
20230824043245

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6512
  • Accuracy: 0.7473

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
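With a linear scheduler and no warmup steps listed, the learning rate presumably decays from the peak value to zero over the full run. A minimal sketch of that schedule (the zero-warmup assumption and the step count, taken from the training log below, are not stated explicitly in the card):

```python
# Sketch of a linear learning-rate schedule with no warmup (an assumption;
# the card lists no warmup steps). Decays from the peak LR to zero over
# the total number of optimizer updates.
PEAK_LR = 0.003        # learning_rate from the hyperparameters above
TOTAL_STEPS = 37380    # final step in the training log (60 epochs x 623 steps)

def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates."""
    return PEAK_LR * max(0.0, 1.0 - step / TOTAL_STEPS)

print(linear_lr(0))      # peak LR at the start of training
print(linear_lr(18690))  # halfway through (end of epoch 30): half the peak
print(linear_lr(37380))  # decayed to zero at the final step
```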

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 1.0514        | 1.0   | 623   | 0.7220          | 0.5054   |
| 0.8415        | 2.0   | 1246  | 0.6761          | 0.5415   |
| 0.925         | 3.0   | 1869  | 0.7140          | 0.5126   |
| 0.8783        | 4.0   | 2492  | 0.6604          | 0.6245   |
| 0.7907        | 5.0   | 3115  | 0.6059          | 0.6787   |
| 0.7756        | 6.0   | 3738  | 0.6058          | 0.6931   |
| 0.7308        | 7.0   | 4361  | 1.0272          | 0.6173   |
| 0.7169        | 8.0   | 4984  | 0.7565          | 0.6679   |
| 0.689         | 9.0   | 5607  | 0.6401          | 0.7004   |
| 0.6368        | 10.0  | 6230  | 0.6674          | 0.7256   |
| 0.5682        | 11.0  | 6853  | 0.5540          | 0.7148   |
| 0.5974        | 12.0  | 7476  | 0.6804          | 0.7473   |
| 0.5286        | 13.0  | 8099  | 0.5929          | 0.7401   |
| 0.5348        | 14.0  | 8722  | 0.7100          | 0.7220   |
| 0.4956        | 15.0  | 9345  | 0.5456          | 0.7184   |
| 0.4654        | 16.0  | 9968  | 0.6426          | 0.7112   |
| 0.4273        | 17.0  | 10591 | 0.6307          | 0.7365   |
| 0.4259        | 18.0  | 11214 | 0.5385          | 0.7365   |
| 0.4454        | 19.0  | 11837 | 0.6045          | 0.7437   |
| 0.4176        | 20.0  | 12460 | 0.7234          | 0.7401   |
| 0.3953        | 21.0  | 13083 | 0.6217          | 0.7437   |
| 0.3847        | 22.0  | 13706 | 0.6348          | 0.7437   |
| 0.3717        | 23.0  | 14329 | 0.8536          | 0.7148   |
| 0.3512        | 24.0  | 14952 | 0.5710          | 0.7509   |
| 0.3237        | 25.0  | 15575 | 0.5594          | 0.7437   |
| 0.3102        | 26.0  | 16198 | 0.7130          | 0.7581   |
| 0.3302        | 27.0  | 16821 | 0.6404          | 0.7653   |
| 0.3066        | 28.0  | 17444 | 0.6608          | 0.7473   |
| 0.305         | 29.0  | 18067 | 0.6181          | 0.7617   |
| 0.2894        | 30.0  | 18690 | 0.7626          | 0.7329   |
| 0.2891        | 31.0  | 19313 | 0.6387          | 0.7545   |
| 0.2836        | 32.0  | 19936 | 0.5889          | 0.7437   |
| 0.2682        | 33.0  | 20559 | 0.7169          | 0.7473   |
| 0.2625        | 34.0  | 21182 | 0.6298          | 0.7617   |
| 0.246         | 35.0  | 21805 | 0.6207          | 0.7617   |
| 0.266         | 36.0  | 22428 | 0.6256          | 0.7473   |
| 0.2398        | 37.0  | 23051 | 0.7504          | 0.7617   |
| 0.2526        | 38.0  | 23674 | 0.6578          | 0.7473   |
| 0.2165        | 39.0  | 24297 | 0.6624          | 0.7617   |
| 0.2347        | 40.0  | 24920 | 0.6133          | 0.7365   |
| 0.2296        | 41.0  | 25543 | 0.6224          | 0.7509   |
| 0.2226        | 42.0  | 26166 | 0.6971          | 0.7473   |
| 0.2214        | 43.0  | 26789 | 0.6280          | 0.7509   |
| 0.2268        | 44.0  | 27412 | 0.6562          | 0.7473   |
| 0.2244        | 45.0  | 28035 | 0.6726          | 0.7509   |
| 0.2067        | 46.0  | 28658 | 0.6554          | 0.7581   |
| 0.1971        | 47.0  | 29281 | 0.5949          | 0.7581   |
| 0.2135        | 48.0  | 29904 | 0.6618          | 0.7437   |
| 0.2012        | 49.0  | 30527 | 0.6752          | 0.7581   |
| 0.1882        | 50.0  | 31150 | 0.6223          | 0.7581   |
| 0.2056        | 51.0  | 31773 | 0.6487          | 0.7473   |
| 0.1993        | 52.0  | 32396 | 0.6544          | 0.7509   |
| 0.197         | 53.0  | 33019 | 0.6673          | 0.7401   |
| 0.1867        | 54.0  | 33642 | 0.6563          | 0.7437   |
| 0.1715        | 55.0  | 34265 | 0.6780          | 0.7401   |
| 0.1787        | 56.0  | 34888 | 0.6906          | 0.7329   |
| 0.19          | 57.0  | 35511 | 0.6606          | 0.7437   |
| 0.1819        | 58.0  | 36134 | 0.6461          | 0.7437   |
| 0.1879        | 59.0  | 36757 | 0.6516          | 0.7437   |
| 0.1773        | 60.0  | 37380 | 0.6512          | 0.7473   |
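The step counts in the log let one back out the approximate training-set size: each epoch adds 623 optimizer steps, and with train_batch_size 4 that implies roughly 2,492 training examples. A quick consistency check (this assumes a single device and no gradient accumulation, neither of which the card states):

```python
# Consistency check on the training log above (assumes one device and
# no gradient accumulation, which the card does not state).
STEPS_PER_EPOCH = 623   # the Step column increases by 623 each epoch
BATCH_SIZE = 4          # train_batch_size from the hyperparameters
EPOCHS = 60             # num_epochs from the hyperparameters

total_steps = STEPS_PER_EPOCH * EPOCHS
approx_examples = STEPS_PER_EPOCH * BATCH_SIZE

print(total_steps)      # should match the final Step value in the table
print(approx_examples)  # rough training-set size implied by the log
```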

Framework versions

  • Transformers 4.26.1
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
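To reproduce this environment, the listed versions can be pinned at install time. A sketch, assuming a standard pip setup and a CUDA 11.8-compatible machine for the cu118 PyTorch build:

```shell
# Pin the framework versions reported above.
pip install transformers==4.26.1 datasets==2.12.0 tokenizers==0.13.3
# The +cu118 build comes from PyTorch's CUDA 11.8 wheel index.
pip install torch==2.0.1 --index-url https://download.pytorch.org/whl/cu118
```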
