20230824042730

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5547
  • Accuracy: 0.7581

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
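The card does not include the training script, but the hyperparameters above map directly onto the standard `transformers.TrainingArguments` keyword names. The sketch below shows that mapping as a plain dict (the kwarg names are the stock Hugging Face ones; the Adam betas/epsilon listed above are the Trainer defaults):

```python
# Hypothetical reconstruction of the TrainingArguments kwargs implied by the
# hyperparameter list above -- the actual training script is not provided.
training_kwargs = {
    "learning_rate": 0.003,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 8,
    "seed": 11,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 60.0,
    # Optimizer: Adam with betas=(0.9, 0.999), epsilon=1e-08
    # (these are the Trainer's default AdamW settings).
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
}
```

These kwargs would be passed as `TrainingArguments(output_dir=..., **training_kwargs)` when reproducing the run.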

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 1.1252        | 1.0   | 623   | 0.6915          | 0.5415   |
| 0.9382        | 2.0   | 1246  | 0.7221          | 0.5307   |
| 1.0555        | 3.0   | 1869  | 0.7387          | 0.5199   |
| 0.9336        | 4.0   | 2492  | 0.9751          | 0.6390   |
| 0.8894        | 5.0   | 3115  | 0.9277          | 0.6643   |
| 0.9066        | 6.0   | 3738  | 1.1836          | 0.6931   |
| 0.8496        | 7.0   | 4361  | 0.8242          | 0.7184   |
| 0.7761        | 8.0   | 4984  | 0.9061          | 0.6859   |
| 0.8175        | 9.0   | 5607  | 0.7474          | 0.7220   |
| 0.7575        | 10.0  | 6230  | 0.8582          | 0.7292   |
| 0.747         | 11.0  | 6853  | 0.8351          | 0.7256   |
| 0.728         | 12.0  | 7476  | 0.8912          | 0.7148   |
| 0.8296        | 13.0  | 8099  | 0.9471          | 0.7220   |
| 0.7327        | 14.0  | 8722  | 1.1407          | 0.7148   |
| 0.7284        | 15.0  | 9345  | 0.7681          | 0.7256   |
| 0.6642        | 16.0  | 9968  | 1.4084          | 0.6679   |
| 0.5888        | 17.0  | 10591 | 0.8413          | 0.7329   |
| 0.6074        | 18.0  | 11214 | 0.7461          | 0.7401   |
| 0.625         | 19.0  | 11837 | 0.9516          | 0.7545   |
| 0.5911        | 20.0  | 12460 | 1.3395          | 0.7292   |
| 0.5322        | 21.0  | 13083 | 1.3924          | 0.7509   |
| 0.5247        | 22.0  | 13706 | 1.1553          | 0.7256   |
| 0.5146        | 23.0  | 14329 | 1.6692          | 0.7040   |
| 0.4493        | 24.0  | 14952 | 1.2315          | 0.7437   |
| 0.399         | 25.0  | 15575 | 1.2710          | 0.7545   |
| 0.3644        | 26.0  | 16198 | 1.5049          | 0.7473   |
| 0.4031        | 27.0  | 16821 | 1.5735          | 0.7401   |
| 0.386         | 28.0  | 17444 | 1.4749          | 0.7220   |
| 0.3735        | 29.0  | 18067 | 0.9541          | 0.7365   |
| 0.356         | 30.0  | 18690 | 1.3936          | 0.7473   |
| 0.3496        | 31.0  | 19313 | 0.9982          | 0.7437   |
| 0.3149        | 32.0  | 19936 | 0.9572          | 0.7581   |
| 0.3094        | 33.0  | 20559 | 1.5663          | 0.7256   |
| 0.2886        | 34.0  | 21182 | 1.5993          | 0.7365   |
| 0.2545        | 35.0  | 21805 | 1.1515          | 0.7545   |
| 0.276         | 36.0  | 22428 | 1.2768          | 0.7473   |
| 0.2645        | 37.0  | 23051 | 1.4290          | 0.7509   |
| 0.262         | 38.0  | 23674 | 1.2363          | 0.7617   |
| 0.2261        | 39.0  | 24297 | 1.3446          | 0.7617   |
| 0.2291        | 40.0  | 24920 | 1.0532          | 0.7509   |
| 0.2178        | 41.0  | 25543 | 1.4745          | 0.7509   |
| 0.2104        | 42.0  | 26166 | 1.3830          | 0.7545   |
| 0.217         | 43.0  | 26789 | 1.7099          | 0.7473   |
| 0.214         | 44.0  | 27412 | 1.7054          | 0.7401   |
| 0.1856        | 45.0  | 28035 | 1.4350          | 0.7545   |
| 0.2014        | 46.0  | 28658 | 1.7266          | 0.7473   |
| 0.1759        | 47.0  | 29281 | 1.2659          | 0.7581   |
| 0.2027        | 48.0  | 29904 | 1.8336          | 0.7401   |
| 0.1871        | 49.0  | 30527 | 1.3398          | 0.7509   |
| 0.1586        | 50.0  | 31150 | 1.4948          | 0.7509   |
| 0.1619        | 51.0  | 31773 | 1.3787          | 0.7545   |
| 0.1665        | 52.0  | 32396 | 1.6532          | 0.7545   |
| 0.1786        | 53.0  | 33019 | 1.4697          | 0.7581   |
| 0.1609        | 54.0  | 33642 | 1.5462          | 0.7653   |
| 0.1304        | 55.0  | 34265 | 1.3577          | 0.7581   |
| 0.1576        | 56.0  | 34888 | 1.7004          | 0.7617   |
| 0.1522        | 57.0  | 35511 | 1.4629          | 0.7581   |
| 0.1496        | 58.0  | 36134 | 1.6336          | 0.7581   |
| 0.1406        | 59.0  | 36757 | 1.5699          | 0.7545   |
| 0.1268        | 60.0  | 37380 | 1.5547          | 0.7581   |
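The headline metrics (loss 1.5547, accuracy 0.7581) come from the final epoch-60 checkpoint, but the best validation accuracy in the run actually occurs earlier in the table. A small sketch over the per-epoch accuracy column (values transcribed from the table above) finds it:

```python
# Validation accuracy for epochs 1..60, copied from the training-results table.
accuracy = [
    0.5415, 0.5307, 0.5199, 0.6390, 0.6643, 0.6931, 0.7184, 0.6859, 0.7220, 0.7292,
    0.7256, 0.7148, 0.7220, 0.7148, 0.7256, 0.6679, 0.7329, 0.7401, 0.7545, 0.7292,
    0.7509, 0.7256, 0.7040, 0.7437, 0.7545, 0.7473, 0.7401, 0.7220, 0.7365, 0.7473,
    0.7437, 0.7581, 0.7256, 0.7365, 0.7545, 0.7473, 0.7509, 0.7617, 0.7617, 0.7509,
    0.7509, 0.7545, 0.7473, 0.7401, 0.7545, 0.7473, 0.7581, 0.7401, 0.7509, 0.7509,
    0.7545, 0.7545, 0.7581, 0.7653, 0.7581, 0.7617, 0.7581, 0.7581, 0.7545, 0.7581,
]
# Epoch (1-indexed) with the highest validation accuracy.
best_epoch = max(range(1, 61), key=lambda e: accuracy[e - 1])
print(best_epoch, accuracy[best_epoch - 1])  # epoch 54, accuracy 0.7653
```

This suggests the epoch-54 checkpoint (accuracy 0.7653) slightly outperforms the final one on the evaluation set.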

Framework versions

  • Transformers 4.26.1
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3