
20230826035341

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6425
  • Accuracy: 0.66

Model description

More information needed

Intended uses & limitations

More information needed
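
In the absence of stated usage guidance, here is a minimal inference sketch. It assumes the checkpoint carries a sentence-pair sequence-classification head (SuperGLUE subtasks are mostly pair classification; the exact subtask is not stated in this card) and uses the repository id `dkqjrm/20230826035341` from this card. The example inputs are placeholders.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "dkqjrm/20230826035341"  # repository id from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# The SuperGLUE subtask is not stated in the card, so these inputs are
# illustrative placeholders for a premise/hypothesis pair.
inputs = tokenizer("premise text", "hypothesis text", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```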

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after the list):

  • learning_rate: 0.01
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
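
These values map directly onto `transformers.TrainingArguments`. Below is a minimal reproduction sketch, assuming the Hugging Face Trainer API was used. The SuperGLUE subtask and preprocessing are not named in this card, so `rte` is chosen purely for illustration; the Adam betas and epsilon listed above are the Trainer defaults, so they need no explicit arguments.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Assumption: "rte" stands in for the unnamed SuperGLUE subtask.
raw = load_dataset("super_glue", "rte")
tokenizer = AutoTokenizer.from_pretrained("bert-large-cased")

def preprocess(batch):
    return tokenizer(batch["premise"], batch["hypothesis"], truncation=True)

tokenized = raw.map(preprocess, batched=True)
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-large-cased", num_labels=2
)

# Hyperparameters copied from the list above; Adam betas=(0.9, 0.999)
# and epsilon=1e-08 are the Trainer defaults.
args = TrainingArguments(
    output_dir="20230826035341",
    learning_rate=0.01,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=80,
    evaluation_strategy="epoch",  # assumption: the results table logs one eval per epoch
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,  # enables the default padding collator
).train()
```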

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 25 | 0.7174 | 0.41 |
| No log | 2.0 | 50 | 0.7533 | 0.48 |
| No log | 3.0 | 75 | 0.7732 | 0.61 |
| No log | 4.0 | 100 | 0.6638 | 0.61 |
| No log | 5.0 | 125 | 0.6407 | 0.49 |
| No log | 6.0 | 150 | 0.7758 | 0.37 |
| No log | 7.0 | 175 | 0.5933 | 0.67 |
| No log | 8.0 | 200 | 0.6227 | 0.67 |
| No log | 9.0 | 225 | 0.6462 | 0.66 |
| No log | 10.0 | 250 | 0.6520 | 0.65 |
| No log | 11.0 | 275 | 0.5907 | 0.64 |
| No log | 12.0 | 300 | 0.6254 | 0.64 |
| No log | 13.0 | 325 | 0.6457 | 0.64 |
| No log | 14.0 | 350 | 0.5731 | 0.65 |
| No log | 15.0 | 375 | 0.6088 | 0.65 |
| No log | 16.0 | 400 | 0.5722 | 0.65 |
| No log | 17.0 | 425 | 0.6187 | 0.64 |
| No log | 18.0 | 450 | 0.6932 | 0.65 |
| No log | 19.0 | 475 | 0.6068 | 0.66 |
| 0.7336 | 20.0 | 500 | 0.5740 | 0.68 |
| 0.7336 | 21.0 | 525 | 0.5791 | 0.66 |
| 0.7336 | 22.0 | 550 | 0.7415 | 0.65 |
| 0.7336 | 23.0 | 575 | 0.6275 | 0.64 |
| 0.7336 | 24.0 | 600 | 0.6515 | 0.65 |
| 0.7336 | 25.0 | 625 | 0.6619 | 0.66 |
| 0.7336 | 26.0 | 650 | 0.7296 | 0.63 |
| 0.7336 | 27.0 | 675 | 0.6984 | 0.65 |
| 0.7336 | 28.0 | 700 | 0.7813 | 0.68 |
| 0.7336 | 29.0 | 725 | 0.7499 | 0.68 |
| 0.7336 | 30.0 | 750 | 0.8273 | 0.66 |
| 0.7336 | 31.0 | 775 | 0.7841 | 0.67 |
| 0.7336 | 32.0 | 800 | 0.7399 | 0.67 |
| 0.7336 | 33.0 | 825 | 0.6789 | 0.67 |
| 0.7336 | 34.0 | 850 | 0.7219 | 0.68 |
| 0.7336 | 35.0 | 875 | 0.7323 | 0.68 |
| 0.7336 | 36.0 | 900 | 0.7056 | 0.69 |
| 0.7336 | 37.0 | 925 | 0.6669 | 0.68 |
| 0.7336 | 38.0 | 950 | 0.6746 | 0.67 |
| 0.7336 | 39.0 | 975 | 0.6932 | 0.69 |
| 0.371 | 40.0 | 1000 | 0.6695 | 0.68 |
| 0.371 | 41.0 | 1025 | 0.7091 | 0.68 |
| 0.371 | 42.0 | 1050 | 0.6842 | 0.65 |
| 0.371 | 43.0 | 1075 | 0.6724 | 0.66 |
| 0.371 | 44.0 | 1100 | 0.6938 | 0.67 |
| 0.371 | 45.0 | 1125 | 0.6779 | 0.67 |
| 0.371 | 46.0 | 1150 | 0.6894 | 0.67 |
| 0.371 | 47.0 | 1175 | 0.6746 | 0.65 |
| 0.371 | 48.0 | 1200 | 0.7162 | 0.67 |
| 0.371 | 49.0 | 1225 | 0.6892 | 0.66 |
| 0.371 | 50.0 | 1250 | 0.6888 | 0.64 |
| 0.371 | 51.0 | 1275 | 0.6493 | 0.67 |
| 0.371 | 52.0 | 1300 | 0.6620 | 0.66 |
| 0.371 | 53.0 | 1325 | 0.6613 | 0.65 |
| 0.371 | 54.0 | 1350 | 0.6567 | 0.66 |
| 0.371 | 55.0 | 1375 | 0.6890 | 0.67 |
| 0.371 | 56.0 | 1400 | 0.6884 | 0.67 |
| 0.371 | 57.0 | 1425 | 0.6547 | 0.66 |
| 0.371 | 58.0 | 1450 | 0.6831 | 0.66 |
| 0.371 | 59.0 | 1475 | 0.6529 | 0.66 |
| 0.2458 | 60.0 | 1500 | 0.6793 | 0.67 |
| 0.2458 | 61.0 | 1525 | 0.6769 | 0.67 |
| 0.2458 | 62.0 | 1550 | 0.6766 | 0.67 |
| 0.2458 | 63.0 | 1575 | 0.6511 | 0.66 |
| 0.2458 | 64.0 | 1600 | 0.6574 | 0.67 |
| 0.2458 | 65.0 | 1625 | 0.6445 | 0.66 |
| 0.2458 | 66.0 | 1650 | 0.6468 | 0.67 |
| 0.2458 | 67.0 | 1675 | 0.6413 | 0.66 |
| 0.2458 | 68.0 | 1700 | 0.6591 | 0.67 |
| 0.2458 | 69.0 | 1725 | 0.6374 | 0.67 |
| 0.2458 | 70.0 | 1750 | 0.6688 | 0.66 |
| 0.2458 | 71.0 | 1775 | 0.6512 | 0.66 |
| 0.2458 | 72.0 | 1800 | 0.6465 | 0.66 |
| 0.2458 | 73.0 | 1825 | 0.6602 | 0.66 |
| 0.2458 | 74.0 | 1850 | 0.6482 | 0.66 |
| 0.2458 | 75.0 | 1875 | 0.6434 | 0.66 |
| 0.2458 | 76.0 | 1900 | 0.6523 | 0.66 |
| 0.2458 | 77.0 | 1925 | 0.6502 | 0.66 |
| 0.2458 | 78.0 | 1950 | 0.6447 | 0.66 |
| 0.2458 | 79.0 | 1975 | 0.6427 | 0.66 |
| 0.2218 | 80.0 | 2000 | 0.6425 | 0.66 |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
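
As a quick sanity check, a matching environment can be verified from Python against the version strings listed above:

```python
import datasets
import tokenizers
import torch
import transformers

# Expected versions for this run, per the list above.
print(transformers.__version__)  # 4.26.1
print(torch.__version__)         # 2.0.1+cu118
print(datasets.__version__)      # 2.12.0
print(tokenizers.__version__)    # 0.13.3
```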