
20230826065621

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6391
  • Accuracy: 0.67
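
Since the card does not declare a pipeline type, a checkpoint like this is typically loaded directly with the Auto classes. The sketch below is an assumption rather than documented usage: it presumes a sequence-classification head and an entailment-style text pair, since the card does not name the SuperGLUE subset or the label meanings.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical loading sketch; the SuperGLUE subset and label semantics
# are not documented in this card.
model_id = "dkqjrm/20230826065621"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Example text pair; replace with inputs matching the actual task format.
inputs = tokenizer("It was raining all day.", "The ground is wet.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label id
```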

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.02
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
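
These settings map onto transformers.TrainingArguments as in the sketch below. This is a reconstruction, not the original training script: evaluation_strategy="epoch" is inferred from the per-epoch validation results in the table that follows, and output_dir is a placeholder.

```python
from transformers import TrainingArguments

# Reconstructed from the hyperparameters listed above; the original
# training script is not included in this card.
training_args = TrainingArguments(
    output_dir="20230826065621",  # placeholder
    learning_rate=0.02,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    num_train_epochs=80.0,
    lr_scheduler_type="linear",
    evaluation_strategy="epoch",  # inferred from the per-epoch results below
    adam_beta1=0.9,               # Adam settings as listed in the card
    adam_beta2=0.999,
    adam_epsilon=1e-08,
)
```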

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 25   | 0.9872          | 0.34     |
| No log        | 2.0   | 50   | 0.8547          | 0.59     |
| No log        | 3.0   | 75   | 0.6062          | 0.64     |
| No log        | 4.0   | 100  | 0.6097          | 0.61     |
| No log        | 5.0   | 125  | 0.6064          | 0.62     |
| No log        | 6.0   | 150  | 0.5974          | 0.63     |
| No log        | 7.0   | 175  | 0.5723          | 0.66     |
| No log        | 8.0   | 200  | 0.6179          | 0.63     |
| No log        | 9.0   | 225  | 0.5842          | 0.62     |
| No log        | 10.0  | 250  | 0.6117          | 0.68     |
| No log        | 11.0  | 275  | 0.5444          | 0.64     |
| No log        | 12.0  | 300  | 0.7898          | 0.68     |
| No log        | 13.0  | 325  | 0.6851          | 0.68     |
| No log        | 14.0  | 350  | 0.7716          | 0.69     |
| No log        | 15.0  | 375  | 0.6750          | 0.71     |
| No log        | 16.0  | 400  | 0.7645          | 0.7      |
| No log        | 17.0  | 425  | 0.7338          | 0.7      |
| No log        | 18.0  | 450  | 0.8156          | 0.66     |
| No log        | 19.0  | 475  | 0.7524          | 0.68     |
| 0.7431        | 20.0  | 500  | 0.8516          | 0.65     |
| 0.7431        | 21.0  | 525  | 0.8224          | 0.65     |
| 0.7431        | 22.0  | 550  | 1.0607          | 0.67     |
| 0.7431        | 23.0  | 575  | 0.8977          | 0.66     |
| 0.7431        | 24.0  | 600  | 0.7860          | 0.66     |
| 0.7431        | 25.0  | 625  | 0.7285          | 0.66     |
| 0.7431        | 26.0  | 650  | 0.7097          | 0.64     |
| 0.7431        | 27.0  | 675  | 0.7292          | 0.64     |
| 0.7431        | 28.0  | 700  | 0.7131          | 0.65     |
| 0.7431        | 29.0  | 725  | 0.8039          | 0.65     |
| 0.7431        | 30.0  | 750  | 0.7988          | 0.65     |
| 0.7431        | 31.0  | 775  | 0.7809          | 0.64     |
| 0.7431        | 32.0  | 800  | 0.7544          | 0.64     |
| 0.7431        | 33.0  | 825  | 0.7492          | 0.62     |
| 0.7431        | 34.0  | 850  | 0.8206          | 0.64     |
| 0.7431        | 35.0  | 875  | 0.6409          | 0.66     |
| 0.7431        | 36.0  | 900  | 0.7144          | 0.63     |
| 0.7431        | 37.0  | 925  | 0.7414          | 0.63     |
| 0.7431        | 38.0  | 950  | 0.7423          | 0.65     |
| 0.7431        | 39.0  | 975  | 0.7766          | 0.65     |
| 0.3363        | 40.0  | 1000 | 0.7182          | 0.67     |
| 0.3363        | 41.0  | 1025 | 0.7375          | 0.67     |
| 0.3363        | 42.0  | 1050 | 0.7236          | 0.67     |
| 0.3363        | 43.0  | 1075 | 0.7218          | 0.66     |
| 0.3363        | 44.0  | 1100 | 0.7324          | 0.67     |
| 0.3363        | 45.0  | 1125 | 0.7291          | 0.67     |
| 0.3363        | 46.0  | 1150 | 0.6803          | 0.67     |
| 0.3363        | 47.0  | 1175 | 0.6637          | 0.67     |
| 0.3363        | 48.0  | 1200 | 0.7064          | 0.65     |
| 0.3363        | 49.0  | 1225 | 0.6534          | 0.65     |
| 0.3363        | 50.0  | 1250 | 0.7230          | 0.67     |
| 0.3363        | 51.0  | 1275 | 0.7338          | 0.65     |
| 0.3363        | 52.0  | 1300 | 0.6495          | 0.62     |
| 0.3363        | 53.0  | 1325 | 0.6540          | 0.63     |
| 0.3363        | 54.0  | 1350 | 0.6994          | 0.62     |
| 0.3363        | 55.0  | 1375 | 0.7040          | 0.63     |
| 0.3363        | 56.0  | 1400 | 0.6775          | 0.63     |
| 0.3363        | 57.0  | 1425 | 0.6425          | 0.65     |
| 0.3363        | 58.0  | 1450 | 0.6424          | 0.66     |
| 0.3363        | 59.0  | 1475 | 0.6782          | 0.66     |
| 0.2375        | 60.0  | 1500 | 0.6770          | 0.68     |
| 0.2375        | 61.0  | 1525 | 0.7029          | 0.68     |
| 0.2375        | 62.0  | 1550 | 0.6824          | 0.68     |
| 0.2375        | 63.0  | 1575 | 0.6847          | 0.68     |
| 0.2375        | 64.0  | 1600 | 0.6767          | 0.68     |
| 0.2375        | 65.0  | 1625 | 0.6362          | 0.67     |
| 0.2375        | 66.0  | 1650 | 0.6292          | 0.67     |
| 0.2375        | 67.0  | 1675 | 0.6470          | 0.67     |
| 0.2375        | 68.0  | 1700 | 0.6661          | 0.67     |
| 0.2375        | 69.0  | 1725 | 0.6305          | 0.67     |
| 0.2375        | 70.0  | 1750 | 0.6492          | 0.67     |
| 0.2375        | 71.0  | 1775 | 0.6525          | 0.67     |
| 0.2375        | 72.0  | 1800 | 0.6339          | 0.67     |
| 0.2375        | 73.0  | 1825 | 0.6621          | 0.67     |
| 0.2375        | 74.0  | 1850 | 0.6562          | 0.67     |
| 0.2375        | 75.0  | 1875 | 0.6397          | 0.67     |
| 0.2375        | 76.0  | 1900 | 0.6496          | 0.67     |
| 0.2375        | 77.0  | 1925 | 0.6402          | 0.67     |
| 0.2375        | 78.0  | 1950 | 0.6382          | 0.67     |
| 0.2375        | 79.0  | 1975 | 0.6407          | 0.67     |
| 0.2102        | 80.0  | 2000 | 0.6391          | 0.67     |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
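
A quick way to confirm a matching environment is to check the installed versions against the list above (a minimal sketch):

```python
# Compare installed versions against the ones this card reports.
import datasets, tokenizers, torch, transformers

print(transformers.__version__)  # expected: 4.26.1
print(torch.__version__)         # expected: 2.0.1+cu118
print(datasets.__version__)      # expected: 2.12.0
print(tokenizers.__version__)    # expected: 0.13.3
```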
