
20230826022757

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 0.5491
  • Accuracy: 0.74
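The checkpoint loads with the standard sequence-classification classes. A minimal inference sketch, assuming the repo id dkqjrm/20230826022757 and a two-label sentence-pair task; the card does not say which SuperGLUE subset was used, so the example inputs are placeholders:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed repo id; adjust if the checkpoint lives under another name.
model_id = "dkqjrm/20230826022757"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Most SuperGLUE subsets are sentence-pair classification, so the two
# texts are encoded together; the actual input format expected by this
# checkpoint is undocumented.
inputs = tokenizer("First sentence.", "Second sentence.", return_tensors="pt")
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)
print(probs)
```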

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 0.01
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
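These values map directly onto transformers.TrainingArguments. A minimal reproduction sketch under stated assumptions: the card does not name the SuperGLUE subset, so "boolq" below is a hypothetical stand-in (the field names in the tokenize function match that subset only), and the output_dir is arbitrary:

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Hypothetical subset; swap in the actual SuperGLUE task if known.
raw = load_dataset("super_glue", "boolq")
tokenizer = AutoTokenizer.from_pretrained("bert-large-cased")

def tokenize(batch):
    # "question"/"passage" are BoolQ field names; other subsets differ.
    return tokenizer(batch["question"], batch["passage"],
                     truncation=True, max_length=256)

encoded = raw.map(tokenize, batched=True)
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-large-cased", num_labels=2)

# Hyperparameters copied from the list above.
args = TrainingArguments(
    output_dir="20230826022757",
    learning_rate=0.01,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    num_train_epochs=80.0,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    evaluation_strategy="epoch",  # the results table shows one eval per epoch
)

trainer = Trainer(model=model, args=args,
                  tokenizer=tokenizer,  # enables dynamic padding in collation
                  train_dataset=encoded["train"],
                  eval_dataset=encoded["validation"])
trainer.train()
```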

Training results

Training loss is the running average logged every 500 steps (the Trainer default), so epochs completed before step 500 show "No log".

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 25   | 0.6588          | 0.44     |
| No log        | 2.0   | 50   | 0.6258          | 0.63     |
| No log        | 3.0   | 75   | 0.6839          | 0.66     |
| No log        | 4.0   | 100  | 0.6238          | 0.63     |
| No log        | 5.0   | 125  | 0.5878          | 0.64     |
| No log        | 6.0   | 150  | 0.5895          | 0.61     |
| No log        | 7.0   | 175  | 0.5951          | 0.63     |
| No log        | 8.0   | 200  | 0.6701          | 0.62     |
| No log        | 9.0   | 225  | 0.5858          | 0.62     |
| No log        | 10.0  | 250  | 0.6603          | 0.64     |
| No log        | 11.0  | 275  | 0.5708          | 0.65     |
| No log        | 12.0  | 300  | 0.5657          | 0.63     |
| No log        | 13.0  | 325  | 0.5691          | 0.68     |
| No log        | 14.0  | 350  | 0.5820          | 0.67     |
| No log        | 15.0  | 375  | 0.5245          | 0.70     |
| No log        | 16.0  | 400  | 0.6291          | 0.70     |
| No log        | 17.0  | 425  | 0.6177          | 0.70     |
| No log        | 18.0  | 450  | 0.7375          | 0.70     |
| No log        | 19.0  | 475  | 0.6500          | 0.68     |
| 0.6647        | 20.0  | 500  | 0.6727          | 0.71     |
| 0.6647        | 21.0  | 525  | 0.7042          | 0.72     |
| 0.6647        | 22.0  | 550  | 0.7448          | 0.71     |
| 0.6647        | 23.0  | 575  | 0.6157          | 0.72     |
| 0.6647        | 24.0  | 600  | 0.7661          | 0.72     |
| 0.6647        | 25.0  | 625  | 0.6832          | 0.72     |
| 0.6647        | 26.0  | 650  | 0.6971          | 0.72     |
| 0.6647        | 27.0  | 675  | 0.6274          | 0.72     |
| 0.6647        | 28.0  | 700  | 0.6846          | 0.73     |
| 0.6647        | 29.0  | 725  | 0.6319          | 0.73     |
| 0.6647        | 30.0  | 750  | 0.7387          | 0.74     |
| 0.6647        | 31.0  | 775  | 0.6482          | 0.74     |
| 0.6647        | 32.0  | 800  | 0.6043          | 0.73     |
| 0.6647        | 33.0  | 825  | 0.6589          | 0.72     |
| 0.6647        | 34.0  | 850  | 0.7023          | 0.74     |
| 0.6647        | 35.0  | 875  | 0.6197          | 0.74     |
| 0.6647        | 36.0  | 900  | 0.6325          | 0.75     |
| 0.6647        | 37.0  | 925  | 0.6264          | 0.75     |
| 0.6647        | 38.0  | 950  | 0.6198          | 0.73     |
| 0.6647        | 39.0  | 975  | 0.6239          | 0.74     |
| 0.2917        | 40.0  | 1000 | 0.6072          | 0.74     |
| 0.2917        | 41.0  | 1025 | 0.6354          | 0.74     |
| 0.2917        | 42.0  | 1050 | 0.5724          | 0.74     |
| 0.2917        | 43.0  | 1075 | 0.5799          | 0.74     |
| 0.2917        | 44.0  | 1100 | 0.5863          | 0.75     |
| 0.2917        | 45.0  | 1125 | 0.6033          | 0.74     |
| 0.2917        | 46.0  | 1150 | 0.6735          | 0.73     |
| 0.2917        | 47.0  | 1175 | 0.6068          | 0.73     |
| 0.2917        | 48.0  | 1200 | 0.6064          | 0.73     |
| 0.2917        | 49.0  | 1225 | 0.6205          | 0.74     |
| 0.2917        | 50.0  | 1250 | 0.5605          | 0.74     |
| 0.2917        | 51.0  | 1275 | 0.6015          | 0.75     |
| 0.2917        | 52.0  | 1300 | 0.5771          | 0.75     |
| 0.2917        | 53.0  | 1325 | 0.5400          | 0.75     |
| 0.2917        | 54.0  | 1350 | 0.5911          | 0.76     |
| 0.2917        | 55.0  | 1375 | 0.5665          | 0.76     |
| 0.2917        | 56.0  | 1400 | 0.5658          | 0.75     |
| 0.2917        | 57.0  | 1425 | 0.5775          | 0.75     |
| 0.2917        | 58.0  | 1450 | 0.5690          | 0.74     |
| 0.2917        | 59.0  | 1475 | 0.5689          | 0.75     |
| 0.2234        | 60.0  | 1500 | 0.5793          | 0.74     |
| 0.2234        | 61.0  | 1525 | 0.5490          | 0.75     |
| 0.2234        | 62.0  | 1550 | 0.5899          | 0.75     |
| 0.2234        | 63.0  | 1575 | 0.5612          | 0.75     |
| 0.2234        | 64.0  | 1600 | 0.5451          | 0.75     |
| 0.2234        | 65.0  | 1625 | 0.5690          | 0.74     |
| 0.2234        | 66.0  | 1650 | 0.5391          | 0.74     |
| 0.2234        | 67.0  | 1675 | 0.5607          | 0.74     |
| 0.2234        | 68.0  | 1700 | 0.5451          | 0.74     |
| 0.2234        | 69.0  | 1725 | 0.5675          | 0.74     |
| 0.2234        | 70.0  | 1750 | 0.5486          | 0.74     |
| 0.2234        | 71.0  | 1775 | 0.5502          | 0.74     |
| 0.2234        | 72.0  | 1800 | 0.5445          | 0.74     |
| 0.2234        | 73.0  | 1825 | 0.5577          | 0.74     |
| 0.2234        | 74.0  | 1850 | 0.5533          | 0.74     |
| 0.2234        | 75.0  | 1875 | 0.5534          | 0.74     |
| 0.2234        | 76.0  | 1900 | 0.5549          | 0.74     |
| 0.2234        | 77.0  | 1925 | 0.5495          | 0.74     |
| 0.2234        | 78.0  | 1950 | 0.5492          | 0.74     |
| 0.2234        | 79.0  | 1975 | 0.5488          | 0.74     |
| 0.2032        | 80.0  | 2000 | 0.5491          | 0.74     |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
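
To reproduce the environment, the versions above can be pinned; a minimal requirements sketch (note that the +cu118 build of torch is served from the PyTorch CUDA 11.8 wheel index, not PyPI):

```
transformers==4.26.1
torch==2.0.1+cu118
datasets==2.12.0
tokenizers==0.13.3
```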
