
20230826100510

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5641
  • Accuracy: 0.76
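
The card does not say which SuperGLUE task this checkpoint was fine-tuned on, so the following is only a minimal inference sketch: it assumes the model loads with a standard sequence-classification head and that the task is a sentence-pair problem (both assumptions, not confirmed by this card).

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230826100510"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Sentence-pair input is an assumption; adapt to the actual SuperGLUE task.
inputs = tokenizer(
    "A premise sentence.",
    "A hypothesis sentence.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label id
```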

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of an equivalent Trainer setup follows the list):

  • learning_rate: 0.05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
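
The sketch below mirrors the hyperparameters listed above. The original training script is not part of this card, so the SuperGLUE task ("rte") and the premise/hypothesis preprocessing are placeholder assumptions:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-large-cased")
model = AutoModelForSequenceClassification.from_pretrained("bert-large-cased")

raw = load_dataset("super_glue", "rte")  # placeholder task, not confirmed by the card

def tokenize(batch):
    # Sentence-pair encoding; the field names are specific to the assumed task.
    return tokenizer(batch["premise"], batch["hypothesis"],
                     truncation=True, padding="max_length", max_length=128)

encoded = raw.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="20230826100510",
    learning_rate=0.05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",  # the results table reports one eval per epoch
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
)
trainer.train()
```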

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 25   | 0.7263          | 0.4      |
| No log        | 2.0   | 50   | 0.6115          | 0.6      |
| No log        | 3.0   | 75   | 0.5427          | 0.62     |
| No log        | 4.0   | 100  | 0.5319          | 0.61     |
| No log        | 5.0   | 125  | 0.5818          | 0.55     |
| No log        | 6.0   | 150  | 0.5093          | 0.68     |
| No log        | 7.0   | 175  | 0.7841          | 0.63     |
| No log        | 8.0   | 200  | 0.7629          | 0.68     |
| No log        | 9.0   | 225  | 0.5874          | 0.69     |
| No log        | 10.0  | 250  | 0.5228          | 0.71     |
| No log        | 11.0  | 275  | 0.8439          | 0.74     |
| No log        | 12.0  | 300  | 0.8243          | 0.71     |
| No log        | 13.0  | 325  | 0.5670          | 0.65     |
| No log        | 14.0  | 350  | 0.5601          | 0.61     |
| No log        | 15.0  | 375  | 0.6452          | 0.64     |
| No log        | 16.0  | 400  | 0.5239          | 0.69     |
| No log        | 17.0  | 425  | 0.7315          | 0.66     |
| No log        | 18.0  | 450  | 0.6651          | 0.67     |
| No log        | 19.0  | 475  | 0.9040          | 0.72     |
| 1.3727        | 20.0  | 500  | 0.5786          | 0.73     |
| 1.3727        | 21.0  | 525  | 0.7333          | 0.69     |
| 1.3727        | 22.0  | 550  | 0.7584          | 0.7      |
| 1.3727        | 23.0  | 575  | 0.9901          | 0.71     |
| 1.3727        | 24.0  | 600  | 0.5711          | 0.7      |
| 1.3727        | 25.0  | 625  | 0.5870          | 0.67     |
| 1.3727        | 26.0  | 650  | 0.5832          | 0.7      |
| 1.3727        | 27.0  | 675  | 0.9777          | 0.72     |
| 1.3727        | 28.0  | 700  | 0.6448          | 0.71     |
| 1.3727        | 29.0  | 725  | 0.8739          | 0.71     |
| 1.3727        | 30.0  | 750  | 0.6710          | 0.68     |
| 1.3727        | 31.0  | 775  | 0.5919          | 0.71     |
| 1.3727        | 32.0  | 800  | 0.7616          | 0.7      |
| 1.3727        | 33.0  | 825  | 0.5837          | 0.72     |
| 1.3727        | 34.0  | 850  | 1.0103          | 0.74     |
| 1.3727        | 35.0  | 875  | 0.7008          | 0.73     |
| 1.3727        | 36.0  | 900  | 1.0161          | 0.72     |
| 1.3727        | 37.0  | 925  | 0.6911          | 0.75     |
| 1.3727        | 38.0  | 950  | 0.6451          | 0.75     |
| 1.3727        | 39.0  | 975  | 0.7190          | 0.74     |
| 0.7534        | 40.0  | 1000 | 0.5164          | 0.74     |
| 0.7534        | 41.0  | 1025 | 0.4995          | 0.72     |
| 0.7534        | 42.0  | 1050 | 0.5840          | 0.75     |
| 0.7534        | 43.0  | 1075 | 0.7395          | 0.75     |
| 0.7534        | 44.0  | 1100 | 0.6374          | 0.72     |
| 0.7534        | 45.0  | 1125 | 0.7467          | 0.73     |
| 0.7534        | 46.0  | 1150 | 0.6876          | 0.74     |
| 0.7534        | 47.0  | 1175 | 0.5959          | 0.74     |
| 0.7534        | 48.0  | 1200 | 0.5625          | 0.74     |
| 0.7534        | 49.0  | 1225 | 0.6837          | 0.75     |
| 0.7534        | 50.0  | 1250 | 0.6766          | 0.76     |
| 0.7534        | 51.0  | 1275 | 0.6266          | 0.75     |
| 0.7534        | 52.0  | 1300 | 0.6642          | 0.74     |
| 0.7534        | 53.0  | 1325 | 0.6202          | 0.74     |
| 0.7534        | 54.0  | 1350 | 0.6398          | 0.75     |
| 0.7534        | 55.0  | 1375 | 0.6689          | 0.75     |
| 0.7534        | 56.0  | 1400 | 0.6629          | 0.76     |
| 0.7534        | 57.0  | 1425 | 0.5903          | 0.76     |
| 0.7534        | 58.0  | 1450 | 0.6133          | 0.77     |
| 0.7534        | 59.0  | 1475 | 0.6885          | 0.76     |
| 0.4477        | 60.0  | 1500 | 0.5950          | 0.76     |
| 0.4477        | 61.0  | 1525 | 0.5715          | 0.75     |
| 0.4477        | 62.0  | 1550 | 0.6111          | 0.76     |
| 0.4477        | 63.0  | 1575 | 0.6023          | 0.76     |
| 0.4477        | 64.0  | 1600 | 0.5793          | 0.76     |
| 0.4477        | 65.0  | 1625 | 0.5727          | 0.74     |
| 0.4477        | 66.0  | 1650 | 0.5606          | 0.76     |
| 0.4477        | 67.0  | 1675 | 0.5970          | 0.76     |
| 0.4477        | 68.0  | 1700 | 0.5602          | 0.76     |
| 0.4477        | 69.0  | 1725 | 0.5781          | 0.75     |
| 0.4477        | 70.0  | 1750 | 0.6142          | 0.76     |
| 0.4477        | 71.0  | 1775 | 0.5758          | 0.76     |
| 0.4477        | 72.0  | 1800 | 0.5650          | 0.75     |
| 0.4477        | 73.0  | 1825 | 0.5823          | 0.76     |
| 0.4477        | 74.0  | 1850 | 0.5547          | 0.76     |
| 0.4477        | 75.0  | 1875 | 0.5637          | 0.76     |
| 0.4477        | 76.0  | 1900 | 0.5806          | 0.76     |
| 0.4477        | 77.0  | 1925 | 0.5602          | 0.76     |
| 0.4477        | 78.0  | 1950 | 0.5708          | 0.76     |
| 0.4477        | 79.0  | 1975 | 0.5624          | 0.76     |
| 0.3287        | 80.0  | 2000 | 0.5641          | 0.76     |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
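
A quick way to confirm a local environment matches the versions above (a convenience snippet, not part of the original card):

```python
import datasets, tokenizers, torch, transformers

print("Transformers:", transformers.__version__)  # card reports 4.26.1
print("PyTorch:", torch.__version__)              # card reports 2.0.1+cu118
print("Datasets:", datasets.__version__)          # card reports 2.12.0
print("Tokenizers:", tokenizers.__version__)      # card reports 0.13.3
```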