20230826065732

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a loading sketch follows the results):

  • Loss: 0.5294
  • Accuracy: 0.67
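The repository metadata does not declare a pipeline type, so the exact SuperGLUE task is unknown. Assuming the checkpoint carries a sequence-classification head (typical for SuperGLUE fine-tunes), a minimal loading sketch might look like the following; the sentence-pair input is purely illustrative and may not match the format used during training:

```python
# Minimal sketch: load the checkpoint as a sequence classifier.
# Assumption: the model was fine-tuned with a classification head; the
# SuperGLUE task and expected input format are not documented in this card.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "dkqjrm/20230826065732"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Illustrative sentence-pair input (many SuperGLUE tasks are pair classification).
inputs = tokenizer("The cat sat on the mat.", "A cat is on a mat.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))
```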

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 0.02
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
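
These values map directly onto Hugging Face TrainingArguments; a sketch reproducing them is below. The output directory name is a placeholder, the Adam betas/epsilon match the Trainer's AdamW defaults, and per-epoch evaluation is inferred from the results table:

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters above as TrainingArguments (Transformers 4.26).
# "output_dir" is a placeholder; evaluation_strategy="epoch" is inferred from
# the per-epoch rows in the results table below.
training_args = TrainingArguments(
    output_dir="20230826065732",
    learning_rate=0.02,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",
)
```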

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 25 | 0.6448 | 0.4 |
| No log | 2.0 | 50 | 0.7950 | 0.65 |
| No log | 3.0 | 75 | 0.6181 | 0.54 |
| No log | 4.0 | 100 | 0.5601 | 0.6 |
| No log | 5.0 | 125 | 0.5816 | 0.42 |
| No log | 6.0 | 150 | 0.5957 | 0.43 |
| No log | 7.0 | 175 | 0.5331 | 0.61 |
| No log | 8.0 | 200 | 0.5507 | 0.61 |
| No log | 9.0 | 225 | 0.5438 | 0.62 |
| No log | 10.0 | 250 | 0.5455 | 0.65 |
| No log | 11.0 | 275 | 0.5141 | 0.65 |
| No log | 12.0 | 300 | 0.5019 | 0.71 |
| No log | 13.0 | 325 | 0.6824 | 0.7 |
| No log | 14.0 | 350 | 0.5735 | 0.73 |
| No log | 15.0 | 375 | 0.5578 | 0.69 |
| No log | 16.0 | 400 | 0.5607 | 0.72 |
| No log | 17.0 | 425 | 0.5974 | 0.71 |
| No log | 18.0 | 450 | 0.8102 | 0.71 |
| No log | 19.0 | 475 | 0.6757 | 0.73 |
| 0.7598 | 20.0 | 500 | 0.5266 | 0.74 |
| 0.7598 | 21.0 | 525 | 0.6271 | 0.69 |
| 0.7598 | 22.0 | 550 | 0.6341 | 0.7 |
| 0.7598 | 23.0 | 575 | 0.6874 | 0.7 |
| 0.7598 | 24.0 | 600 | 0.5264 | 0.72 |
| 0.7598 | 25.0 | 625 | 0.5148 | 0.73 |
| 0.7598 | 26.0 | 650 | 0.5760 | 0.77 |
| 0.7598 | 27.0 | 675 | 0.6581 | 0.71 |
| 0.7598 | 28.0 | 700 | 0.6479 | 0.71 |
| 0.7598 | 29.0 | 725 | 0.6960 | 0.69 |
| 0.7598 | 30.0 | 750 | 0.6919 | 0.7 |
| 0.7598 | 31.0 | 775 | 0.6421 | 0.68 |
| 0.7598 | 32.0 | 800 | 0.5681 | 0.68 |
| 0.7598 | 33.0 | 825 | 0.5631 | 0.68 |
| 0.7598 | 34.0 | 850 | 0.5676 | 0.66 |
| 0.7598 | 35.0 | 875 | 0.5389 | 0.68 |
| 0.7598 | 36.0 | 900 | 0.6267 | 0.68 |
| 0.7598 | 37.0 | 925 | 0.6107 | 0.65 |
| 0.7598 | 38.0 | 950 | 0.5359 | 0.66 |
| 0.7598 | 39.0 | 975 | 0.5741 | 0.67 |
| 0.4266 | 40.0 | 1000 | 0.5928 | 0.69 |
| 0.4266 | 41.0 | 1025 | 0.5307 | 0.68 |
| 0.4266 | 42.0 | 1050 | 0.5909 | 0.66 |
| 0.4266 | 43.0 | 1075 | 0.5733 | 0.66 |
| 0.4266 | 44.0 | 1100 | 0.5561 | 0.66 |
| 0.4266 | 45.0 | 1125 | 0.5600 | 0.69 |
| 0.4266 | 46.0 | 1150 | 0.5228 | 0.66 |
| 0.4266 | 47.0 | 1175 | 0.5383 | 0.7 |
| 0.4266 | 48.0 | 1200 | 0.5643 | 0.69 |
| 0.4266 | 49.0 | 1225 | 0.5493 | 0.7 |
| 0.4266 | 50.0 | 1250 | 0.5576 | 0.7 |
| 0.4266 | 51.0 | 1275 | 0.5543 | 0.68 |
| 0.4266 | 52.0 | 1300 | 0.5615 | 0.69 |
| 0.4266 | 53.0 | 1325 | 0.5358 | 0.67 |
| 0.4266 | 54.0 | 1350 | 0.5405 | 0.69 |
| 0.4266 | 55.0 | 1375 | 0.5327 | 0.69 |
| 0.4266 | 56.0 | 1400 | 0.5645 | 0.67 |
| 0.4266 | 57.0 | 1425 | 0.5240 | 0.67 |
| 0.4266 | 58.0 | 1450 | 0.5402 | 0.67 |
| 0.4266 | 59.0 | 1475 | 0.5495 | 0.68 |
| 0.3249 | 60.0 | 1500 | 0.5624 | 0.66 |
| 0.3249 | 61.0 | 1525 | 0.5513 | 0.67 |
| 0.3249 | 62.0 | 1550 | 0.5537 | 0.68 |
| 0.3249 | 63.0 | 1575 | 0.5444 | 0.68 |
| 0.3249 | 64.0 | 1600 | 0.5553 | 0.68 |
| 0.3249 | 65.0 | 1625 | 0.5221 | 0.68 |
| 0.3249 | 66.0 | 1650 | 0.5136 | 0.68 |
| 0.3249 | 67.0 | 1675 | 0.5231 | 0.69 |
| 0.3249 | 68.0 | 1700 | 0.5305 | 0.69 |
| 0.3249 | 69.0 | 1725 | 0.5278 | 0.68 |
| 0.3249 | 70.0 | 1750 | 0.5440 | 0.66 |
| 0.3249 | 71.0 | 1775 | 0.5411 | 0.67 |
| 0.3249 | 72.0 | 1800 | 0.5346 | 0.69 |
| 0.3249 | 73.0 | 1825 | 0.5241 | 0.67 |
| 0.3249 | 74.0 | 1850 | 0.5425 | 0.67 |
| 0.3249 | 75.0 | 1875 | 0.5213 | 0.67 |
| 0.3249 | 76.0 | 1900 | 0.5405 | 0.66 |
| 0.3249 | 77.0 | 1925 | 0.5251 | 0.67 |
| 0.3249 | 78.0 | 1950 | 0.5300 | 0.67 |
| 0.3249 | 79.0 | 1975 | 0.5285 | 0.67 |
| 0.2946 | 80.0 | 2000 | 0.5294 | 0.67 |
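
Training loss was apparently logged every 500 steps, which is why earlier rows show "No log". Note that validation accuracy peaked at 0.77 around epoch 26 but the final checkpoint reports 0.67, so this run did not retain the best checkpoint. A hedged sketch of how that could be configured with the Trainer (not what this run used):

```python
from transformers import TrainingArguments, EarlyStoppingCallback

# Sketch (not this run's actual setup): keep the checkpoint with the best
# eval accuracy instead of the last one, since accuracy peaked at 0.77
# (epoch 26) but ended at 0.67.
args = TrainingArguments(
    output_dir="20230826065732-best",  # placeholder directory name
    evaluation_strategy="epoch",
    save_strategy="epoch",             # must match evaluation_strategy
    load_best_model_at_end=True,
    metric_for_best_model="accuracy",  # assumes compute_metrics returns "accuracy"
    greater_is_better=True,
)
# Pass to Trainer(..., callbacks=[EarlyStoppingCallback(early_stopping_patience=5)])
# to also stop once the metric stops improving.
```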

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
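
For reproducibility, a small sketch (assuming all four packages are installed) to compare the local environment against these versions:

```python
# Sketch: print installed versions next to the ones this model was trained with.
import transformers, torch, datasets, tokenizers

expected = {
    "transformers": "4.26.1",
    "torch": "2.0.1+cu118",
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}
for name, mod in [("transformers", transformers), ("torch", torch),
                  ("datasets", datasets), ("tokenizers", tokenizers)]:
    print(f"{name}: installed {mod.__version__}, trained with {expected[name]}")
```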