
20230826130948

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5311
  • Accuracy: 0.65
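
The checkpoint can be loaded with the standard transformers API. Below is a minimal inference sketch; it assumes the model carries a sequence-classification head over a sentence pair, since the specific SuperGLUE subtask is not stated in this card.

```python
# Minimal inference sketch. Assumption: the checkpoint is a
# sequence-classification head over a sentence pair; the exact
# SuperGLUE subtask (and hence the label meanings) is not stated
# in this card.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230826130948"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer(
    "The first input sentence.",   # placeholder text
    "The second input sentence.",  # placeholder text
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```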

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a sketch mapping them onto TrainingArguments follows the list:

  • learning_rate: 0.001
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
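
A hedged sketch of how the values above map onto transformers TrainingArguments. The output path is hypothetical, and the dataset/task wiring is omitted because the card does not name the SuperGLUE subtask.

```python
# Sketch reconstructing the listed hyperparameters as TrainingArguments.
# "./output" is a hypothetical path, not taken from this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./output",          # hypothetical
    learning_rate=1e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",    # assumption: one eval per epoch,
                                    # consistent with the results table
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the optimizer
# Trainer builds by default, so no explicit optimizer override is needed.
```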

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 25   | 0.5416          | 0.66     |
| No log        | 2.0   | 50   | 0.5394          | 0.64     |
| No log        | 3.0   | 75   | 0.5376          | 0.65     |
| No log        | 4.0   | 100  | 0.5476          | 0.65     |
| No log        | 5.0   | 125  | 0.5371          | 0.64     |
| No log        | 6.0   | 150  | 0.5442          | 0.63     |
| No log        | 7.0   | 175  | 0.5413          | 0.65     |
| No log        | 8.0   | 200  | 0.5381          | 0.65     |
| No log        | 9.0   | 225  | 0.5366          | 0.65     |
| No log        | 10.0  | 250  | 0.5402          | 0.65     |
| No log        | 11.0  | 275  | 0.5405          | 0.65     |
| No log        | 12.0  | 300  | 0.5396          | 0.65     |
| No log        | 13.0  | 325  | 0.5379          | 0.66     |
| No log        | 14.0  | 350  | 0.5375          | 0.66     |
| No log        | 15.0  | 375  | 0.5393          | 0.65     |
| No log        | 16.0  | 400  | 0.5371          | 0.66     |
| No log        | 17.0  | 425  | 0.5286          | 0.66     |
| No log        | 18.0  | 450  | 0.5313          | 0.65     |
| No log        | 19.0  | 475  | 0.5427          | 0.62     |
| 0.616         | 20.0  | 500  | 0.5469          | 0.63     |
| 0.616         | 21.0  | 525  | 0.5348          | 0.65     |
| 0.616         | 22.0  | 550  | 0.5352          | 0.64     |
| 0.616         | 23.0  | 575  | 0.5434          | 0.63     |
| 0.616         | 24.0  | 600  | 0.5437          | 0.62     |
| 0.616         | 25.0  | 625  | 0.5344          | 0.65     |
| 0.616         | 26.0  | 650  | 0.5344          | 0.66     |
| 0.616         | 27.0  | 675  | 0.5319          | 0.66     |
| 0.616         | 28.0  | 700  | 0.5329          | 0.66     |
| 0.616         | 29.0  | 725  | 0.5313          | 0.66     |
| 0.616         | 30.0  | 750  | 0.5321          | 0.66     |
| 0.616         | 31.0  | 775  | 0.5342          | 0.65     |
| 0.616         | 32.0  | 800  | 0.5364          | 0.66     |
| 0.616         | 33.0  | 825  | 0.5350          | 0.65     |
| 0.616         | 34.0  | 850  | 0.5382          | 0.65     |
| 0.616         | 35.0  | 875  | 0.5330          | 0.65     |
| 0.616         | 36.0  | 900  | 0.5361          | 0.64     |
| 0.616         | 37.0  | 925  | 0.5379          | 0.63     |
| 0.616         | 38.0  | 950  | 0.5314          | 0.64     |
| 0.616         | 39.0  | 975  | 0.5308          | 0.65     |
| 0.6054        | 40.0  | 1000 | 0.5348          | 0.65     |
| 0.6054        | 41.0  | 1025 | 0.5374          | 0.64     |
| 0.6054        | 42.0  | 1050 | 0.5363          | 0.64     |
| 0.6054        | 43.0  | 1075 | 0.5361          | 0.64     |
| 0.6054        | 44.0  | 1100 | 0.5333          | 0.65     |
| 0.6054        | 45.0  | 1125 | 0.5346          | 0.65     |
| 0.6054        | 46.0  | 1150 | 0.5354          | 0.65     |
| 0.6054        | 47.0  | 1175 | 0.5338          | 0.64     |
| 0.6054        | 48.0  | 1200 | 0.5332          | 0.65     |
| 0.6054        | 49.0  | 1225 | 0.5334          | 0.65     |
| 0.6054        | 50.0  | 1250 | 0.5361          | 0.65     |
| 0.6054        | 51.0  | 1275 | 0.5311          | 0.65     |
| 0.6054        | 52.0  | 1300 | 0.5332          | 0.66     |
| 0.6054        | 53.0  | 1325 | 0.5312          | 0.65     |
| 0.6054        | 54.0  | 1350 | 0.5334          | 0.65     |
| 0.6054        | 55.0  | 1375 | 0.5306          | 0.66     |
| 0.6054        | 56.0  | 1400 | 0.5326          | 0.65     |
| 0.6054        | 57.0  | 1425 | 0.5336          | 0.65     |
| 0.6054        | 58.0  | 1450 | 0.5361          | 0.65     |
| 0.6054        | 59.0  | 1475 | 0.5359          | 0.63     |
| 0.5996        | 60.0  | 1500 | 0.5342          | 0.65     |
| 0.5996        | 61.0  | 1525 | 0.5346          | 0.66     |
| 0.5996        | 62.0  | 1550 | 0.5333          | 0.64     |
| 0.5996        | 63.0  | 1575 | 0.5322          | 0.65     |
| 0.5996        | 64.0  | 1600 | 0.5307          | 0.65     |
| 0.5996        | 65.0  | 1625 | 0.5298          | 0.65     |
| 0.5996        | 66.0  | 1650 | 0.5300          | 0.65     |
| 0.5996        | 67.0  | 1675 | 0.5306          | 0.65     |
| 0.5996        | 68.0  | 1700 | 0.5311          | 0.65     |
| 0.5996        | 69.0  | 1725 | 0.5318          | 0.65     |
| 0.5996        | 70.0  | 1750 | 0.5320          | 0.65     |
| 0.5996        | 71.0  | 1775 | 0.5320          | 0.65     |
| 0.5996        | 72.0  | 1800 | 0.5309          | 0.65     |
| 0.5996        | 73.0  | 1825 | 0.5307          | 0.65     |
| 0.5996        | 74.0  | 1850 | 0.5306          | 0.65     |
| 0.5996        | 75.0  | 1875 | 0.5314          | 0.65     |
| 0.5996        | 76.0  | 1900 | 0.5311          | 0.65     |
| 0.5996        | 77.0  | 1925 | 0.5311          | 0.65     |
| 0.5996        | 78.0  | 1950 | 0.5311          | 0.65     |
| 0.5996        | 79.0  | 1975 | 0.5311          | 0.65     |
| 0.596         | 80.0  | 2000 | 0.5311          | 0.65     |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
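
To reproduce the environment, the version pins above can be verified at runtime; a small sanity-check sketch:

```python
# Sanity-check that the runtime matches the versions listed above.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": ("4.26.1", transformers.__version__),
    "torch": ("2.0.1+cu118", torch.__version__),
    "datasets": ("2.12.0", datasets.__version__),
    "tokenizers": ("0.13.3", tokenizers.__version__),
}
for name, (want, have) in expected.items():
    if have != want:
        print(f"warning: {name} is {have}, card was produced with {want}")
```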
