20230826114434

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5403
  • Accuracy: 0.64
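
For reference, a minimal inference sketch follows. The card does not state the pipeline task; the sketch assumes a sequence-classification head (consistent with the accuracy metric above), and the example sentence pair is purely illustrative.

```python
# Minimal inference sketch; assumes a sequence-classification head.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230826114434"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# The SuperGLUE subset (and hence the expected input format) is not documented;
# this sentence pair is a placeholder.
inputs = tokenizer("The cat sat on the mat.", "A cat is on a mat.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label id
```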

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
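
Pending those details, a hedged sketch of loading the base dataset with the datasets library is shown below. The SuperGLUE subset actually used is not documented, so the "boolq" config here is a placeholder, not a confirmed choice.

```python
from datasets import load_dataset

# "boolq" is a placeholder config; the card does not say which SuperGLUE subset was used.
dataset = load_dataset("super_glue", "boolq")
print(dataset["train"][0])             # one raw training example
print(dataset["validation"].features)  # field names and label schema
```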

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
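
Taken together, these settings can be reconstructed as a TrainingArguments sketch (written against transformers 4.26.1, the version listed below). Only the values listed above come from the card; the output directory and the commented Trainer wiring are placeholders.

```python
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./20230826114434",   # placeholder path
    learning_rate=1e-3,
    per_device_train_batch_size=16,  # reported as train_batch_size above
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",     # matches the per-epoch results table below
)

# trainer = Trainer(model=model, args=training_args,
#                   train_dataset=train_dataset, eval_dataset=eval_dataset)
# trainer.train()
```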

Training results

("No log" indicates that no training loss had been recorded yet; the training loss is logged every 500 steps, so the first value appears at step 500.)

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 25   | 0.5448          | 0.65     |
| No log        | 2.0   | 50   | 0.5352          | 0.66     |
| No log        | 3.0   | 75   | 0.5467          | 0.65     |
| No log        | 4.0   | 100  | 0.5432          | 0.65     |
| No log        | 5.0   | 125  | 0.5446          | 0.66     |
| No log        | 6.0   | 150  | 0.5419          | 0.63     |
| No log        | 7.0   | 175  | 0.5364          | 0.66     |
| No log        | 8.0   | 200  | 0.5400          | 0.66     |
| No log        | 9.0   | 225  | 0.5460          | 0.66     |
| No log        | 10.0  | 250  | 0.5479          | 0.66     |
| No log        | 11.0  | 275  | 0.5429          | 0.66     |
| No log        | 12.0  | 300  | 0.5363          | 0.66     |
| No log        | 13.0  | 325  | 0.5432          | 0.66     |
| No log        | 14.0  | 350  | 0.5446          | 0.63     |
| No log        | 15.0  | 375  | 0.5619          | 0.65     |
| No log        | 16.0  | 400  | 0.5400          | 0.66     |
| No log        | 17.0  | 425  | 0.5395          | 0.66     |
| No log        | 18.0  | 450  | 0.5439          | 0.66     |
| No log        | 19.0  | 475  | 0.5420          | 0.66     |
| 0.6126        | 20.0  | 500  | 0.5402          | 0.66     |
| 0.6126        | 21.0  | 525  | 0.5431          | 0.65     |
| 0.6126        | 22.0  | 550  | 0.5421          | 0.62     |
| 0.6126        | 23.0  | 575  | 0.5432          | 0.65     |
| 0.6126        | 24.0  | 600  | 0.5438          | 0.65     |
| 0.6126        | 25.0  | 625  | 0.5364          | 0.64     |
| 0.6126        | 26.0  | 650  | 0.5414          | 0.63     |
| 0.6126        | 27.0  | 675  | 0.5395          | 0.65     |
| 0.6126        | 28.0  | 700  | 0.5440          | 0.65     |
| 0.6126        | 29.0  | 725  | 0.5446          | 0.63     |
| 0.6126        | 30.0  | 750  | 0.5472          | 0.59     |
| 0.6126        | 31.0  | 775  | 0.5419          | 0.65     |
| 0.6126        | 32.0  | 800  | 0.5413          | 0.62     |
| 0.6126        | 33.0  | 825  | 0.5530          | 0.62     |
| 0.6126        | 34.0  | 850  | 0.5461          | 0.62     |
| 0.6126        | 35.0  | 875  | 0.5440          | 0.64     |
| 0.6126        | 36.0  | 900  | 0.5437          | 0.64     |
| 0.6126        | 37.0  | 925  | 0.5435          | 0.63     |
| 0.6126        | 38.0  | 950  | 0.5482          | 0.63     |
| 0.6126        | 39.0  | 975  | 0.5449          | 0.64     |
| 0.6037        | 40.0  | 1000 | 0.5442          | 0.64     |
| 0.6037        | 41.0  | 1025 | 0.5377          | 0.62     |
| 0.6037        | 42.0  | 1050 | 0.5411          | 0.64     |
| 0.6037        | 43.0  | 1075 | 0.5482          | 0.59     |
| 0.6037        | 44.0  | 1100 | 0.5494          | 0.62     |
| 0.6037        | 45.0  | 1125 | 0.5510          | 0.60     |
| 0.6037        | 46.0  | 1150 | 0.5472          | 0.61     |
| 0.6037        | 47.0  | 1175 | 0.5416          | 0.64     |
| 0.6037        | 48.0  | 1200 | 0.5397          | 0.64     |
| 0.6037        | 49.0  | 1225 | 0.5417          | 0.64     |
| 0.6037        | 50.0  | 1250 | 0.5390          | 0.64     |
| 0.6037        | 51.0  | 1275 | 0.5389          | 0.63     |
| 0.6037        | 52.0  | 1300 | 0.5366          | 0.64     |
| 0.6037        | 53.0  | 1325 | 0.5368          | 0.64     |
| 0.6037        | 54.0  | 1350 | 0.5393          | 0.64     |
| 0.6037        | 55.0  | 1375 | 0.5378          | 0.64     |
| 0.6037        | 56.0  | 1400 | 0.5391          | 0.64     |
| 0.6037        | 57.0  | 1425 | 0.5383          | 0.63     |
| 0.6037        | 58.0  | 1450 | 0.5379          | 0.63     |
| 0.6037        | 59.0  | 1475 | 0.5381          | 0.64     |
| 0.6021        | 60.0  | 1500 | 0.5410          | 0.64     |
| 0.6021        | 61.0  | 1525 | 0.5401          | 0.64     |
| 0.6021        | 62.0  | 1550 | 0.5403          | 0.64     |
| 0.6021        | 63.0  | 1575 | 0.5411          | 0.64     |
| 0.6021        | 64.0  | 1600 | 0.5415          | 0.64     |
| 0.6021        | 65.0  | 1625 | 0.5415          | 0.64     |
| 0.6021        | 66.0  | 1650 | 0.5409          | 0.64     |
| 0.6021        | 67.0  | 1675 | 0.5419          | 0.64     |
| 0.6021        | 68.0  | 1700 | 0.5401          | 0.64     |
| 0.6021        | 69.0  | 1725 | 0.5424          | 0.64     |
| 0.6021        | 70.0  | 1750 | 0.5420          | 0.64     |
| 0.6021        | 71.0  | 1775 | 0.5415          | 0.64     |
| 0.6021        | 72.0  | 1800 | 0.5391          | 0.64     |
| 0.6021        | 73.0  | 1825 | 0.5396          | 0.64     |
| 0.6021        | 74.0  | 1850 | 0.5396          | 0.64     |
| 0.6021        | 75.0  | 1875 | 0.5405          | 0.64     |
| 0.6021        | 76.0  | 1900 | 0.5404          | 0.64     |
| 0.6021        | 77.0  | 1925 | 0.5400          | 0.64     |
| 0.6021        | 78.0  | 1950 | 0.5401          | 0.64     |
| 0.6021        | 79.0  | 1975 | 0.5403          | 0.64     |
| 0.5946        | 80.0  | 2000 | 0.5403          | 0.64     |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3