
20230826092050

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4268
  • Accuracy: 0.37
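
The card does not record a pipeline type, but since the base model is bert-large-cased and the metric is accuracy, a sequence-classification head is a reasonable assumption. A minimal loading sketch under that assumption:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumes a sequence-classification head; the card does not record the pipeline type.
model_id = "dkqjrm/20230826092050"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
```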

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a matching TrainingArguments sketch follows the list):

  • learning_rate: 0.05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
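
A hedged sketch of TrainingArguments mirroring the values above. The output_dir is a placeholder, and per-device batch sizes equal the reported totals on the assumption of single-device training; all other arguments are Transformers defaults:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./20230826092050",    # hypothetical output path
    learning_rate=0.05,
    per_device_train_batch_size=16,   # assumes a single device
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,                   # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
)
```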

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 25   | 0.5239          | 0.61     |
| No log        | 2.0   | 50   | 0.4231          | 0.45     |
| No log        | 3.0   | 75   | 0.4342          | 0.48     |
| No log        | 4.0   | 100  | 0.4309          | 0.43     |
| No log        | 5.0   | 125  | 0.4262          | 0.58     |
| No log        | 6.0   | 150  | 0.4267          | 0.49     |
| No log        | 7.0   | 175  | 0.4263          | 0.61     |
| No log        | 8.0   | 200  | 0.4268          | 0.49     |
| No log        | 9.0   | 225  | 0.4267          | 0.56     |
| No log        | 10.0  | 250  | 0.4268          | 0.51     |
| No log        | 11.0  | 275  | 0.4275          | 0.4      |
| No log        | 12.0  | 300  | 0.4269          | 0.46     |
| No log        | 13.0  | 325  | 0.4267          | 0.62     |
| No log        | 14.0  | 350  | 0.4267          | 0.55     |
| No log        | 15.0  | 375  | 0.4268          | 0.42     |
| No log        | 16.0  | 400  | 0.4268          | 0.45     |
| No log        | 17.0  | 425  | 0.4270          | 0.44     |
| No log        | 18.0  | 450  | 0.4267          | 0.6      |
| No log        | 19.0  | 475  | 0.4268          | 0.61     |
| 1.2569        | 20.0  | 500  | 0.4268          | 0.38     |
| 1.2569        | 21.0  | 525  | 0.4268          | 0.57     |
| 1.2569        | 22.0  | 550  | 0.4267          | 0.61     |
| 1.2569        | 23.0  | 575  | 0.4267          | 0.59     |
| 1.2569        | 24.0  | 600  | 0.4267          | 0.54     |
| 1.2569        | 25.0  | 625  | 0.4268          | 0.53     |
| 1.2569        | 26.0  | 650  | 0.4268          | 0.38     |
| 1.2569        | 27.0  | 675  | 0.4267          | 0.61     |
| 1.2569        | 28.0  | 700  | 0.4268          | 0.43     |
| 1.2569        | 29.0  | 725  | 0.4268          | 0.61     |
| 1.2569        | 30.0  | 750  | 0.4268          | 0.43     |
| 1.2569        | 31.0  | 775  | 0.4268          | 0.43     |
| 1.2569        | 32.0  | 800  | 0.4268          | 0.54     |
| 1.2569        | 33.0  | 825  | 0.4268          | 0.47     |
| 1.2569        | 34.0  | 850  | 0.4268          | 0.43     |
| 1.2569        | 35.0  | 875  | 0.4268          | 0.43     |
| 1.2569        | 36.0  | 900  | 0.4268          | 0.64     |
| 1.2569        | 37.0  | 925  | 0.4268          | 0.45     |
| 1.2569        | 38.0  | 950  | 0.4268          | 0.43     |
| 1.2569        | 39.0  | 975  | 0.4268          | 0.41     |
| 0.9505        | 40.0  | 1000 | 0.4267          | 0.58     |
| 0.9505        | 41.0  | 1025 | 0.4267          | 0.59     |
| 0.9505        | 42.0  | 1050 | 0.4268          | 0.56     |
| 0.9505        | 43.0  | 1075 | 0.4268          | 0.43     |
| 0.9505        | 44.0  | 1100 | 0.4268          | 0.49     |
| 0.9505        | 45.0  | 1125 | 0.4268          | 0.58     |
| 0.9505        | 46.0  | 1150 | 0.4267          | 0.59     |
| 0.9505        | 47.0  | 1175 | 0.4267          | 0.6      |
| 0.9505        | 48.0  | 1200 | 0.4267          | 0.63     |
| 0.9505        | 49.0  | 1225 | 0.4268          | 0.44     |
| 0.9505        | 50.0  | 1250 | 0.4268          | 0.52     |
| 0.9505        | 51.0  | 1275 | 0.4268          | 0.4      |
| 0.9505        | 52.0  | 1300 | 0.4268          | 0.46     |
| 0.9505        | 53.0  | 1325 | 0.4268          | 0.47     |
| 0.9505        | 54.0  | 1350 | 0.4268          | 0.51     |
| 0.9505        | 55.0  | 1375 | 0.4268          | 0.44     |
| 0.9505        | 56.0  | 1400 | 0.4268          | 0.55     |
| 0.9505        | 57.0  | 1425 | 0.4267          | 0.54     |
| 0.9505        | 58.0  | 1450 | 0.4267          | 0.55     |
| 0.9505        | 59.0  | 1475 | 0.4267          | 0.54     |
| 0.7437        | 60.0  | 1500 | 0.4267          | 0.58     |
| 0.7437        | 61.0  | 1525 | 0.4268          | 0.57     |
| 0.7437        | 62.0  | 1550 | 0.4268          | 0.42     |
| 0.7437        | 63.0  | 1575 | 0.4268          | 0.41     |
| 0.7437        | 64.0  | 1600 | 0.4268          | 0.44     |
| 0.7437        | 65.0  | 1625 | 0.4268          | 0.47     |
| 0.7437        | 66.0  | 1650 | 0.4268          | 0.41     |
| 0.7437        | 67.0  | 1675 | 0.4268          | 0.54     |
| 0.7437        | 68.0  | 1700 | 0.4268          | 0.4      |
| 0.7437        | 69.0  | 1725 | 0.4268          | 0.41     |
| 0.7437        | 70.0  | 1750 | 0.4268          | 0.4      |
| 0.7437        | 71.0  | 1775 | 0.4268          | 0.41     |
| 0.7437        | 72.0  | 1800 | 0.4268          | 0.42     |
| 0.7437        | 73.0  | 1825 | 0.4268          | 0.43     |
| 0.7437        | 74.0  | 1850 | 0.4268          | 0.41     |
| 0.7437        | 75.0  | 1875 | 0.4268          | 0.41     |
| 0.7437        | 76.0  | 1900 | 0.4268          | 0.4      |
| 0.7437        | 77.0  | 1925 | 0.4268          | 0.4      |
| 0.7437        | 78.0  | 1950 | 0.4268          | 0.41     |
| 0.7437        | 79.0  | 1975 | 0.4268          | 0.38     |
| 0.6146        | 80.0  | 2000 | 0.4268          | 0.37     |
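
Validation loss plateaus near 0.4268 from epoch 8 onward while accuracy fluctuates, so re-checking the final checkpoint yourself may be useful. A hedged evaluation sketch follows; the card does not record which super_glue config was used, so "boolq" and its fields below are purely placeholder assumptions:

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230826092050"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# "boolq" is a placeholder: the actual super_glue config and eval split
# used for this card are not recorded.
ds = load_dataset("super_glue", "boolq", split="validation")

correct = 0
for ex in ds:
    inputs = tokenizer(ex["question"], ex["passage"],
                       truncation=True, return_tensors="pt")
    with torch.no_grad():
        pred = model(**inputs).logits.argmax(dim=-1).item()
    correct += int(pred == ex["label"])

print(f"accuracy = {correct / len(ds):.2f}")
```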

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
