
20230826105341

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4258
  • Accuracy: 0.4
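
The checkpoint can be loaded with the standard Transformers API. Below is a minimal usage sketch, assuming the model carries a sequence-classification head; the card does not state the pipeline type or the SuperGLUE subset, so the sentence-pair input format and label meaning are assumptions, not documented behavior.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Model id taken from this card; the task-specific input format is assumed.
model_id = "dkqjrm/20230826105341"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Hypothetical sentence-pair input, as used by most SuperGLUE tasks.
inputs = tokenizer("premise text", "hypothesis text", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```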

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
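
For reference, these settings map onto the Transformers `TrainingArguments` as sketched below. This is a reconstruction from the list above, not the original training script; the output directory name is an assumption.

```python
from transformers import TrainingArguments

# Reconstructed from the hyperparameter list in this card.
training_args = TrainingArguments(
    output_dir="20230826105341",   # assumed; not stated in the card
    learning_rate=0.05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
)
```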

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 25 | 0.4625 | 0.45 |
| No log | 2.0 | 50 | 0.4859 | 0.61 |
| No log | 3.0 | 75 | 0.4227 | 0.61 |
| No log | 4.0 | 100 | 0.4247 | 0.53 |
| No log | 5.0 | 125 | 0.4481 | 0.43 |
| No log | 6.0 | 150 | 0.4310 | 0.57 |
| No log | 7.0 | 175 | 0.4267 | 0.47 |
| No log | 8.0 | 200 | 0.4246 | 0.5 |
| No log | 9.0 | 225 | 0.4267 | 0.44 |
| No log | 10.0 | 250 | 0.4260 | 0.51 |
| No log | 11.0 | 275 | 0.4226 | 0.52 |
| No log | 12.0 | 300 | 0.4271 | 0.44 |
| No log | 13.0 | 325 | 0.4266 | 0.49 |
| No log | 14.0 | 350 | 0.4244 | 0.58 |
| No log | 15.0 | 375 | 0.4253 | 0.55 |
| No log | 16.0 | 400 | 0.4256 | 0.51 |
| No log | 17.0 | 425 | 0.4265 | 0.44 |
| No log | 18.0 | 450 | 0.4261 | 0.42 |
| No log | 19.0 | 475 | 0.4262 | 0.46 |
| 1.4009 | 20.0 | 500 | 0.4260 | 0.47 |
| 1.4009 | 21.0 | 525 | 0.4285 | 0.42 |
| 1.4009 | 22.0 | 550 | 0.4260 | 0.5 |
| 1.4009 | 23.0 | 575 | 0.4245 | 0.54 |
| 1.4009 | 24.0 | 600 | 0.4251 | 0.54 |
| 1.4009 | 25.0 | 625 | 0.4271 | 0.46 |
| 1.4009 | 26.0 | 650 | 0.4261 | 0.46 |
| 1.4009 | 27.0 | 675 | 0.4257 | 0.49 |
| 1.4009 | 28.0 | 700 | 0.4255 | 0.55 |
| 1.4009 | 29.0 | 725 | 0.4254 | 0.52 |
| 1.4009 | 30.0 | 750 | 0.4260 | 0.52 |
| 1.4009 | 31.0 | 775 | 0.4256 | 0.49 |
| 1.4009 | 32.0 | 800 | 0.4257 | 0.55 |
| 1.4009 | 33.0 | 825 | 0.4255 | 0.53 |
| 1.4009 | 34.0 | 850 | 0.4256 | 0.54 |
| 1.4009 | 35.0 | 875 | 0.4262 | 0.44 |
| 1.4009 | 36.0 | 900 | 0.4257 | 0.51 |
| 1.4009 | 37.0 | 925 | 0.4267 | 0.4 |
| 1.4009 | 38.0 | 950 | 0.4259 | 0.48 |
| 1.4009 | 39.0 | 975 | 0.4255 | 0.55 |
| 0.9833 | 40.0 | 1000 | 0.4254 | 0.49 |
| 0.9833 | 41.0 | 1025 | 0.4257 | 0.49 |
| 0.9833 | 42.0 | 1050 | 0.4254 | 0.58 |
| 0.9833 | 43.0 | 1075 | 0.4261 | 0.48 |
| 0.9833 | 44.0 | 1100 | 0.4260 | 0.5 |
| 0.9833 | 45.0 | 1125 | 0.4257 | 0.51 |
| 0.9833 | 46.0 | 1150 | 0.4254 | 0.52 |
| 0.9833 | 47.0 | 1175 | 0.4255 | 0.5 |
| 0.9833 | 48.0 | 1200 | 0.4257 | 0.48 |
| 0.9833 | 49.0 | 1225 | 0.4261 | 0.41 |
| 0.9833 | 50.0 | 1250 | 0.4251 | 0.57 |
| 0.9833 | 51.0 | 1275 | 0.4258 | 0.47 |
| 0.9833 | 52.0 | 1300 | 0.4255 | 0.52 |
| 0.9833 | 53.0 | 1325 | 0.4257 | 0.53 |
| 0.9833 | 54.0 | 1350 | 0.4256 | 0.52 |
| 0.9833 | 55.0 | 1375 | 0.4257 | 0.51 |
| 0.9833 | 56.0 | 1400 | 0.4257 | 0.5 |
| 0.9833 | 57.0 | 1425 | 0.4257 | 0.49 |
| 0.9833 | 58.0 | 1450 | 0.4257 | 0.51 |
| 0.9833 | 59.0 | 1475 | 0.4255 | 0.57 |
| 0.7428 | 60.0 | 1500 | 0.4259 | 0.46 |
| 0.7428 | 61.0 | 1525 | 0.4257 | 0.51 |
| 0.7428 | 62.0 | 1550 | 0.4255 | 0.55 |
| 0.7428 | 63.0 | 1575 | 0.4256 | 0.55 |
| 0.7428 | 64.0 | 1600 | 0.4258 | 0.4 |
| 0.7428 | 65.0 | 1625 | 0.4258 | 0.44 |
| 0.7428 | 66.0 | 1650 | 0.4259 | 0.41 |
| 0.7428 | 67.0 | 1675 | 0.4260 | 0.38 |
| 0.7428 | 68.0 | 1700 | 0.4257 | 0.52 |
| 0.7428 | 69.0 | 1725 | 0.4259 | 0.35 |
| 0.7428 | 70.0 | 1750 | 0.4259 | 0.38 |
| 0.7428 | 71.0 | 1775 | 0.4259 | 0.44 |
| 0.7428 | 72.0 | 1800 | 0.4260 | 0.41 |
| 0.7428 | 73.0 | 1825 | 0.4257 | 0.45 |
| 0.7428 | 74.0 | 1850 | 0.4258 | 0.42 |
| 0.7428 | 75.0 | 1875 | 0.4258 | 0.41 |
| 0.7428 | 76.0 | 1900 | 0.4258 | 0.4 |
| 0.7428 | 77.0 | 1925 | 0.4258 | 0.45 |
| 0.7428 | 78.0 | 1950 | 0.4258 | 0.43 |
| 0.7428 | 79.0 | 1975 | 0.4258 | 0.44 |
| 0.6138 | 80.0 | 2000 | 0.4258 | 0.4 |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3