20230826093525

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6263
  • Accuracy: 0.44
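
The Hub does not declare a pipeline type for this checkpoint. Below is a minimal usage sketch that assumes it is a sequence-classification fine-tune (consistent with the accuracy metric reported above); the example sentence pair is purely illustrative.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

repo = "dkqjrm/20230826093525"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForSequenceClassification.from_pretrained(repo)

# Illustrative sentence pair; the actual input format depends on the
# (unstated) SuperGLUE task this checkpoint was fine-tuned for.
inputs = tokenizer("The premise sentence.", "The hypothesis sentence.",
                   return_tensors="pt")
logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```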

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
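
The card does not name the specific SuperGLUE task. As a hedged illustration only: the step counts in the results table imply roughly 400 training examples (25 steps × batch size 16 per epoch), and the two-decimal accuracies suggest about 100 evaluation examples, which happens to match the copa config; the choice of "copa" below is a guess, not a documented fact.

```python
from datasets import load_dataset

# "copa" is purely illustrative; the actual config used is not stated.
dataset = load_dataset("super_glue", "copa")
print(dataset)
```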

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
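
For reference, a sketch of the `TrainingArguments` corresponding to the values above (the output directory and the per-epoch evaluation cadence are assumptions, the latter inferred from the results table below):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230826093525",   # assumed
    learning_rate=0.05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",   # assumed from the per-epoch results
)
```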

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 25   | 0.8357          | 0.4      |
| No log        | 2.0   | 50   | 0.6364          | 0.62     |
| No log        | 3.0   | 75   | 0.7513          | 0.62     |
| No log        | 4.0   | 100  | 0.5950          | 0.6      |
| No log        | 5.0   | 125  | 0.6111          | 0.49     |
| No log        | 6.0   | 150  | 0.7314          | 0.59     |
| No log        | 7.0   | 175  | 0.6188          | 0.67     |
| No log        | 8.0   | 200  | 1.2028          | 0.58     |
| No log        | 9.0   | 225  | 0.6303          | 0.71     |
| No log        | 10.0  | 250  | 0.8705          | 0.65     |
| No log        | 11.0  | 275  | 0.5481          | 0.68     |
| No log        | 12.0  | 300  | 0.8700          | 0.7      |
| No log        | 13.0  | 325  | 0.7616          | 0.62     |
| No log        | 14.0  | 350  | 0.7385          | 0.71     |
| No log        | 15.0  | 375  | 0.8501          | 0.55     |
| No log        | 16.0  | 400  | 0.6954          | 0.49     |
| No log        | 17.0  | 425  | 0.6255          | 0.55     |
| No log        | 18.0  | 450  | 0.6264          | 0.38     |
| No log        | 19.0  | 475  | 0.6275          | 0.42     |
| 1.5048        | 20.0  | 500  | 0.6259          | 0.61     |
| 1.5048        | 21.0  | 525  | 0.6270          | 0.42     |
| 1.5048        | 22.0  | 550  | 0.6275          | 0.42     |
| 1.5048        | 23.0  | 575  | 0.6249          | 0.59     |
| 1.5048        | 24.0  | 600  | 0.6269          | 0.4      |
| 1.5048        | 25.0  | 625  | 0.6254          | 0.57     |
| 1.5048        | 26.0  | 650  | 0.6265          | 0.45     |
| 1.5048        | 27.0  | 675  | 0.6262          | 0.62     |
| 1.5048        | 28.0  | 700  | 0.6247          | 0.54     |
| 1.5048        | 29.0  | 725  | 0.6241          | 0.59     |
| 1.5048        | 30.0  | 750  | 0.6247          | 0.56     |
| 1.5048        | 31.0  | 775  | 0.6262          | 0.5      |
| 1.5048        | 32.0  | 800  | 0.6261          | 0.6      |
| 1.5048        | 33.0  | 825  | 0.6261          | 0.55     |
| 1.5048        | 34.0  | 850  | 0.6264          | 0.44     |
| 1.5048        | 35.0  | 875  | 0.6266          | 0.43     |
| 1.5048        | 36.0  | 900  | 0.6265          | 0.44     |
| 1.5048        | 37.0  | 925  | 0.6262          | 0.47     |
| 1.5048        | 38.0  | 950  | 0.6264          | 0.48     |
| 1.5048        | 39.0  | 975  | 0.6264          | 0.43     |
| 1.2203        | 40.0  | 1000 | 0.6262          | 0.63     |
| 1.2203        | 41.0  | 1025 | 0.6263          | 0.53     |
| 1.2203        | 42.0  | 1050 | 0.6262          | 0.59     |
| 1.2203        | 43.0  | 1075 | 0.6265          | 0.38     |
| 1.2203        | 44.0  | 1100 | 0.6262          | 0.61     |
| 1.2203        | 45.0  | 1125 | 0.6262          | 0.64     |
| 1.2203        | 46.0  | 1150 | 0.6263          | 0.5      |
| 1.2203        | 47.0  | 1175 | 0.6262          | 0.6      |
| 1.2203        | 48.0  | 1200 | 0.6263          | 0.55     |
| 1.2203        | 49.0  | 1225 | 0.6265          | 0.39     |
| 1.2203        | 50.0  | 1250 | 0.6262          | 0.62     |
| 1.2203        | 51.0  | 1275 | 0.6262          | 0.51     |
| 1.2203        | 52.0  | 1300 | 0.6261          | 0.57     |
| 1.2203        | 53.0  | 1325 | 0.6262          | 0.58     |
| 1.2203        | 54.0  | 1350 | 0.6261          | 0.58     |
| 1.2203        | 55.0  | 1375 | 0.6260          | 0.61     |
| 1.2203        | 56.0  | 1400 | 0.6261          | 0.64     |
| 1.2203        | 57.0  | 1425 | 0.6263          | 0.41     |
| 1.2203        | 58.0  | 1450 | 0.6264          | 0.41     |
| 1.2203        | 59.0  | 1475 | 0.6263          | 0.45     |
| 0.9516        | 60.0  | 1500 | 0.6263          | 0.54     |
| 0.9516        | 61.0  | 1525 | 0.6263          | 0.47     |
| 0.9516        | 62.0  | 1550 | 0.6261          | 0.61     |
| 0.9516        | 63.0  | 1575 | 0.6263          | 0.59     |
| 0.9516        | 64.0  | 1600 | 0.6261          | 0.63     |
| 0.9516        | 65.0  | 1625 | 0.6263          | 0.5      |
| 0.9516        | 66.0  | 1650 | 0.6265          | 0.39     |
| 0.9516        | 67.0  | 1675 | 0.6262          | 0.59     |
| 0.9516        | 68.0  | 1700 | 0.6264          | 0.38     |
| 0.9516        | 69.0  | 1725 | 0.6262          | 0.59     |
| 0.9516        | 70.0  | 1750 | 0.6263          | 0.51     |
| 0.9516        | 71.0  | 1775 | 0.6261          | 0.6      |
| 0.9516        | 72.0  | 1800 | 0.6263          | 0.4      |
| 0.9516        | 73.0  | 1825 | 0.6262          | 0.6      |
| 0.9516        | 74.0  | 1850 | 0.6263          | 0.48     |
| 0.9516        | 75.0  | 1875 | 0.6262          | 0.62     |
| 0.9516        | 76.0  | 1900 | 0.6263          | 0.44     |
| 0.9516        | 77.0  | 1925 | 0.6263          | 0.43     |
| 0.9516        | 78.0  | 1950 | 0.6263          | 0.45     |
| 0.9516        | 79.0  | 1975 | 0.6263          | 0.42     |
| 0.7734        | 80.0  | 2000 | 0.6263          | 0.44     |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
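
A quick environment check against the versions listed above (assuming the packages are importable under their usual names):

```python
import transformers, torch, datasets, tokenizers

# Expected versions per this card:
print(transformers.__version__)  # 4.26.1
print(torch.__version__)         # 2.0.1+cu118
print(datasets.__version__)      # 2.12.0
print(tokenizers.__version__)    # 0.13.3
```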