20230830203443

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7274
  • Accuracy: 0.5
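
The card does not state which SuperGLUE task the model was fine-tuned on, so the following is only a minimal usage sketch: it assumes a binary sequence-classification head, consistent with the single accuracy metric reported above, and uses the repository id dkqjrm/20230830203443 from the model page.

```python
# Minimal loading/inference sketch. Assumption: the checkpoint carries a
# sequence-classification head; the specific SuperGLUE task is not stated
# in this card, so the example inputs below are placeholders.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230830203443"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("Example premise.", "Example hypothesis.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # class probabilities
```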

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0007
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
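
For reference, here is a sketch of how these values map onto transformers.TrainingArguments. The Adam betas and epsilon listed above are the library defaults, and the per-epoch evaluation setting is an assumption inferred from the one-row-per-epoch results table below; dataset loading and preprocessing are not described in this card, so this is illustrative only.

```python
# Illustrative mapping of the listed hyperparameters onto
# transformers.TrainingArguments (Transformers 4.26.x API). Data
# preparation and the Trainer setup are omitted because the card does
# not describe them.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230830203443",
    learning_rate=7e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    # Adam betas/epsilon are the library defaults, matching the card.
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    # Assumption: evaluation ran once per epoch, as the results table suggests.
    evaluation_strategy="epoch",
)
```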

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 340 | 0.7532 | 0.5 |
| 0.7604 | 2.0 | 680 | 0.7283 | 0.5 |
| 0.7635 | 3.0 | 1020 | 0.7745 | 0.5 |
| 0.7635 | 4.0 | 1360 | 0.8267 | 0.5 |
| 0.7685 | 5.0 | 1700 | 0.7674 | 0.5 |
| 0.7536 | 6.0 | 2040 | 0.7283 | 0.5 |
| 0.7536 | 7.0 | 2380 | 0.7315 | 0.5 |
| 0.7457 | 8.0 | 2720 | 0.7962 | 0.5 |
| 0.7462 | 9.0 | 3060 | 0.7287 | 0.5 |
| 0.7462 | 10.0 | 3400 | 0.7515 | 0.5 |
| 0.7445 | 11.0 | 3740 | 0.7305 | 0.5 |
| 0.7427 | 12.0 | 4080 | 0.7298 | 0.5 |
| 0.7427 | 13.0 | 4420 | 0.8376 | 0.5 |
| 0.7506 | 14.0 | 4760 | 0.7391 | 0.4749 |
| 0.7508 | 15.0 | 5100 | 0.7457 | 0.5 |
| 0.7508 | 16.0 | 5440 | 0.7366 | 0.5 |
| 0.7428 | 17.0 | 5780 | 0.7423 | 0.5 |
| 0.7418 | 18.0 | 6120 | 0.7331 | 0.5 |
| 0.7418 | 19.0 | 6460 | 0.7340 | 0.5 |
| 0.7443 | 20.0 | 6800 | 0.7566 | 0.5 |
| 0.7411 | 21.0 | 7140 | 0.7274 | 0.5 |
| 0.7411 | 22.0 | 7480 | 0.7503 | 0.5 |
| 0.7423 | 23.0 | 7820 | 0.7416 | 0.5 |
| 0.7428 | 24.0 | 8160 | 0.7274 | 0.5 |
| 0.7406 | 25.0 | 8500 | 0.7313 | 0.5 |
| 0.7406 | 26.0 | 8840 | 0.7513 | 0.5 |
| 0.7421 | 27.0 | 9180 | 0.7476 | 0.5 |
| 0.7423 | 28.0 | 9520 | 0.7274 | 0.5 |
| 0.7423 | 29.0 | 9860 | 0.7313 | 0.5 |
| 0.7381 | 30.0 | 10200 | 0.7274 | 0.5 |
| 0.739 | 31.0 | 10540 | 0.7276 | 0.5 |
| 0.739 | 32.0 | 10880 | 0.7727 | 0.5 |
| 0.7392 | 33.0 | 11220 | 0.7287 | 0.5 |
| 0.7389 | 34.0 | 11560 | 0.7376 | 0.5 |
| 0.7389 | 35.0 | 11900 | 0.7278 | 0.5 |
| 0.7391 | 36.0 | 12240 | 0.7296 | 0.5 |
| 0.7369 | 37.0 | 12580 | 0.7307 | 0.5 |
| 0.7369 | 38.0 | 12920 | 0.7304 | 0.5 |
| 0.7391 | 39.0 | 13260 | 0.7358 | 0.5 |
| 0.7366 | 40.0 | 13600 | 0.7298 | 0.5 |
| 0.7366 | 41.0 | 13940 | 0.7284 | 0.5 |
| 0.737 | 42.0 | 14280 | 0.7279 | 0.5 |
| 0.7343 | 43.0 | 14620 | 0.7334 | 0.5 |
| 0.7343 | 44.0 | 14960 | 0.7273 | 0.5 |
| 0.7358 | 45.0 | 15300 | 0.7468 | 0.5 |
| 0.7341 | 46.0 | 15640 | 0.7277 | 0.5 |
| 0.7341 | 47.0 | 15980 | 0.7327 | 0.5 |
| 0.7345 | 48.0 | 16320 | 0.7290 | 0.5 |
| 0.7357 | 49.0 | 16660 | 0.7518 | 0.5 |
| 0.7362 | 50.0 | 17000 | 0.7276 | 0.5 |
| 0.7362 | 51.0 | 17340 | 0.7275 | 0.5 |
| 0.7313 | 52.0 | 17680 | 0.7279 | 0.5 |
| 0.7357 | 53.0 | 18020 | 0.7307 | 0.5 |
| 0.7357 | 54.0 | 18360 | 0.7276 | 0.5 |
| 0.7323 | 55.0 | 18700 | 0.7294 | 0.5 |
| 0.7304 | 56.0 | 19040 | 0.7310 | 0.5 |
| 0.7304 | 57.0 | 19380 | 0.7278 | 0.5 |
| 0.7326 | 58.0 | 19720 | 0.7289 | 0.5 |
| 0.7314 | 59.0 | 20060 | 0.7461 | 0.5 |
| 0.7314 | 60.0 | 20400 | 0.7287 | 0.5 |
| 0.7319 | 61.0 | 20740 | 0.7337 | 0.5 |
| 0.7304 | 62.0 | 21080 | 0.7273 | 0.5 |
| 0.7304 | 63.0 | 21420 | 0.7288 | 0.5 |
| 0.7313 | 64.0 | 21760 | 0.7285 | 0.5 |
| 0.7317 | 65.0 | 22100 | 0.7285 | 0.5 |
| 0.7317 | 66.0 | 22440 | 0.7310 | 0.5 |
| 0.7294 | 67.0 | 22780 | 0.7274 | 0.5 |
| 0.7304 | 68.0 | 23120 | 0.7275 | 0.5 |
| 0.7304 | 69.0 | 23460 | 0.7281 | 0.5 |
| 0.7286 | 70.0 | 23800 | 0.7276 | 0.5 |
| 0.7295 | 71.0 | 24140 | 0.7277 | 0.5 |
| 0.7295 | 72.0 | 24480 | 0.7301 | 0.5 |
| 0.7292 | 73.0 | 24820 | 0.7277 | 0.5 |
| 0.7288 | 74.0 | 25160 | 0.7302 | 0.5 |
| 0.7276 | 75.0 | 25500 | 0.7280 | 0.5 |
| 0.7276 | 76.0 | 25840 | 0.7275 | 0.5 |
| 0.7281 | 77.0 | 26180 | 0.7274 | 0.5 |
| 0.727 | 78.0 | 26520 | 0.7275 | 0.5 |
| 0.727 | 79.0 | 26860 | 0.7275 | 0.5 |
| 0.7279 | 80.0 | 27200 | 0.7274 | 0.5 |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
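
To check that a local environment matches these versions before loading the model, a small sketch follows; the version strings are copied from the list above, and the torch entry carries a CUDA build tag (+cu118) that a local install may not include.

```python
# Sketch: compare installed package versions against those listed in this
# card. Note the torch version includes a +cu118 build tag that may differ
# from a locally installed build.
import importlib.metadata as md

expected = {
    "transformers": "4.26.1",
    "torch": "2.0.1+cu118",
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}
for pkg, want in expected.items():
    have = md.version(pkg)
    status = "OK" if have == want else "MISMATCH"
    print(f"{pkg}: installed {have}, card lists {want} -> {status}")
```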