
20230903121524

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9097
  • Accuracy: 0.6442
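
For illustration, a minimal loading sketch follows. The card does not state which SuperGLUE task the classification head was trained on or what the labels mean, so the two-label sequence-classification setup and the sentence pair below are assumptions.

```python
# Minimal inference sketch. Assumptions: the checkpoint exposes a two-label
# sequence-classification head, and the sentence pair is purely illustrative.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230903121524"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id).eval()

inputs = tokenizer("An example first sentence.", "An example second sentence.",
                   return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label id (0 or 1)
```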

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
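
The training script itself is not published. As a reproduction aid, the sketch below maps the values listed above onto transformers.TrainingArguments (matching the 4.26.1 API noted under Framework versions); output_dir and the epoch-aligned evaluation strategy are assumptions, not taken from the card.

```python
# Hedged reproduction sketch: the listed hyperparameters expressed as
# transformers.TrainingArguments. output_dir and evaluation_strategy are
# assumptions; the eval rows in the results table align with epoch boundaries.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="out",                  # assumption: not stated in the card
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",       # assumption inferred from the table
)
```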

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 340   | 0.7286          | 0.5      |
| 0.7482        | 2.0   | 680   | 0.7273          | 0.5      |
| 0.7442        | 3.0   | 1020  | 0.7313          | 0.5      |
| 0.7442        | 4.0   | 1360  | 0.7599          | 0.5      |
| 0.7355        | 5.0   | 1700  | 0.7222          | 0.6113   |
| 0.6979        | 6.0   | 2040  | 0.7373          | 0.6160   |
| 0.6979        | 7.0   | 2380  | 0.6950          | 0.6583   |
| 0.6629        | 8.0   | 2720  | 0.6711          | 0.6740   |
| 0.6282        | 9.0   | 3060  | 0.7543          | 0.6599   |
| 0.6282        | 10.0  | 3400  | 0.7217          | 0.6520   |
| 0.6023        | 11.0  | 3740  | 0.7513          | 0.6426   |
| 0.5705        | 12.0  | 4080  | 0.6886          | 0.6693   |
| 0.5705        | 13.0  | 4420  | 0.6779          | 0.6755   |
| 0.5607        | 14.0  | 4760  | 0.7978          | 0.6489   |
| 0.527         | 15.0  | 5100  | 0.6722          | 0.6771   |
| 0.527         | 16.0  | 5440  | 0.8047          | 0.6317   |
| 0.5226        | 17.0  | 5780  | 0.7721          | 0.6740   |
| 0.5133        | 18.0  | 6120  | 0.7900          | 0.6552   |
| 0.5133        | 19.0  | 6460  | 0.7563          | 0.6599   |
| 0.5054        | 20.0  | 6800  | 0.8456          | 0.6411   |
| 0.4836        | 21.0  | 7140  | 0.8232          | 0.6426   |
| 0.4836        | 22.0  | 7480  | 0.7993          | 0.6270   |
| 0.4796        | 23.0  | 7820  | 0.8026          | 0.6426   |
| 0.4659        | 24.0  | 8160  | 0.8306          | 0.6254   |
| 0.4669        | 25.0  | 8500  | 0.8153          | 0.6505   |
| 0.4669        | 26.0  | 8840  | 0.8499          | 0.6489   |
| 0.4487        | 27.0  | 9180  | 0.8366          | 0.6332   |
| 0.4499        | 28.0  | 9520  | 0.7661          | 0.6567   |
| 0.4499        | 29.0  | 9860  | 0.7668          | 0.6630   |
| 0.4483        | 30.0  | 10200 | 0.8147          | 0.6520   |
| 0.4303        | 31.0  | 10540 | 0.8030          | 0.6442   |
| 0.4303        | 32.0  | 10880 | 0.8346          | 0.6285   |
| 0.4272        | 33.0  | 11220 | 0.7779          | 0.6489   |
| 0.43          | 34.0  | 11560 | 0.8193          | 0.6599   |
| 0.43          | 35.0  | 11900 | 0.8792          | 0.6411   |
| 0.4139        | 36.0  | 12240 | 0.8091          | 0.6332   |
| 0.4139        | 37.0  | 12580 | 0.7939          | 0.6458   |
| 0.4139        | 38.0  | 12920 | 0.8626          | 0.6505   |
| 0.4102        | 39.0  | 13260 | 0.8111          | 0.6442   |
| 0.4065        | 40.0  | 13600 | 0.8054          | 0.6583   |
| 0.4065        | 41.0  | 13940 | 0.8704          | 0.6520   |
| 0.4049        | 42.0  | 14280 | 0.8441          | 0.6348   |
| 0.3978        | 43.0  | 14620 | 0.8723          | 0.6411   |
| 0.3978        | 44.0  | 14960 | 0.8747          | 0.6552   |
| 0.4074        | 45.0  | 15300 | 0.8662          | 0.6505   |
| 0.3952        | 46.0  | 15640 | 0.8432          | 0.6442   |
| 0.3952        | 47.0  | 15980 | 0.8837          | 0.6552   |
| 0.3868        | 48.0  | 16320 | 0.8219          | 0.6583   |
| 0.3805        | 49.0  | 16660 | 0.7792          | 0.6536   |
| 0.386         | 50.0  | 17000 | 0.8385          | 0.6520   |
| 0.386         | 51.0  | 17340 | 0.8554          | 0.6505   |
| 0.3869        | 52.0  | 17680 | 0.8655          | 0.6583   |
| 0.3772        | 53.0  | 18020 | 0.8613          | 0.6552   |
| 0.3772        | 54.0  | 18360 | 0.9268          | 0.6364   |
| 0.3744        | 55.0  | 18700 | 0.8710          | 0.6473   |
| 0.378         | 56.0  | 19040 | 0.9222          | 0.6395   |
| 0.378         | 57.0  | 19380 | 0.8803          | 0.6536   |
| 0.3702        | 58.0  | 19720 | 0.9055          | 0.6364   |
| 0.3687        | 59.0  | 20060 | 0.8305          | 0.6630   |
| 0.3687        | 60.0  | 20400 | 0.9229          | 0.6395   |
| 0.3677        | 61.0  | 20740 | 0.9214          | 0.6301   |
| 0.3635        | 62.0  | 21080 | 0.9074          | 0.6458   |
| 0.3635        | 63.0  | 21420 | 0.8890          | 0.6520   |
| 0.3613        | 64.0  | 21760 | 0.8725          | 0.6426   |
| 0.3634        | 65.0  | 22100 | 0.8860          | 0.6489   |
| 0.3634        | 66.0  | 22440 | 0.8428          | 0.6614   |
| 0.3528        | 67.0  | 22780 | 0.8792          | 0.6458   |
| 0.3613        | 68.0  | 23120 | 0.8840          | 0.6254   |
| 0.3613        | 69.0  | 23460 | 0.8960          | 0.6489   |
| 0.3516        | 70.0  | 23800 | 0.8763          | 0.6567   |
| 0.348         | 71.0  | 24140 | 0.8935          | 0.6332   |
| 0.348         | 72.0  | 24480 | 0.9031          | 0.6442   |
| 0.3567        | 73.0  | 24820 | 0.9070          | 0.6458   |
| 0.3514        | 74.0  | 25160 | 0.8997          | 0.6426   |
| 0.3543        | 75.0  | 25500 | 0.9025          | 0.6458   |
| 0.3543        | 76.0  | 25840 | 0.9028          | 0.6379   |
| 0.3457        | 77.0  | 26180 | 0.9155          | 0.6364   |
| 0.3452        | 78.0  | 26520 | 0.8973          | 0.6426   |
| 0.3452        | 79.0  | 26860 | 0.9085          | 0.6458   |
| 0.3379        | 80.0  | 27200 | 0.9097          | 0.6442   |
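
Validation accuracy peaks at 0.6771 (epoch 15) while validation loss climbs through the later epochs, so the epoch-80 checkpoint summarized at the top of this card is likely past its best point. The card also never names the SuperGLUE task: the per-epoch step count (340 steps × batch size 16 ≈ 5,440 examples) is consistent with WiC's 5,428 training rows, but that identification is an assumption. Under that assumption, a minimal evaluation sketch:

```python
# Hedged evaluation sketch. The SuperGLUE task is not stated in the card;
# WiC is assumed here (340 steps/epoch * batch size 16 ~ 5,440 examples
# matches WiC's 5,428 training rows). Column names follow the WiC schema.
import torch
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230903121524"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id).eval()

val = load_dataset("super_glue", "wic", split="validation")
correct = 0
for ex in val:
    inputs = tokenizer(ex["sentence1"], ex["sentence2"],
                       truncation=True, return_tensors="pt")
    with torch.no_grad():
        pred = model(**inputs).logits.argmax(dim=-1).item()
    correct += int(pred == ex["label"])
print(f"accuracy: {correct / len(val):.4f}")
```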

Framework versions

  • Transformers 4.26.1
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
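
Because results can shift across library versions, a quick check of the installed environment against the pins above may be useful. This sketch only reads version strings and assumes the four packages are installed.

```python
# Environment sanity check: compare installed versions against the
# versions listed in this card (4.26.1 / 2.0.1+cu118 / 2.12.0 / 0.13.3).
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.26.1",
    "torch": "2.0.1+cu118",
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    status = "OK" if installed[name] == want else f"differs (found {installed[name]})"
    print(f"{name}: expected {want} -> {status}")
```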