
20230903015507

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 0.8747
  • Accuracy: 0.6505
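
A minimal inference sketch is shown below. It assumes the checkpoint loads as a sequence-classification model; the card does not name the specific SuperGLUE task, so the input pair is only a placeholder:

```python
# Minimal sketch, assuming a sequence-classification checkpoint. The specific
# SuperGLUE task (and therefore the input format and label meanings) is not
# stated on this card, so the example inputs below are placeholders.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230903015507"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Placeholder sentence pair; replace with inputs matching the training task.
inputs = tokenizer("first text", "second text", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))
```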

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
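
In the absence of details on the card, a minimal sketch of loading SuperGLUE with 🤗 Datasets is given below; "boolq" is an assumed placeholder config, since the card does not state which SuperGLUE task was used:

```python
# Minimal sketch, assuming 🤗 Datasets; "boolq" is a placeholder config,
# since the card does not state which SuperGLUE task was used.
from datasets import load_dataset

dataset = load_dataset("super_glue", "boolq")
print(dataset["train"][0])       # inspect one training example
print(dataset["validation"][0])  # inspect one evaluation example
```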

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of equivalent TrainingArguments follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
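
These settings map directly onto 🤗 TrainingArguments; a minimal sketch is shown below. The output directory is a placeholder, the per-device batch sizes assume single-device training, and model/data wiring is omitted:

```python
# Minimal sketch of the reported hyperparameters as 🤗 TrainingArguments.
# output_dir is a placeholder; per_device_* sizes assume a single device.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",            # placeholder path
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,                    # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
)
```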

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 340   | 0.6715          | 0.5172   |
| 0.6923        | 2.0   | 680   | 0.6802          | 0.5      |
| 0.6863        | 3.0   | 1020  | 0.6721          | 0.5      |
| 0.6863        | 4.0   | 1360  | 0.7046          | 0.5      |
| 0.6843        | 5.0   | 1700  | 0.6757          | 0.5      |
| 0.6885        | 6.0   | 2040  | 0.6788          | 0.5      |
| 0.6885        | 7.0   | 2380  | 0.6702          | 0.5      |
| 0.686         | 8.0   | 2720  | 0.6763          | 0.5      |
| 0.6858        | 9.0   | 3060  | 0.6777          | 0.5      |
| 0.6858        | 10.0  | 3400  | 0.6804          | 0.5      |
| 0.6868        | 11.0  | 3740  | 0.6711          | 0.5      |
| 0.6817        | 12.0  | 4080  | 0.6777          | 0.5      |
| 0.6817        | 13.0  | 4420  | 0.6960          | 0.5      |
| 0.6805        | 14.0  | 4760  | 0.6901          | 0.5      |
| 0.6823        | 15.0  | 5100  | 0.6715          | 0.5      |
| 0.6823        | 16.0  | 5440  | 0.6738          | 0.5016   |
| 0.6776        | 17.0  | 5780  | 0.6813          | 0.5      |
| 0.676         | 18.0  | 6120  | 0.6718          | 0.5      |
| 0.676         | 19.0  | 6460  | 0.6727          | 0.5      |
| 0.6762        | 20.0  | 6800  | 0.6742          | 0.4984   |
| 0.6748        | 21.0  | 7140  | 0.6699          | 0.5282   |
| 0.6748        | 22.0  | 7480  | 0.6624          | 0.5141   |
| 0.6749        | 23.0  | 7820  | 0.7549          | 0.5705   |
| 0.6441        | 24.0  | 8160  | 0.6447          | 0.6238   |
| 0.6189        | 25.0  | 8500  | 0.6692          | 0.6113   |
| 0.6189        | 26.0  | 8840  | 0.6171          | 0.6771   |
| 0.582         | 27.0  | 9180  | 0.7757          | 0.5831   |
| 0.5622        | 28.0  | 9520  | 0.8074          | 0.6050   |
| 0.5622        | 29.0  | 9860  | 0.6636          | 0.6614   |
| 0.5303        | 30.0  | 10200 | 0.7353          | 0.6458   |
| 0.5188        | 31.0  | 10540 | 0.6546          | 0.6536   |
| 0.5188        | 32.0  | 10880 | 0.8451          | 0.6082   |
| 0.5007        | 33.0  | 11220 | 0.7618          | 0.6442   |
| 0.4847        | 34.0  | 11560 | 0.6832          | 0.6583   |
| 0.4847        | 35.0  | 11900 | 0.7070          | 0.6442   |
| 0.4719        | 36.0  | 12240 | 0.6991          | 0.6536   |
| 0.4523        | 37.0  | 12580 | 0.7525          | 0.6661   |
| 0.4523        | 38.0  | 12920 | 0.7912          | 0.6348   |
| 0.4447        | 39.0  | 13260 | 0.7760          | 0.6536   |
| 0.439         | 40.0  | 13600 | 0.8018          | 0.6458   |
| 0.439         | 41.0  | 13940 | 0.7104          | 0.6708   |
| 0.4248        | 42.0  | 14280 | 0.7607          | 0.6599   |
| 0.4063        | 43.0  | 14620 | 0.6979          | 0.6803   |
| 0.4063        | 44.0  | 14960 | 0.7796          | 0.6614   |
| 0.4123        | 45.0  | 15300 | 0.7394          | 0.6708   |
| 0.3984        | 46.0  | 15640 | 0.7791          | 0.6599   |
| 0.3984        | 47.0  | 15980 | 0.7433          | 0.6614   |
| 0.3871        | 48.0  | 16320 | 0.7870          | 0.6442   |
| 0.3787        | 49.0  | 16660 | 0.7256          | 0.6755   |
| 0.3884        | 50.0  | 17000 | 0.8035          | 0.6536   |
| 0.3884        | 51.0  | 17340 | 0.7809          | 0.6489   |
| 0.373         | 52.0  | 17680 | 0.7920          | 0.6567   |
| 0.3704        | 53.0  | 18020 | 0.8107          | 0.6661   |
| 0.3704        | 54.0  | 18360 | 0.8759          | 0.6113   |
| 0.3628        | 55.0  | 18700 | 0.8727          | 0.6332   |
| 0.3518        | 56.0  | 19040 | 0.8756          | 0.6254   |
| 0.3518        | 57.0  | 19380 | 0.8555          | 0.6317   |
| 0.3536        | 58.0  | 19720 | 0.8082          | 0.6254   |
| 0.3504        | 59.0  | 20060 | 0.7880          | 0.6614   |
| 0.3504        | 60.0  | 20400 | 0.9100          | 0.6301   |
| 0.3466        | 61.0  | 20740 | 0.8614          | 0.6207   |
| 0.3425        | 62.0  | 21080 | 0.8712          | 0.6301   |
| 0.3425        | 63.0  | 21420 | 0.8285          | 0.6614   |
| 0.339         | 64.0  | 21760 | 0.9010          | 0.6599   |
| 0.3339        | 65.0  | 22100 | 0.9055          | 0.6426   |
| 0.3339        | 66.0  | 22440 | 0.8365          | 0.6646   |
| 0.3294        | 67.0  | 22780 | 0.8333          | 0.6505   |
| 0.3365        | 68.0  | 23120 | 0.8414          | 0.6426   |
| 0.3365        | 69.0  | 23460 | 0.8855          | 0.6395   |
| 0.332         | 70.0  | 23800 | 0.9028          | 0.6364   |
| 0.3171        | 71.0  | 24140 | 0.8584          | 0.6364   |
| 0.3171        | 72.0  | 24480 | 0.8482          | 0.6536   |
| 0.3204        | 73.0  | 24820 | 0.8713          | 0.6426   |
| 0.3289        | 74.0  | 25160 | 0.8881          | 0.6473   |
| 0.3139        | 75.0  | 25500 | 0.8588          | 0.6473   |
| 0.3139        | 76.0  | 25840 | 0.8772          | 0.6473   |
| 0.3159        | 77.0  | 26180 | 0.9019          | 0.6536   |
| 0.306         | 78.0  | 26520 | 0.8819          | 0.6505   |
| 0.306         | 79.0  | 26860 | 0.8837          | 0.6473   |
| 0.3091        | 80.0  | 27200 | 0.8747          | 0.6505   |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
