20230826022805

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3895
  • Accuracy: 0.72
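
The card does not declare a pipeline type, so the following is a minimal inference sketch under the assumption that the checkpoint carries a sequence-classification head for a SuperGLUE subtask; the exact subtask, input format, and label names are not documented here and the example inputs are purely illustrative.

```python
# Minimal usage sketch. Assumptions (not confirmed by this card):
# the checkpoint exposes a sequence-classification head, and the
# SuperGLUE subtask takes a sentence pair as input.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230826022805"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer(
    "The cat sat on the mat.",     # first sequence (illustrative)
    "A cat is resting on a mat.",  # second sequence, if the subtask is pair-based
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label index
```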

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.01
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
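
For orientation, here is a minimal sketch of how the settings above map onto a transformers TrainingArguments object. The output directory and the per-epoch evaluation wiring are assumptions (the latter inferred from the one-row-per-epoch results table below), not taken from this card.

```python
# Sketch of TrainingArguments matching the listed hyperparameters.
# output_dir and evaluation_strategy are illustrative assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230826022805",    # assumed output directory
    learning_rate=0.01,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",    # the results table logs one eval per epoch
    # The card lists Adam with betas=(0.9, 0.999) and epsilon=1e-08;
    # these match the Trainer's default optimizer settings.
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```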

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 25 | 0.5651 | 0.49 |
| No log | 2.0 | 50 | 0.5092 | 0.6 |
| No log | 3.0 | 75 | 0.5062 | 0.63 |
| No log | 4.0 | 100 | 0.4844 | 0.64 |
| No log | 5.0 | 125 | 0.4414 | 0.58 |
| No log | 6.0 | 150 | 0.4266 | 0.58 |
| No log | 7.0 | 175 | 0.4180 | 0.62 |
| No log | 8.0 | 200 | 0.4625 | 0.64 |
| No log | 9.0 | 225 | 0.4133 | 0.61 |
| No log | 10.0 | 250 | 0.4282 | 0.63 |
| No log | 11.0 | 275 | 0.4196 | 0.57 |
| No log | 12.0 | 300 | 0.4066 | 0.6 |
| No log | 13.0 | 325 | 0.4009 | 0.62 |
| No log | 14.0 | 350 | 0.3953 | 0.63 |
| No log | 15.0 | 375 | 0.3953 | 0.61 |
| No log | 16.0 | 400 | 0.4115 | 0.64 |
| No log | 17.0 | 425 | 0.3895 | 0.6 |
| No log | 18.0 | 450 | 0.4274 | 0.63 |
| No log | 19.0 | 475 | 0.3997 | 0.64 |
| 0.6183 | 20.0 | 500 | 0.3965 | 0.66 |
| 0.6183 | 21.0 | 525 | 0.4352 | 0.68 |
| 0.6183 | 22.0 | 550 | 0.4253 | 0.69 |
| 0.6183 | 23.0 | 575 | 0.3891 | 0.66 |
| 0.6183 | 24.0 | 600 | 0.4324 | 0.69 |
| 0.6183 | 25.0 | 625 | 0.4396 | 0.73 |
| 0.6183 | 26.0 | 650 | 0.4316 | 0.68 |
| 0.6183 | 27.0 | 675 | 0.3951 | 0.67 |
| 0.6183 | 28.0 | 700 | 0.4022 | 0.68 |
| 0.6183 | 29.0 | 725 | 0.4209 | 0.68 |
| 0.6183 | 30.0 | 750 | 0.4500 | 0.7 |
| 0.6183 | 31.0 | 775 | 0.4072 | 0.71 |
| 0.6183 | 32.0 | 800 | 0.4018 | 0.7 |
| 0.6183 | 33.0 | 825 | 0.4191 | 0.7 |
| 0.6183 | 34.0 | 850 | 0.3971 | 0.71 |
| 0.6183 | 35.0 | 875 | 0.3999 | 0.7 |
| 0.6183 | 36.0 | 900 | 0.4025 | 0.71 |
| 0.6183 | 37.0 | 925 | 0.4091 | 0.71 |
| 0.6183 | 38.0 | 950 | 0.4060 | 0.72 |
| 0.6183 | 39.0 | 975 | 0.4416 | 0.71 |
| 0.4716 | 40.0 | 1000 | 0.4041 | 0.71 |
| 0.4716 | 41.0 | 1025 | 0.4100 | 0.72 |
| 0.4716 | 42.0 | 1050 | 0.4042 | 0.73 |
| 0.4716 | 43.0 | 1075 | 0.3744 | 0.71 |
| 0.4716 | 44.0 | 1100 | 0.3827 | 0.71 |
| 0.4716 | 45.0 | 1125 | 0.3941 | 0.71 |
| 0.4716 | 46.0 | 1150 | 0.4305 | 0.73 |
| 0.4716 | 47.0 | 1175 | 0.4008 | 0.73 |
| 0.4716 | 48.0 | 1200 | 0.4027 | 0.73 |
| 0.4716 | 49.0 | 1225 | 0.4024 | 0.72 |
| 0.4716 | 50.0 | 1250 | 0.3938 | 0.72 |
| 0.4716 | 51.0 | 1275 | 0.3843 | 0.73 |
| 0.4716 | 52.0 | 1300 | 0.3911 | 0.73 |
| 0.4716 | 53.0 | 1325 | 0.3855 | 0.73 |
| 0.4716 | 54.0 | 1350 | 0.3934 | 0.72 |
| 0.4716 | 55.0 | 1375 | 0.4029 | 0.73 |
| 0.4716 | 56.0 | 1400 | 0.3878 | 0.73 |
| 0.4716 | 57.0 | 1425 | 0.3839 | 0.72 |
| 0.4716 | 58.0 | 1450 | 0.3943 | 0.75 |
| 0.4716 | 59.0 | 1475 | 0.3984 | 0.74 |
| 0.4121 | 60.0 | 1500 | 0.4064 | 0.71 |
| 0.4121 | 61.0 | 1525 | 0.3871 | 0.72 |
| 0.4121 | 62.0 | 1550 | 0.4141 | 0.73 |
| 0.4121 | 63.0 | 1575 | 0.3850 | 0.72 |
| 0.4121 | 64.0 | 1600 | 0.3933 | 0.73 |
| 0.4121 | 65.0 | 1625 | 0.4055 | 0.72 |
| 0.4121 | 66.0 | 1650 | 0.3852 | 0.72 |
| 0.4121 | 67.0 | 1675 | 0.3952 | 0.73 |
| 0.4121 | 68.0 | 1700 | 0.3874 | 0.72 |
| 0.4121 | 69.0 | 1725 | 0.3999 | 0.72 |
| 0.4121 | 70.0 | 1750 | 0.3956 | 0.72 |
| 0.4121 | 71.0 | 1775 | 0.3918 | 0.72 |
| 0.4121 | 72.0 | 1800 | 0.3859 | 0.72 |
| 0.4121 | 73.0 | 1825 | 0.3926 | 0.72 |
| 0.4121 | 74.0 | 1850 | 0.3897 | 0.72 |
| 0.4121 | 75.0 | 1875 | 0.3859 | 0.72 |
| 0.4121 | 76.0 | 1900 | 0.3849 | 0.72 |
| 0.4121 | 77.0 | 1925 | 0.3856 | 0.72 |
| 0.4121 | 78.0 | 1950 | 0.3902 | 0.72 |
| 0.4121 | 79.0 | 1975 | 0.3904 | 0.72 |
| 0.3881 | 80.0 | 2000 | 0.3895 | 0.72 |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3