
20230826054840

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4136
  • Accuracy: 0.71
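
The card does not state which SuperGLUE task or pipeline type this checkpoint targets, so the snippet below is only a minimal loading sketch; treating the checkpoint as a sequence classifier (AutoModelForSequenceClassification) and the sentence-pair input are assumptions, not facts from the card.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Minimal loading sketch. The card does not specify the SuperGLUE task or
# pipeline type, so loading as a sequence classifier is an assumption.
tokenizer = AutoTokenizer.from_pretrained("dkqjrm/20230826054840")
model = AutoModelForSequenceClassification.from_pretrained("dkqjrm/20230826054840")

# Hypothetical sentence-pair input; the actual input format depends on the task.
inputs = tokenizer("Example premise.", "Example hypothesis.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1))
```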

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.02
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
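
As a rough sketch, these hyperparameters map onto Hugging Face TrainingArguments as follows. The output_dir and the per-epoch evaluation strategy are assumptions (the latter inferred from the per-epoch results table below); the Adam betas and epsilon listed above are the Trainer defaults.

```python
from transformers import TrainingArguments

# Sketch of TrainingArguments matching the hyperparameters above.
# output_dir and evaluation_strategy are assumptions not stated in the card.
training_args = TrainingArguments(
    output_dir="./20230826054840",  # hypothetical output path
    learning_rate=0.02,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    # Adam settings below are the Trainer defaults, matching the card:
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",  # assumed from the per-epoch results table
)
```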

Training results

Training loss is only logged every 500 steps (the Trainer default for logging_steps), so rows before step 500 show "No log" and each logged value repeats until the next logging step.

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 25   | 0.6183          | 0.53     |
| No log        | 2.0   | 50   | 0.4189          | 0.62     |
| No log        | 3.0   | 75   | 0.4351          | 0.6      |
| No log        | 4.0   | 100  | 0.4181          | 0.6      |
| No log        | 5.0   | 125  | 0.4105          | 0.62     |
| No log        | 6.0   | 150  | 0.4140          | 0.63     |
| No log        | 7.0   | 175  | 0.4052          | 0.66     |
| No log        | 8.0   | 200  | 0.4322          | 0.66     |
| No log        | 9.0   | 225  | 0.4364          | 0.41     |
| No log        | 10.0  | 250  | 0.4247          | 0.55     |
| No log        | 11.0  | 275  | 0.4261          | 0.53     |
| No log        | 12.0  | 300  | 0.4176          | 0.6      |
| No log        | 13.0  | 325  | 0.4108          | 0.58     |
| No log        | 14.0  | 350  | 0.4305          | 0.51     |
| No log        | 15.0  | 375  | 0.4064          | 0.61     |
| No log        | 16.0  | 400  | 0.4032          | 0.59     |
| No log        | 17.0  | 425  | 0.4098          | 0.63     |
| No log        | 18.0  | 450  | 0.4132          | 0.61     |
| No log        | 19.0  | 475  | 0.3925          | 0.65     |
| 0.7171        | 20.0  | 500  | 0.3957          | 0.69     |
| 0.7171        | 21.0  | 525  | 0.4292          | 0.64     |
| 0.7171        | 22.0  | 550  | 0.4025          | 0.63     |
| 0.7171        | 23.0  | 575  | 0.3997          | 0.69     |
| 0.7171        | 24.0  | 600  | 0.4115          | 0.62     |
| 0.7171        | 25.0  | 625  | 0.4044          | 0.67     |
| 0.7171        | 26.0  | 650  | 0.4098          | 0.69     |
| 0.7171        | 27.0  | 675  | 0.4051          | 0.65     |
| 0.7171        | 28.0  | 700  | 0.4244          | 0.72     |
| 0.7171        | 29.0  | 725  | 0.4032          | 0.64     |
| 0.7171        | 30.0  | 750  | 0.4136          | 0.7      |
| 0.7171        | 31.0  | 775  | 0.3993          | 0.68     |
| 0.7171        | 32.0  | 800  | 0.4170          | 0.72     |
| 0.7171        | 33.0  | 825  | 0.4038          | 0.71     |
| 0.7171        | 34.0  | 850  | 0.4251          | 0.72     |
| 0.7171        | 35.0  | 875  | 0.4079          | 0.66     |
| 0.7171        | 36.0  | 900  | 0.4119          | 0.71     |
| 0.7171        | 37.0  | 925  | 0.4075          | 0.67     |
| 0.7171        | 38.0  | 950  | 0.4406          | 0.73     |
| 0.7171        | 39.0  | 975  | 0.4081          | 0.72     |
| 0.4731        | 40.0  | 1000 | 0.4191          | 0.67     |
| 0.4731        | 41.0  | 1025 | 0.4217          | 0.68     |
| 0.4731        | 42.0  | 1050 | 0.3983          | 0.73     |
| 0.4731        | 43.0  | 1075 | 0.4092          | 0.66     |
| 0.4731        | 44.0  | 1100 | 0.4248          | 0.69     |
| 0.4731        | 45.0  | 1125 | 0.4218          | 0.68     |
| 0.4731        | 46.0  | 1150 | 0.4371          | 0.7      |
| 0.4731        | 47.0  | 1175 | 0.4099          | 0.69     |
| 0.4731        | 48.0  | 1200 | 0.4300          | 0.69     |
| 0.4731        | 49.0  | 1225 | 0.4094          | 0.72     |
| 0.4731        | 50.0  | 1250 | 0.4206          | 0.71     |
| 0.4731        | 51.0  | 1275 | 0.4241          | 0.72     |
| 0.4731        | 52.0  | 1300 | 0.4253          | 0.66     |
| 0.4731        | 53.0  | 1325 | 0.4117          | 0.66     |
| 0.4731        | 54.0  | 1350 | 0.4174          | 0.67     |
| 0.4731        | 55.0  | 1375 | 0.4131          | 0.67     |
| 0.4731        | 56.0  | 1400 | 0.4231          | 0.67     |
| 0.4731        | 57.0  | 1425 | 0.4059          | 0.7      |
| 0.4731        | 58.0  | 1450 | 0.4168          | 0.72     |
| 0.4731        | 59.0  | 1475 | 0.4236          | 0.68     |
| 0.4204        | 60.0  | 1500 | 0.4001          | 0.68     |
| 0.4204        | 61.0  | 1525 | 0.4158          | 0.71     |
| 0.4204        | 62.0  | 1550 | 0.4303          | 0.68     |
| 0.4204        | 63.0  | 1575 | 0.4155          | 0.65     |
| 0.4204        | 64.0  | 1600 | 0.4195          | 0.66     |
| 0.4204        | 65.0  | 1625 | 0.4315          | 0.67     |
| 0.4204        | 66.0  | 1650 | 0.4240          | 0.71     |
| 0.4204        | 67.0  | 1675 | 0.4191          | 0.68     |
| 0.4204        | 68.0  | 1700 | 0.4214          | 0.71     |
| 0.4204        | 69.0  | 1725 | 0.4170          | 0.71     |
| 0.4204        | 70.0  | 1750 | 0.4158          | 0.68     |
| 0.4204        | 71.0  | 1775 | 0.4230          | 0.69     |
| 0.4204        | 72.0  | 1800 | 0.4106          | 0.69     |
| 0.4204        | 73.0  | 1825 | 0.4255          | 0.68     |
| 0.4204        | 74.0  | 1850 | 0.4223          | 0.67     |
| 0.4204        | 75.0  | 1875 | 0.4124          | 0.7      |
| 0.4204        | 76.0  | 1900 | 0.4114          | 0.7      |
| 0.4204        | 77.0  | 1925 | 0.4115          | 0.71     |
| 0.4204        | 78.0  | 1950 | 0.4136          | 0.71     |
| 0.4204        | 79.0  | 1975 | 0.4150          | 0.71     |
| 0.3939        | 80.0  | 2000 | 0.4136          | 0.71     |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
