
20230826040634

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4165
  • Accuracy: 0.67
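
The SuperGLUE task used for fine-tuning is not recorded here, which is also why the Hub cannot infer a pipeline type. Below is a minimal loading sketch, assuming the checkpoint carries a sequence-classification head and takes a sentence pair as input (both assumptions, not confirmed by this card):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "dkqjrm/20230826040634"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Assumption: the fine-tuning objective was (pair) sequence classification.
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("First sentence.", "Second sentence.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```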

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.01
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
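
The training script itself is not included; as a reference point, here is a sketch of `TrainingArguments` that mirrors the list above. The output directory and the evaluation strategy are assumptions (per-epoch evaluation is inferred from the results table below):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="20230826040634",   # assumed, not recorded in this card
    learning_rate=0.01,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",   # assumed from the per-epoch results below
)
```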

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 25 | 0.6199 | 0.4 |
| No log | 2.0 | 50 | 0.6643 | 0.59 |
| No log | 3.0 | 75 | 0.5067 | 0.54 |
| No log | 4.0 | 100 | 0.4272 | 0.63 |
| No log | 5.0 | 125 | 0.4341 | 0.49 |
| No log | 6.0 | 150 | 0.4488 | 0.44 |
| No log | 7.0 | 175 | 0.4092 | 0.69 |
| No log | 8.0 | 200 | 0.4564 | 0.61 |
| No log | 9.0 | 225 | 0.4367 | 0.6 |
| No log | 10.0 | 250 | 0.4343 | 0.65 |
| No log | 11.0 | 275 | 0.4121 | 0.66 |
| No log | 12.0 | 300 | 0.4300 | 0.64 |
| No log | 13.0 | 325 | 0.4239 | 0.67 |
| No log | 14.0 | 350 | 0.4148 | 0.65 |
| No log | 15.0 | 375 | 0.4311 | 0.67 |
| No log | 16.0 | 400 | 0.4143 | 0.62 |
| No log | 17.0 | 425 | 0.4166 | 0.65 |
| No log | 18.0 | 450 | 0.4120 | 0.63 |
| No log | 19.0 | 475 | 0.4121 | 0.63 |
| 0.6423 | 20.0 | 500 | 0.4066 | 0.67 |
| 0.6423 | 21.0 | 525 | 0.4047 | 0.64 |
| 0.6423 | 22.0 | 550 | 0.4215 | 0.63 |
| 0.6423 | 23.0 | 575 | 0.4074 | 0.61 |
| 0.6423 | 24.0 | 600 | 0.4068 | 0.66 |
| 0.6423 | 25.0 | 625 | 0.4191 | 0.61 |
| 0.6423 | 26.0 | 650 | 0.4035 | 0.6 |
| 0.6423 | 27.0 | 675 | 0.4228 | 0.58 |
| 0.6423 | 28.0 | 700 | 0.4242 | 0.66 |
| 0.6423 | 29.0 | 725 | 0.4238 | 0.64 |
| 0.6423 | 30.0 | 750 | 0.4788 | 0.62 |
| 0.6423 | 31.0 | 775 | 0.4214 | 0.64 |
| 0.6423 | 32.0 | 800 | 0.4283 | 0.63 |
| 0.6423 | 33.0 | 825 | 0.4222 | 0.64 |
| 0.6423 | 34.0 | 850 | 0.4233 | 0.66 |
| 0.6423 | 35.0 | 875 | 0.4401 | 0.67 |
| 0.6423 | 36.0 | 900 | 0.4584 | 0.66 |
| 0.6423 | 37.0 | 925 | 0.4362 | 0.68 |
| 0.6423 | 38.0 | 950 | 0.3989 | 0.67 |
| 0.6423 | 39.0 | 975 | 0.4379 | 0.67 |
| 0.5234 | 40.0 | 1000 | 0.4094 | 0.7 |
| 0.5234 | 41.0 | 1025 | 0.4683 | 0.68 |
| 0.5234 | 42.0 | 1050 | 0.4360 | 0.65 |
| 0.5234 | 43.0 | 1075 | 0.4382 | 0.65 |
| 0.5234 | 44.0 | 1100 | 0.4057 | 0.67 |
| 0.5234 | 45.0 | 1125 | 0.4300 | 0.65 |
| 0.5234 | 46.0 | 1150 | 0.4253 | 0.67 |
| 0.5234 | 47.0 | 1175 | 0.4346 | 0.65 |
| 0.5234 | 48.0 | 1200 | 0.4167 | 0.66 |
| 0.5234 | 49.0 | 1225 | 0.4572 | 0.65 |
| 0.5234 | 50.0 | 1250 | 0.4413 | 0.67 |
| 0.5234 | 51.0 | 1275 | 0.4160 | 0.66 |
| 0.5234 | 52.0 | 1300 | 0.4044 | 0.67 |
| 0.5234 | 53.0 | 1325 | 0.4246 | 0.67 |
| 0.5234 | 54.0 | 1350 | 0.4075 | 0.69 |
| 0.5234 | 55.0 | 1375 | 0.4202 | 0.68 |
| 0.5234 | 56.0 | 1400 | 0.4382 | 0.68 |
| 0.5234 | 57.0 | 1425 | 0.4282 | 0.68 |
| 0.5234 | 58.0 | 1450 | 0.4145 | 0.67 |
| 0.5234 | 59.0 | 1475 | 0.4202 | 0.67 |
| 0.4334 | 60.0 | 1500 | 0.4233 | 0.68 |
| 0.4334 | 61.0 | 1525 | 0.4285 | 0.67 |
| 0.4334 | 62.0 | 1550 | 0.4272 | 0.67 |
| 0.4334 | 63.0 | 1575 | 0.4233 | 0.67 |
| 0.4334 | 64.0 | 1600 | 0.4339 | 0.67 |
| 0.4334 | 65.0 | 1625 | 0.4171 | 0.67 |
| 0.4334 | 66.0 | 1650 | 0.4095 | 0.67 |
| 0.4334 | 67.0 | 1675 | 0.4198 | 0.67 |
| 0.4334 | 68.0 | 1700 | 0.4170 | 0.67 |
| 0.4334 | 69.0 | 1725 | 0.4264 | 0.67 |
| 0.4334 | 70.0 | 1750 | 0.4363 | 0.67 |
| 0.4334 | 71.0 | 1775 | 0.4206 | 0.67 |
| 0.4334 | 72.0 | 1800 | 0.4197 | 0.67 |
| 0.4334 | 73.0 | 1825 | 0.4302 | 0.67 |
| 0.4334 | 74.0 | 1850 | 0.4257 | 0.68 |
| 0.4334 | 75.0 | 1875 | 0.4187 | 0.68 |
| 0.4334 | 76.0 | 1900 | 0.4252 | 0.68 |
| 0.4334 | 77.0 | 1925 | 0.4272 | 0.68 |
| 0.4334 | 78.0 | 1950 | 0.4203 | 0.68 |
| 0.4334 | 79.0 | 1975 | 0.4160 | 0.67 |
| 0.4063 | 80.0 | 2000 | 0.4165 | 0.67 |
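
Validation loss reaches its minimum (0.3989) at epoch 38 and accuracy peaks at 0.7 at epoch 40; both then plateau, and the final checkpoint matches the headline numbers above. (Training loss is only logged every 500 steps, hence the repeated values and the early "No log" entries.) The metric function used is not recorded; below is a sketch of the conventional accuracy computation for a `Trainer`'s `compute_metrics` hook, assuming argmax over logits:

```python
import numpy as np

def compute_metrics(eval_pred):
    # Conventional classification accuracy: argmax over logits vs. gold labels.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"accuracy": float((predictions == labels).mean())}
```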

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
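
If you want to match this environment when reproducing the run, a quick version check (illustrative only):

```python
# Print installed versions to compare against the list above.
import datasets, tokenizers, torch, transformers

for name, mod in [("Transformers", transformers), ("PyTorch", torch),
                  ("Datasets", datasets), ("Tokenizers", tokenizers)]:
    print(name, mod.__version__)
```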