20230830190813

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7333
  • Accuracy: 0.5141
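
As a usage sketch (not part of the original card): the checkpoint id below is taken from this page, but the specific SuperGLUE sub-task, input format, and label mapping are not documented here, so the paired-sentence input and the sequence-classification head are assumptions.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Sketch only: the SuperGLUE sub-task this checkpoint was tuned on is not
# documented, so the two-sentence input format and label meaning are assumptions.
model_id = "dkqjrm/20230830190813"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("First sentence.", "Second sentence.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```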

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
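
For reference, a minimal sketch of how these values map onto transformers.TrainingArguments. The Adam betas and epsilon listed above are the optimizer's defaults; output_dir and the per-epoch evaluation strategy are assumptions (the latter inferred from the per-epoch results table below), not stated in this card.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",          # assumption: not stated in this card
    learning_rate=3e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the library defaults,
    # so no explicit adam_beta1/adam_beta2/adam_epsilon overrides are needed.
    evaluation_strategy="epoch",     # assumption: matches the per-epoch log below
)
```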

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 340   | 0.7313          | 0.5204   |
| 0.7523        | 2.0   | 680   | 0.7285          | 0.5      |
| 0.7461        | 3.0   | 1020  | 0.7229          | 0.5063   |
| 0.7461        | 4.0   | 1360  | 0.7062          | 0.5784   |
| 0.7318        | 5.0   | 1700  | 0.7796          | 0.6034   |
| 0.7057        | 6.0   | 2040  | 0.8194          | 0.5831   |
| 0.7057        | 7.0   | 2380  | 0.7297          | 0.5      |
| 0.7178        | 8.0   | 2720  | 0.7423          | 0.5      |
| 0.7417        | 9.0   | 3060  | 0.7280          | 0.5      |
| 0.7417        | 10.0  | 3400  | 0.7606          | 0.5016   |
| 0.7399        | 11.0  | 3740  | 0.7346          | 0.5172   |
| 0.7334        | 12.0  | 4080  | 0.7411          | 0.5      |
| 0.7334        | 13.0  | 4420  | 0.7588          | 0.5      |
| 0.7332        | 14.0  | 4760  | 0.7427          | 0.4718   |
| 0.7345        | 15.0  | 5100  | 0.7317          | 0.5047   |
| 0.7345        | 16.0  | 5440  | 0.7394          | 0.5031   |
| 0.7308        | 17.0  | 5780  | 0.7445          | 0.5      |
| 0.7295        | 18.0  | 6120  | 0.7517          | 0.4718   |
| 0.7295        | 19.0  | 6460  | 0.7323          | 0.5016   |
| 0.728         | 20.0  | 6800  | 0.7320          | 0.5157   |
| 0.73          | 21.0  | 7140  | 0.7309          | 0.5172   |
| 0.73          | 22.0  | 7480  | 0.7434          | 0.4984   |
| 0.7304        | 23.0  | 7820  | 0.7366          | 0.5094   |
| 0.7298        | 24.0  | 8160  | 0.7334          | 0.5      |
| 0.7283        | 25.0  | 8500  | 0.7342          | 0.5125   |
| 0.7283        | 26.0  | 8840  | 0.7311          | 0.5047   |
| 0.7291        | 27.0  | 9180  | 0.7565          | 0.4702   |
| 0.7292        | 28.0  | 9520  | 0.7282          | 0.5031   |
| 0.7292        | 29.0  | 9860  | 0.7333          | 0.5016   |
| 0.7261        | 30.0  | 10200 | 0.7328          | 0.5125   |
| 0.7279        | 31.0  | 10540 | 0.7349          | 0.5125   |
| 0.7279        | 32.0  | 10880 | 0.7592          | 0.4702   |
| 0.7252        | 33.0  | 11220 | 0.7393          | 0.5094   |
| 0.7263        | 34.0  | 11560 | 0.7394          | 0.5047   |
| 0.7263        | 35.0  | 11900 | 0.7465          | 0.5016   |
| 0.7269        | 36.0  | 12240 | 0.7349          | 0.5141   |
| 0.7263        | 37.0  | 12580 | 0.7295          | 0.5047   |
| 0.7263        | 38.0  | 12920 | 0.7329          | 0.5172   |
| 0.728         | 39.0  | 13260 | 0.7401          | 0.5      |
| 0.7254        | 40.0  | 13600 | 0.7331          | 0.5157   |
| 0.7254        | 41.0  | 13940 | 0.7308          | 0.5172   |
| 0.7265        | 42.0  | 14280 | 0.7312          | 0.5172   |
| 0.7234        | 43.0  | 14620 | 0.7393          | 0.5      |
| 0.7234        | 44.0  | 14960 | 0.7392          | 0.5      |
| 0.7254        | 45.0  | 15300 | 0.7389          | 0.5      |
| 0.7225        | 46.0  | 15640 | 0.7312          | 0.5157   |
| 0.7225        | 47.0  | 15980 | 0.7335          | 0.5      |
| 0.7268        | 48.0  | 16320 | 0.7363          | 0.5016   |
| 0.7258        | 49.0  | 16660 | 0.7393          | 0.5031   |
| 0.7253        | 50.0  | 17000 | 0.7306          | 0.5047   |
| 0.7253        | 51.0  | 17340 | 0.7372          | 0.5094   |
| 0.7247        | 52.0  | 17680 | 0.7402          | 0.5      |
| 0.7248        | 53.0  | 18020 | 0.7355          | 0.5141   |
| 0.7248        | 54.0  | 18360 | 0.7369          | 0.5157   |
| 0.7237        | 55.0  | 18700 | 0.7320          | 0.5141   |
| 0.7226        | 56.0  | 19040 | 0.7366          | 0.5172   |
| 0.7226        | 57.0  | 19380 | 0.7315          | 0.5172   |
| 0.7238        | 58.0  | 19720 | 0.7388          | 0.5016   |
| 0.7228        | 59.0  | 20060 | 0.7347          | 0.5047   |
| 0.7228        | 60.0  | 20400 | 0.7313          | 0.5141   |
| 0.7245        | 61.0  | 20740 | 0.7330          | 0.5141   |
| 0.7222        | 62.0  | 21080 | 0.7350          | 0.5141   |
| 0.7222        | 63.0  | 21420 | 0.7314          | 0.5157   |
| 0.724         | 64.0  | 21760 | 0.7327          | 0.5141   |
| 0.7236        | 65.0  | 22100 | 0.7306          | 0.5172   |
| 0.7236        | 66.0  | 22440 | 0.7351          | 0.5141   |
| 0.7205        | 67.0  | 22780 | 0.7343          | 0.5125   |
| 0.7236        | 68.0  | 23120 | 0.7313          | 0.5157   |
| 0.7236        | 69.0  | 23460 | 0.7338          | 0.5172   |
| 0.7221        | 70.0  | 23800 | 0.7317          | 0.5157   |
| 0.7226        | 71.0  | 24140 | 0.7344          | 0.5141   |
| 0.7226        | 72.0  | 24480 | 0.7342          | 0.5157   |
| 0.7209        | 73.0  | 24820 | 0.7333          | 0.5157   |
| 0.7229        | 74.0  | 25160 | 0.7358          | 0.5141   |
| 0.7204        | 75.0  | 25500 | 0.7342          | 0.5157   |
| 0.7204        | 76.0  | 25840 | 0.7329          | 0.5157   |
| 0.7213        | 77.0  | 26180 | 0.7334          | 0.5141   |
| 0.7208        | 78.0  | 26520 | 0.7335          | 0.5141   |
| 0.7208        | 79.0  | 26860 | 0.7330          | 0.5141   |
| 0.7203        | 80.0  | 27200 | 0.7333          | 0.5141   |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
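
As a small reproducibility sketch (not from the original card), the snippet below checks that the installed library versions match those listed above before running evaluation or further fine-tuning:

```python
# Verify the local environment against the framework versions in this card.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.26.1",
    "torch": "2.0.1+cu118",
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}
actual = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    status = "ok" if actual[name] == want else f"mismatch (got {actual[name]})"
    print(f"{name}=={want}: {status}")
```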