---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- super_glue
metrics:
- accuracy
model-index:
- name: '20230903070300'
  results: []
---

# 20230903070300

This model is a fine-tuned version of [bert-large-cased](https://huggingface.co/bert-large-cased) on the super_glue dataset. It achieves the following results on the evaluation set:

- Loss: 0.8203
- Accuracy: 0.6599
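
As a quick start, here is a minimal inference sketch. The repository id `dkqjrm/20230903070300` is inferred from the uploader and card name, and `AutoModelForSequenceClassification` is an assumption, since the card does not state the task head:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Repository id inferred from the card; adjust if the checkpoint lives elsewhere.
model_id = "dkqjrm/20230903070300"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Illustrative sentence-pair input; the exact SuperGLUE subtask (and thus the
# expected input format) is not documented on this card.
inputs = tokenizer("The sky is blue.", "Is the sky blue?", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```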

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
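
The card names only the super_glue dataset collection. A minimal sketch of loading one of its configurations with the `datasets` library follows; the `boolq` subtask is an illustrative assumption, not confirmed by the card:

```python
from datasets import load_dataset

# "boolq" is an illustrative SuperGLUE configuration; the card does not say
# which subtask this model was actually trained on.
dataset = load_dataset("super_glue", "boolq")
print(dataset)              # splits and sizes
print(dataset["train"][0])  # one raw example
```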

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 8
- seed: 11
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 80.0
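
For reference, a minimal `TrainingArguments` sketch mirroring the list above; `output_dir` and the per-epoch evaluation cadence are assumptions, and the Adam betas/epsilon match the library defaults, so they are not set explicitly:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230903070300",  # assumption: not stated on the card
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",  # assumption, based on the per-epoch results table
)
```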

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 340   | 0.7251          | 0.5063   |
| 0.7449        | 2.0   | 680   | 0.7348          | 0.5      |
| 0.7388        | 3.0   | 1020  | 0.7304          | 0.5      |
| 0.7388        | 4.0   | 1360  | 0.7639          | 0.5      |
| 0.7384        | 5.0   | 1700  | 0.7316          | 0.5      |
| 0.7376        | 6.0   | 2040  | 0.7268          | 0.5      |
| 0.7376        | 7.0   | 2380  | 0.7263          | 0.5      |
| 0.7328        | 8.0   | 2720  | 0.7333          | 0.5      |
| 0.7266        | 9.0   | 3060  | 0.7533          | 0.5      |
| 0.7266        | 10.0  | 3400  | 0.7247          | 0.4984   |
| 0.7293        | 11.0  | 3740  | 0.7290          | 0.5172   |
| 0.7248        | 12.0  | 4080  | 0.7539          | 0.5      |
| 0.7248        | 13.0  | 4420  | 0.7395          | 0.5      |
| 0.7255        | 14.0  | 4760  | 0.7360          | 0.5031   |
| 0.7271        | 15.0  | 5100  | 0.7278          | 0.5      |
| 0.7271        | 16.0  | 5440  | 0.7314          | 0.5094   |
| 0.7265        | 17.0  | 5780  | 0.7417          | 0.4984   |
| 0.724         | 18.0  | 6120  | 0.7263          | 0.5      |
| 0.724         | 19.0  | 6460  | 0.7272          | 0.5031   |
| 0.723         | 20.0  | 6800  | 0.7283          | 0.5172   |
| 0.7254        | 21.0  | 7140  | 0.7284          | 0.5047   |
| 0.7254        | 22.0  | 7480  | 0.7346          | 0.4984   |
| 0.7254        | 23.0  | 7820  | 0.7295          | 0.5125   |
| 0.7259        | 24.0  | 8160  | 0.7322          | 0.5047   |
| 0.7235        | 25.0  | 8500  | 0.7327          | 0.5172   |
| 0.7235        | 26.0  | 8840  | 0.7300          | 0.5172   |
| 0.7241        | 27.0  | 9180  | 0.7345          | 0.5016   |
| 0.7227        | 28.0  | 9520  | 0.7263          | 0.5172   |
| 0.7227        | 29.0  | 9860  | 0.7341          | 0.5016   |
| 0.7212        | 30.0  | 10200 | 0.7302          | 0.5125   |
| 0.7226        | 31.0  | 10540 | 0.7346          | 0.5078   |
| 0.7226        | 32.0  | 10880 | 0.7606          | 0.4702   |
| 0.7195        | 33.0  | 11220 | 0.7357          | 0.5063   |
| 0.7226        | 34.0  | 11560 | 0.7356          | 0.5031   |
| 0.7226        | 35.0  | 11900 | 0.7397          | 0.5063   |
| 0.7224        | 36.0  | 12240 | 0.7340          | 0.5157   |
| 0.7216        | 37.0  | 12580 | 0.7319          | 0.5047   |
| 0.7216        | 38.0  | 12920 | 0.7298          | 0.5141   |
| 0.7225        | 39.0  | 13260 | 0.7438          | 0.5016   |
| 0.7197        | 40.0  | 13600 | 0.7306          | 0.5047   |
| 0.7197        | 41.0  | 13940 | 0.7279          | 0.5125   |
| 0.7206        | 42.0  | 14280 | 0.7181          | 0.5502   |
| 0.7079        | 43.0  | 14620 | 0.7566          | 0.5862   |
| 0.7079        | 44.0  | 14960 | 0.7480          | 0.6254   |
| 0.6794        | 45.0  | 15300 | 0.6922          | 0.6630   |
| 0.6556        | 46.0  | 15640 | 0.7232          | 0.6223   |
| 0.6556        | 47.0  | 15980 | 0.6961          | 0.6458   |
| 0.6438        | 48.0  | 16320 | 0.7193          | 0.6458   |
| 0.6249        | 49.0  | 16660 | 0.6663          | 0.6693   |
| 0.6117        | 50.0  | 17000 | 0.8045          | 0.6191   |
| 0.6117        | 51.0  | 17340 | 0.6984          | 0.6630   |
| 0.5961        | 52.0  | 17680 | 0.6973          | 0.6646   |
| 0.5831        | 53.0  | 18020 | 0.7606          | 0.6348   |
| 0.5831        | 54.0  | 18360 | 0.7159          | 0.6614   |
| 0.5624        | 55.0  | 18700 | 0.7947          | 0.6426   |
| 0.558         | 56.0  | 19040 | 0.8629          | 0.6238   |
| 0.558         | 57.0  | 19380 | 0.7299          | 0.6646   |
| 0.5461        | 58.0  | 19720 | 0.7642          | 0.6411   |
| 0.5322        | 59.0  | 20060 | 0.7357          | 0.6661   |
| 0.5322        | 60.0  | 20400 | 0.8926          | 0.6191   |
| 0.5253        | 61.0  | 20740 | 0.7845          | 0.6348   |
| 0.5193        | 62.0  | 21080 | 0.7580          | 0.6614   |
| 0.5193        | 63.0  | 21420 | 0.7705          | 0.6505   |
| 0.5169        | 64.0  | 21760 | 0.8464          | 0.6458   |
| 0.5021        | 65.0  | 22100 | 0.8002          | 0.6536   |
| 0.5021        | 66.0  | 22440 | 0.7595          | 0.6677   |
| 0.487         | 67.0  | 22780 | 0.7971          | 0.6458   |
| 0.4977        | 68.0  | 23120 | 0.8245          | 0.6270   |
| 0.4977        | 69.0  | 23460 | 0.8225          | 0.6379   |
| 0.4822        | 70.0  | 23800 | 0.8323          | 0.6364   |
| 0.4802        | 71.0  | 24140 | 0.8205          | 0.6364   |
| 0.4802        | 72.0  | 24480 | 0.8086          | 0.6520   |
| 0.4779        | 73.0  | 24820 | 0.7994          | 0.6567   |
| 0.4801        | 74.0  | 25160 | 0.8206          | 0.6520   |
| 0.4706        | 75.0  | 25500 | 0.8035          | 0.6442   |
| 0.4706        | 76.0  | 25840 | 0.8213          | 0.6364   |
| 0.4738        | 77.0  | 26180 | 0.8128          | 0.6630   |
| 0.4687        | 78.0  | 26520 | 0.8068          | 0.6567   |
| 0.4687        | 79.0  | 26860 | 0.8098          | 0.6630   |
| 0.4598        | 80.0  | 27200 | 0.8203          | 0.6599   |

### Framework versions

- Transformers 4.26.1
- Pytorch 2.0.1+cu118
- Datasets 2.12.0
- Tokenizers 0.13.3