
1_7e-3_1_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9192
  • Accuracy: 0.7364

Model description

More information needed

Intended uses & limitations

More information needed
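
In the absence of documented usage, the sketch below shows one plausible way to load the checkpoint for inference. It assumes the model is published on the Hub as Onutoa/1_7e-3_1_0.1 and carries a sequence-classification head (suggested by the accuracy metric); the specific SuperGLUE subtask is not stated, so the input is purely illustrative.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumption: the checkpoint is hosted as Onutoa/1_7e-3_1_0.1 and exposes a
# classification head; the SuperGLUE subtask it was trained on is undocumented.
model_id = "Onutoa/1_7e-3_1_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Illustrative single-sentence input; many SuperGLUE tasks actually expect
# paired inputs (e.g. premise/hypothesis passed as two text arguments).
inputs = tokenizer("Example input text.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```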

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reconstruction as Trainer arguments follows the list):

  • learning_rate: 0.007
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
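
For reproduction, these values map onto Hugging Face TrainingArguments roughly as follows. This is a sketch reconstructed from the list above, not the published training script; output_dir and the evaluation/logging strategies are assumptions.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_7e-3_1_0.1",       # placeholder name, not from the card
    learning_rate=7e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",     # assumption: the table below reports per-epoch eval
    logging_strategy="epoch",
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the Trainer defaults
# (adam_beta1, adam_beta2, adam_epsilon), so no explicit override is needed.
```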

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 1.1257        | 1.0   | 590   | 1.4503          | 0.3786   |
| 1.0981        | 2.0   | 1180  | 1.8937          | 0.3783   |
| 0.9703        | 3.0   | 1770  | 2.1720          | 0.3783   |
| 0.979         | 4.0   | 2360  | 1.2857          | 0.3783   |
| 0.9202        | 5.0   | 2950  | 0.8853          | 0.6217   |
| 0.8741        | 6.0   | 3540  | 0.7896          | 0.4083   |
| 0.8447        | 7.0   | 4130  | 0.6795          | 0.6226   |
| 0.7643        | 8.0   | 4720  | 0.7802          | 0.6446   |
| 0.7719        | 9.0   | 5310  | 1.1456          | 0.6217   |
| 0.801         | 10.0  | 5900  | 1.0800          | 0.6217   |
| 0.754         | 11.0  | 6490  | 0.5603          | 0.6826   |
| 0.7759        | 12.0  | 7080  | 0.8186          | 0.6462   |
| 0.7224        | 13.0  | 7670  | 0.5755          | 0.6841   |
| 0.7168        | 14.0  | 8260  | 1.2988          | 0.6229   |
| 0.716         | 15.0  | 8850  | 0.7171          | 0.6153   |
| 0.6926        | 16.0  | 9440  | 0.7309          | 0.5960   |
| 0.6362        | 17.0  | 10030 | 0.9837          | 0.5373   |
| 0.6371        | 18.0  | 10620 | 0.5982          | 0.6994   |
| 0.61          | 19.0  | 11210 | 0.6893          | 0.6914   |
| 0.6424        | 20.0  | 11800 | 0.7196          | 0.6398   |
| 0.5784        | 21.0  | 12390 | 0.7087          | 0.6410   |
| 0.5773        | 22.0  | 12980 | 0.8326          | 0.6147   |
| 0.5295        | 23.0  | 13570 | 0.7017          | 0.6538   |
| 0.5214        | 24.0  | 14160 | 0.5868          | 0.7278   |
| 0.527         | 25.0  | 14750 | 0.7043          | 0.7135   |
| 0.5021        | 26.0  | 15340 | 0.7474          | 0.6679   |
| 0.4886        | 27.0  | 15930 | 0.6396          | 0.7217   |
| 0.4509        | 28.0  | 16520 | 0.6120          | 0.7128   |
| 0.4456        | 29.0  | 17110 | 0.6018          | 0.7287   |
| 0.4645        | 30.0  | 17700 | 0.7315          | 0.7318   |
| 0.4423        | 31.0  | 18290 | 0.7017          | 0.7131   |
| 0.4267        | 32.0  | 18880 | 0.6643          | 0.7358   |
| 0.4184        | 33.0  | 19470 | 0.6844          | 0.7040   |
| 0.3699        | 34.0  | 20060 | 0.6415          | 0.7443   |
| 0.3789        | 35.0  | 20650 | 0.6348          | 0.7453   |
| 0.3587        | 36.0  | 21240 | 0.7050          | 0.7352   |
| 0.3696        | 37.0  | 21830 | 0.7278          | 0.7254   |
| 0.3328        | 38.0  | 22420 | 0.7659          | 0.7150   |
| 0.326         | 39.0  | 23010 | 0.7224          | 0.7174   |
| 0.3122        | 40.0  | 23600 | 0.8104          | 0.7006   |
| 0.313         | 41.0  | 24190 | 0.7368          | 0.7398   |
| 0.2977        | 42.0  | 24780 | 0.7630          | 0.7303   |
| 0.2831        | 43.0  | 25370 | 0.7945          | 0.7450   |
| 0.2875        | 44.0  | 25960 | 0.9180          | 0.6869   |
| 0.2603        | 45.0  | 26550 | 0.8352          | 0.7413   |
| 0.2681        | 46.0  | 27140 | 0.7725          | 0.7410   |
| 0.2576        | 47.0  | 27730 | 0.7956          | 0.7235   |
| 0.2548        | 48.0  | 28320 | 0.9659          | 0.6945   |
| 0.2692        | 49.0  | 28910 | 0.7937          | 0.7193   |
| 0.2432        | 50.0  | 29500 | 0.8977          | 0.7361   |
| 0.2501        | 51.0  | 30090 | 0.8996          | 0.7061   |
| 0.2248        | 52.0  | 30680 | 0.7954          | 0.7453   |
| 0.2115        | 53.0  | 31270 | 0.8856          | 0.7450   |
| 0.2238        | 54.0  | 31860 | 0.9348          | 0.7095   |
| 0.2244        | 55.0  | 32450 | 0.9433          | 0.7180   |
| 0.2178        | 56.0  | 33040 | 0.8618          | 0.7229   |
| 0.199         | 57.0  | 33630 | 0.8852          | 0.7125   |
| 0.2022        | 58.0  | 34220 | 0.8382          | 0.7401   |
| 0.1903        | 59.0  | 34810 | 0.9755          | 0.7281   |
| 0.1951        | 60.0  | 35400 | 0.9614          | 0.7269   |
| 0.1782        | 61.0  | 35990 | 0.9009          | 0.7373   |
| 0.1792        | 62.0  | 36580 | 0.9366          | 0.7373   |
| 0.1751        | 63.0  | 37170 | 0.9252          | 0.7327   |
| 0.1743        | 64.0  | 37760 | 0.9330          | 0.7385   |
| 0.167         | 65.0  | 38350 | 0.9401          | 0.7336   |
| 0.1607        | 66.0  | 38940 | 0.9794          | 0.7144   |
| 0.1664        | 67.0  | 39530 | 0.8882          | 0.7407   |
| 0.1523        | 68.0  | 40120 | 0.9037          | 0.7352   |
| 0.1563        | 69.0  | 40710 | 0.9848          | 0.7450   |
| 0.1586        | 70.0  | 41300 | 0.9774          | 0.7379   |
| 0.1518        | 71.0  | 41890 | 1.0685          | 0.7064   |
| 0.1562        | 72.0  | 42480 | 0.8568          | 0.7440   |
| 0.1358        | 73.0  | 43070 | 0.9193          | 0.7339   |
| 0.1449        | 74.0  | 43660 | 0.9681          | 0.7398   |
| 0.1374        | 75.0  | 44250 | 0.9315          | 0.7294   |
| 0.1426        | 76.0  | 44840 | 0.9200          | 0.7349   |
| 0.1385        | 77.0  | 45430 | 1.0396          | 0.7138   |
| 0.1358        | 78.0  | 46020 | 0.8802          | 0.7346   |
| 0.1262        | 79.0  | 46610 | 0.8972          | 0.7404   |
| 0.1285        | 80.0  | 47200 | 0.9058          | 0.7388   |
| 0.1305        | 81.0  | 47790 | 0.9642          | 0.7300   |
| 0.1288        | 82.0  | 48380 | 0.9402          | 0.7450   |
| 0.1229        | 83.0  | 48970 | 0.9702          | 0.7336   |
| 0.1192        | 84.0  | 49560 | 0.9303          | 0.7379   |
| 0.1271        | 85.0  | 50150 | 0.9670          | 0.7187   |
| 0.12          | 86.0  | 50740 | 0.9493          | 0.7410   |
| 0.1174        | 87.0  | 51330 | 0.9612          | 0.7260   |
| 0.1174        | 88.0  | 51920 | 0.9380          | 0.7349   |
| 0.1061        | 89.0  | 52510 | 0.9493          | 0.7309   |
| 0.1137        | 90.0  | 53100 | 0.9143          | 0.7318   |
| 0.1107        | 91.0  | 53690 | 0.8931          | 0.7343   |
| 0.107         | 92.0  | 54280 | 0.9345          | 0.7266   |
| 0.1079        | 93.0  | 54870 | 0.9181          | 0.7379   |
| 0.1107        | 94.0  | 55460 | 0.8965          | 0.7394   |
| 0.1059        | 95.0  | 56050 | 0.9065          | 0.7385   |
| 0.0994        | 96.0  | 56640 | 0.8962          | 0.7404   |
| 0.1116        | 97.0  | 57230 | 0.9197          | 0.7376   |
| 0.1061        | 98.0  | 57820 | 0.9145          | 0.7370   |
| 0.1038        | 99.0  | 58410 | 0.9207          | 0.7349   |
| 0.1002        | 100.0 | 59000 | 0.9192          | 0.7364   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
