1_7e-3_5_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7994
  • Accuracy: 0.7590
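
For quick experimentation, the checkpoint can be loaded with the standard Transformers classes. The sketch below is illustrative only: it assumes the model is published under the repo id `Onutoa/1_7e-3_5_0.5` and exposes a sequence-classification head. The card does not state which SuperGLUE task the model was trained on, so the question/passage input format is an assumption.

```python
# Minimal usage sketch. Assumptions: the checkpoint lives at
# "Onutoa/1_7e-3_5_0.5" and carries a sequence-classification head;
# the exact SuperGLUE task (and hence the input format) is not stated
# in this card, so the question/passage pairing is illustrative.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/1_7e-3_5_0.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

question = "Is the sky blue?"
passage = "The sky appears blue because of Rayleigh scattering."
inputs = tokenizer(question, passage, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

pred = logits.argmax(dim=-1).item()
print(model.config.id2label.get(pred, pred))
```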

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a reproduction sketch follows the list:

  • learning_rate: 0.007
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
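
These settings map onto the Hugging Face `Trainer` API roughly as shown below. This is a reproduction sketch under stated assumptions, not the author's actual script: the SuperGLUE task, preprocessing, and metric computation are not documented in this card, so the `boolq` configuration and its column names are placeholders.

```python
# Reproduction sketch of the hyperparameters listed above, using the
# Hugging Face Trainer. The specific SuperGLUE task is not stated in
# this card; "boolq" and its column names are placeholder assumptions.
import numpy as np
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
    set_seed,
)

set_seed(11)  # seed: 11

dataset = load_dataset("super_glue", "boolq")  # task choice is an assumption
tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased")

def tokenize(batch):
    # Column names depend on the chosen task; these match BoolQ.
    return tokenizer(batch["question"], batch["passage"], truncation=True)

dataset = dataset.map(tokenize, batched=True)
model = AutoModelForSequenceClassification.from_pretrained("bert-large-uncased")

def compute_metrics(eval_pred):
    # Accuracy, matching the metric reported in the results table.
    preds = np.argmax(eval_pred.predictions, axis=-1)
    return {"accuracy": (preds == eval_pred.label_ids).mean()}

args = TrainingArguments(
    output_dir="1_7e-3_5_0.5",
    learning_rate=7e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    num_train_epochs=100.0,
    lr_scheduler_type="linear",
    seed=11,
    evaluation_strategy="epoch",  # the results table reports per-epoch eval
    # The Trainer's default optimizer already matches the listed Adam
    # settings: betas=(0.9, 0.999), epsilon=1e-08.
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
trainer.train()
```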

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 2.4468 | 1.0 | 590 | 2.2373 | 0.6183 |
| 2.615 | 2.0 | 1180 | 1.8655 | 0.5557 |
| 2.2782 | 3.0 | 1770 | 1.8976 | 0.5260 |
| 1.7962 | 4.0 | 2360 | 1.7110 | 0.5746 |
| 1.6241 | 5.0 | 2950 | 1.4946 | 0.6801 |
| 1.4269 | 6.0 | 3540 | 1.3572 | 0.6972 |
| 1.4106 | 7.0 | 4130 | 1.3887 | 0.6394 |
| 1.3024 | 8.0 | 4720 | 1.2780 | 0.6966 |
| 1.2769 | 9.0 | 5310 | 1.1492 | 0.6896 |
| 1.1959 | 10.0 | 5900 | 1.4278 | 0.6936 |
| 1.1842 | 11.0 | 6490 | 1.0641 | 0.7156 |
| 1.103 | 12.0 | 7080 | 1.0075 | 0.7232 |
| 1.0823 | 13.0 | 7670 | 1.0099 | 0.7086 |
| 1.0542 | 14.0 | 8260 | 1.0171 | 0.7294 |
| 1.0489 | 15.0 | 8850 | 0.9553 | 0.7297 |
| 1.0048 | 16.0 | 9440 | 0.9329 | 0.7336 |
| 0.9169 | 17.0 | 10030 | 0.9543 | 0.7321 |
| 0.9179 | 18.0 | 10620 | 0.9167 | 0.7327 |
| 0.8928 | 19.0 | 11210 | 0.9433 | 0.7404 |
| 0.8929 | 20.0 | 11800 | 1.0377 | 0.7346 |
| 0.8262 | 21.0 | 12390 | 0.8871 | 0.7440 |
| 0.8508 | 22.0 | 12980 | 0.9002 | 0.7434 |
| 0.8101 | 23.0 | 13570 | 0.8907 | 0.7471 |
| 0.7787 | 24.0 | 14160 | 0.8993 | 0.7471 |
| 0.7706 | 25.0 | 14750 | 0.8341 | 0.7440 |
| 0.7485 | 26.0 | 15340 | 0.8837 | 0.7376 |
| 0.7498 | 27.0 | 15930 | 0.8711 | 0.7385 |
| 0.7175 | 28.0 | 16520 | 0.9197 | 0.7495 |
| 0.7034 | 29.0 | 17110 | 0.8367 | 0.7434 |
| 0.685 | 30.0 | 17700 | 0.8322 | 0.7459 |
| 0.6718 | 31.0 | 18290 | 0.8840 | 0.7474 |
| 0.6746 | 32.0 | 18880 | 0.8978 | 0.7492 |
| 0.6579 | 33.0 | 19470 | 0.8499 | 0.7456 |
| 0.6305 | 34.0 | 20060 | 0.8291 | 0.7480 |
| 0.6316 | 35.0 | 20650 | 0.8555 | 0.7385 |
| 0.6198 | 36.0 | 21240 | 0.8694 | 0.7557 |
| 0.616 | 37.0 | 21830 | 0.8268 | 0.7599 |
| 0.6331 | 38.0 | 22420 | 0.8227 | 0.7505 |
| 0.6077 | 39.0 | 23010 | 0.9053 | 0.7554 |
| 0.5947 | 40.0 | 23600 | 0.9019 | 0.7554 |
| 0.5773 | 41.0 | 24190 | 0.8128 | 0.7584 |
| 0.57 | 42.0 | 24780 | 0.8028 | 0.7609 |
| 0.5686 | 43.0 | 25370 | 0.8444 | 0.7621 |
| 0.564 | 44.0 | 25960 | 0.8285 | 0.7459 |
| 0.5584 | 45.0 | 26550 | 0.8303 | 0.7544 |
| 0.5408 | 46.0 | 27140 | 0.8650 | 0.7560 |
| 0.54 | 47.0 | 27730 | 0.8684 | 0.7370 |
| 0.528 | 48.0 | 28320 | 0.8171 | 0.7581 |
| 0.5499 | 49.0 | 28910 | 0.8792 | 0.7550 |
| 0.5295 | 50.0 | 29500 | 0.8192 | 0.7578 |
| 0.5138 | 51.0 | 30090 | 0.8493 | 0.7578 |
| 0.516 | 52.0 | 30680 | 0.8111 | 0.7581 |
| 0.5066 | 53.0 | 31270 | 0.8026 | 0.7514 |
| 0.5061 | 54.0 | 31860 | 0.8134 | 0.7609 |
| 0.5061 | 55.0 | 32450 | 0.8229 | 0.7618 |
| 0.4903 | 56.0 | 33040 | 0.8253 | 0.7590 |
| 0.4876 | 57.0 | 33630 | 0.8467 | 0.7596 |
| 0.4842 | 58.0 | 34220 | 0.8295 | 0.7566 |
| 0.4743 | 59.0 | 34810 | 0.8587 | 0.7385 |
| 0.484 | 60.0 | 35400 | 0.7973 | 0.7550 |
| 0.4686 | 61.0 | 35990 | 0.8244 | 0.7593 |
| 0.4734 | 62.0 | 36580 | 0.8127 | 0.7615 |
| 0.4655 | 63.0 | 37170 | 0.8271 | 0.7529 |
| 0.457 | 64.0 | 37760 | 0.7995 | 0.7544 |
| 0.4643 | 65.0 | 38350 | 0.8315 | 0.7642 |
| 0.4535 | 66.0 | 38940 | 0.8044 | 0.7575 |
| 0.4445 | 67.0 | 39530 | 0.8785 | 0.7602 |
| 0.4546 | 68.0 | 40120 | 0.7933 | 0.7587 |
| 0.4427 | 69.0 | 40710 | 0.8548 | 0.7602 |
| 0.4441 | 70.0 | 41300 | 0.8274 | 0.7627 |
| 0.4514 | 71.0 | 41890 | 0.7980 | 0.7495 |
| 0.4468 | 72.0 | 42480 | 0.8562 | 0.7572 |
| 0.415 | 73.0 | 43070 | 0.8126 | 0.7636 |
| 0.4225 | 74.0 | 43660 | 0.8120 | 0.7596 |
| 0.4372 | 75.0 | 44250 | 0.8545 | 0.7602 |
| 0.4295 | 76.0 | 44840 | 0.8148 | 0.7462 |
| 0.4351 | 77.0 | 45430 | 0.8043 | 0.7642 |
| 0.4379 | 78.0 | 46020 | 0.7927 | 0.7621 |
| 0.4282 | 79.0 | 46610 | 0.7931 | 0.7624 |
| 0.4169 | 80.0 | 47200 | 0.8081 | 0.7596 |
| 0.4142 | 81.0 | 47790 | 0.8231 | 0.7602 |
| 0.4149 | 82.0 | 48380 | 0.8266 | 0.7602 |
| 0.409 | 83.0 | 48970 | 0.8020 | 0.7593 |
| 0.4084 | 84.0 | 49560 | 0.8396 | 0.7621 |
| 0.4012 | 85.0 | 50150 | 0.8049 | 0.7606 |
| 0.4056 | 86.0 | 50740 | 0.7971 | 0.7566 |
| 0.3991 | 87.0 | 51330 | 0.8462 | 0.7599 |
| 0.4019 | 88.0 | 51920 | 0.8056 | 0.7569 |
| 0.394 | 89.0 | 52510 | 0.8047 | 0.7554 |
| 0.3985 | 90.0 | 53100 | 0.8150 | 0.7609 |
| 0.3978 | 91.0 | 53690 | 0.8178 | 0.7606 |
| 0.4036 | 92.0 | 54280 | 0.7915 | 0.7560 |
| 0.3859 | 93.0 | 54870 | 0.8072 | 0.7599 |
| 0.4053 | 94.0 | 55460 | 0.8112 | 0.7606 |
| 0.3889 | 95.0 | 56050 | 0.8010 | 0.7587 |
| 0.3866 | 96.0 | 56640 | 0.8017 | 0.7578 |
| 0.3806 | 97.0 | 57230 | 0.7965 | 0.7584 |
| 0.3816 | 98.0 | 57820 | 0.7979 | 0.7590 |
| 0.3791 | 99.0 | 58410 | 0.7982 | 0.7575 |
| 0.3782 | 100.0 | 59000 | 0.7994 | 0.7590 |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
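
A quick way to confirm an environment matches the versions above (a minimal check, assuming the four packages are installed):

```python
# Print installed versions to compare against those listed above.
import datasets
import tokenizers
import torch
import transformers

print("Transformers:", transformers.__version__)  # expected: 4.30.0
print("Pytorch:", torch.__version__)              # expected: 2.0.1+cu117
print("Datasets:", datasets.__version__)          # expected: 2.14.4
print("Tokenizers:", tokenizers.__version__)      # expected: 0.13.3
```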