1_7e-3_5_0.9

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set (these match the final epoch, epoch 100, in the training results):

  • Loss: 0.8502
  • Accuracy: 0.7459

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.007
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
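
As a rough illustration of how these settings interact, the sketch below computes the learning rate under a linear decay schedule. It assumes zero warmup steps (the card lists none) and takes the total of 59000 optimizer steps from the last row of the training results; the function name `linear_lr` is hypothetical and not part of any library.

```python
# Sketch of a linear learning-rate decay, assuming no warmup steps.
BASE_LR = 0.007       # learning_rate from the card
TOTAL_STEPS = 59000   # final step in the training results (100 epochs x 590 steps)
STEPS_PER_EPOCH = 590


def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate after `step` optimizer steps, decaying linearly to zero."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


if __name__ == "__main__":
    for epoch in (0, 25, 50, 75, 100):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:3d}  step {step:5d}  lr {linear_lr(step):.5f}")
```

The 590 steps per epoch at train_batch_size 16 also imply at most 590 × 16 = 9440 training examples per epoch, assuming a single device and no gradient accumulation.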

Training results

Training Loss  Epoch  Step   Validation Loss  Accuracy
3.7211         1.0    590    2.9812           0.6211
3.5848         2.0    1180   2.9510           0.4997
3.6107         3.0    1770   2.7782           0.5235
3.0177         4.0    2360   2.2450           0.6483
2.8029         5.0    2950   2.7490           0.6462
2.6357         6.0    3540   1.8031           0.6554
2.6215         7.0    4130   2.3281           0.5838
2.329          8.0    4720   2.0869           0.6862
2.2143         9.0    5310   1.9625           0.6257
2.2128         10.0   5900   2.0803           0.6859
2.0857         11.0   6490   1.4649           0.6972
1.9328         12.0   7080   1.8434           0.6945
1.8594         13.0   7670   1.4225           0.6765
1.9315         14.0   8260   1.5322           0.7156
1.9249         15.0   8850   1.4720           0.7162
1.7274         16.0   9440   1.6171           0.6547
1.5474         17.0   10030  1.1592           0.7153
1.5032         18.0   10620  1.3276           0.7205
1.5738         19.0   11210  1.4631           0.6786
1.6749         20.0   11800  1.9620           0.6266
1.4133         21.0   12390  1.0952           0.7245
1.3552         22.0   12980  1.2053           0.7015
1.4104         23.0   13570  1.2010           0.7110
1.3108         24.0   14160  1.0470           0.7309
1.3339         25.0   14750  1.4671           0.7333
1.2143         26.0   15340  1.2387           0.6963
1.2473         27.0   15930  1.2540           0.7355
1.2602         28.0   16520  1.0843           0.7205
1.1832         29.0   17110  1.4378           0.6795
1.0999         30.0   17700  1.6722           0.7321
1.0803         31.0   18290  1.8755           0.7131
1.1358         32.0   18880  0.9925           0.7428
1.0867         33.0   19470  1.1163           0.7450
1.0661         34.0   20060  1.1009           0.7483
1.0572         35.0   20650  0.9747           0.7306
0.987          36.0   21240  1.1560           0.7440
1.0077         37.0   21830  1.0074           0.7086
0.9957         38.0   22420  0.9483           0.7291
0.9444         39.0   23010  1.0395           0.7248
0.9516         40.0   23600  1.0121           0.7315
0.9195         41.0   24190  0.9376           0.7398
0.9188         42.0   24780  1.1039           0.7135
0.9049         43.0   25370  1.2491           0.7391
0.9134         44.0   25960  0.9002           0.7346
0.8631         45.0   26550  1.1289           0.7419
0.8403         46.0   27140  1.0339           0.7416
0.8611         47.0   27730  1.2419           0.7443
0.84           48.0   28320  0.8991           0.7401
0.8795         49.0   28910  0.9157           0.7361
0.8211         50.0   29500  1.0039           0.7223
0.8124         51.0   30090  1.1785           0.7104
0.79           52.0   30680  0.9678           0.7385
0.7861         53.0   31270  0.9861           0.7330
0.7715         54.0   31860  0.9533           0.7419
0.8118         55.0   32450  1.0008           0.7125
0.7777         56.0   33040  0.9696           0.7278
0.738          57.0   33630  0.9313           0.7428
0.727          58.0   34220  1.3281           0.7410
0.7597         59.0   34810  1.0580           0.7498
0.7349         60.0   35400  0.8889           0.7343
0.7087         61.0   35990  0.8935           0.7370
0.7298         62.0   36580  0.9416           0.7511
0.7057         63.0   37170  0.8895           0.7428
0.704          64.0   37760  0.8649           0.7379
0.6907         65.0   38350  0.9054           0.7459
0.6721         66.0   38940  1.4102           0.7346
0.6932         67.0   39530  1.3254           0.7453
0.6944         68.0   40120  0.8969           0.7336
0.6504         69.0   40710  0.9343           0.7456
0.6984         70.0   41300  0.8656           0.7434
0.6804         71.0   41890  0.8744           0.7358
0.6684         72.0   42480  1.2043           0.7462
0.6591         73.0   43070  0.8612           0.7450
0.6259         74.0   43660  1.1547           0.7465
0.653          75.0   44250  0.9455           0.7474
0.6503         76.0   44840  0.8475           0.7391
0.65           77.0   45430  0.8667           0.7443
0.6442         78.0   46020  0.8617           0.7465
0.6237         79.0   46610  1.0127           0.7508
0.6149         80.0   47200  0.9956           0.7498
0.5893         81.0   47790  0.9385           0.7462
0.6139         82.0   48380  0.9122           0.7526
0.6117         83.0   48970  0.8413           0.7440
0.5999         84.0   49560  1.0049           0.7468
0.6091         85.0   50150  0.9213           0.7468
0.6049         86.0   50740  0.8642           0.7364
0.5976         87.0   51330  0.9368           0.7498
0.5927         88.0   51920  0.8736           0.7480
0.5698         89.0   52510  0.9112           0.7474
0.569          90.0   53100  0.8784           0.7437
0.5919         91.0   53690  0.8803           0.7456
0.5837         92.0   54280  0.8348           0.7413
0.5699         93.0   54870  0.8705           0.7477
0.5851         94.0   55460  0.8580           0.7471
0.5527         95.0   56050  0.8816           0.7495
0.5719         96.0   56640  0.8519           0.7495
0.5575         97.0   57230  0.8333           0.7450
0.5432         98.0   57820  0.8497           0.7446
0.5425         99.0   58410  0.8369           0.7474
0.5555         100.0  59000  0.8502           0.7459
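
To pick a checkpoint from a log like this, one might scan for the epoch with the highest accuracy or the lowest validation loss. The sketch below parses a few rows copied from the table above (a subset, for brevity); the names `Row`, `parse_rows`, `best_by_accuracy` and `lowest_val_loss` are hypothetical helpers, not part of any library.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float


# A subset of rows copied from the table; paste all 100 rows to scan the full log.
LOG = """
0.6139 82.0 48380 0.9122 0.7526
0.5837 92.0 54280 0.8348 0.7413
0.5575 97.0 57230 0.8333 0.7450
0.5555 100.0 59000 0.8502 0.7459
"""


def parse_rows(text: str) -> List[Row]:
    """Parse whitespace-separated rows in the table's column order."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, accuracy = line.split()
        rows.append(Row(float(train_loss), float(epoch), int(step),
                        float(val_loss), float(accuracy)))
    return rows


def best_by_accuracy(rows: List[Row]) -> Row:
    return max(rows, key=lambda r: r.accuracy)


def lowest_val_loss(rows: List[Row]) -> Row:
    return min(rows, key=lambda r: r.val_loss)


if __name__ == "__main__":
    rows = parse_rows(LOG)
    print("best accuracy:", best_by_accuracy(rows))
    print("lowest validation loss:", lowest_val_loss(rows))
```

On the full table the same scan gives epoch 82 for accuracy (0.7526) and epoch 97 for validation loss (0.8333).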

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
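
When trying to reproduce the run, it may help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library; it assumes the PyPI distribution names `transformers`, `torch`, `datasets` and `tokenizers`, and the helper `version_mismatches` is hypothetical.

```python
from importlib import metadata
from typing import Callable, Dict

# Versions listed on the card; "torch" is the PyPI distribution name for PyTorch.
PINNED = {
    "transformers": "4.30.0",
    "torch": "2.0.1+cu117",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def version_mismatches(
    pinned: Dict[str, str],
    get_version: Callable[[str], str] = metadata.version,
) -> Dict[str, str]:
    """Map each package whose installed version differs from the pin to what was found."""
    found = {}
    for name, wanted in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = "not installed"
        if installed != wanted:
            found[name] = installed
    return found


if __name__ == "__main__":
    for name, installed in version_mismatches(PINNED).items():
        print(f"{name}: card lists {PINNED[name]}, found {installed}")
```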

Dataset used to train Onutoa/1_7e-3_5_0.9