
1_8e-3_1_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9505
  • Accuracy: 0.7318
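
The card does not say which SuperGLUE task the checkpoint was trained on, so the snippet below is only a minimal usage sketch: it loads the checkpoint as a generic sequence-classification model through the standard Transformers Auto classes. The repository id Onutoa/1_8e-3_1_0.1 comes from this card; the example sentence pair is a made-up, BoolQ-style illustration.

```python
# Minimal usage sketch (not the author's evaluation code).
# The repo id is from this card; the inputs are illustrative placeholders.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/1_8e-3_1_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer(
    "is the sky blue",                         # hypothetical question
    "The sky often appears blue in daytime.",  # hypothetical passage
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label id
```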

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.008
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
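
These settings map directly onto the TrainingArguments API of Transformers 4.30. The sketch below is an assumed reconstruction rather than the author's actual script: output_dir and evaluation_strategy are guesses (the results table suggests per-epoch evaluation), and the card's train_batch_size is mapped to per_device_train_batch_size.

```python
# Assumed reconstruction of the configuration listed above,
# using the Transformers 4.30 TrainingArguments API.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_8e-3_1_0.1",       # assumed, matching this card's title
    learning_rate=8e-3,
    per_device_train_batch_size=16,  # card's train_batch_size
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",     # assumption: table reports per-epoch eval
)
```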

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:---:|:---:|:---:|:---:|:---:|
| 1.3959 | 1.0 | 590 | 0.9510 | 0.3786 |
| 1.0927 | 2.0 | 1180 | 0.6855 | 0.4780 |
| 0.9921 | 3.0 | 1770 | 1.4020 | 0.3783 |
| 1.039 | 4.0 | 2360 | 0.9930 | 0.3835 |
| 0.877 | 5.0 | 2950 | 1.3595 | 0.6217 |
| 0.8304 | 6.0 | 3540 | 0.6007 | 0.6648 |
| 0.7152 | 7.0 | 4130 | 1.3841 | 0.4086 |
| 0.7225 | 8.0 | 4720 | 0.7135 | 0.6183 |
| 0.6522 | 9.0 | 5310 | 0.5864 | 0.6966 |
| 0.6306 | 10.0 | 5900 | 1.1053 | 0.6318 |
| 0.6533 | 11.0 | 6490 | 0.6681 | 0.6939 |
| 0.5693 | 12.0 | 7080 | 0.6281 | 0.6777 |
| 0.569 | 13.0 | 7670 | 0.6301 | 0.6523 |
| 0.5168 | 14.0 | 8260 | 0.6110 | 0.6878 |
| 0.5071 | 15.0 | 8850 | 0.6350 | 0.7083 |
| 0.5042 | 16.0 | 9440 | 0.6348 | 0.7183 |
| 0.4678 | 17.0 | 10030 | 1.0429 | 0.6067 |
| 0.4545 | 18.0 | 10620 | 0.7921 | 0.6780 |
| 0.4216 | 19.0 | 11210 | 0.6437 | 0.7245 |
| 0.3986 | 20.0 | 11800 | 0.7142 | 0.7159 |
| 0.3871 | 21.0 | 12390 | 0.6949 | 0.7131 |
| 0.3852 | 22.0 | 12980 | 0.6870 | 0.7235 |
| 0.3519 | 23.0 | 13570 | 0.7979 | 0.7 |
| 0.3271 | 24.0 | 14160 | 0.9015 | 0.6875 |
| 0.3136 | 25.0 | 14750 | 0.8513 | 0.7092 |
| 0.278 | 26.0 | 15340 | 0.8899 | 0.6869 |
| 0.2931 | 27.0 | 15930 | 0.7898 | 0.7150 |
| 0.2712 | 28.0 | 16520 | 0.8953 | 0.7294 |
| 0.2494 | 29.0 | 17110 | 0.8243 | 0.7217 |
| 0.2568 | 30.0 | 17700 | 0.8979 | 0.7156 |
| 0.2488 | 31.0 | 18290 | 1.0504 | 0.7211 |
| 0.2568 | 32.0 | 18880 | 0.8953 | 0.7107 |
| 0.2465 | 33.0 | 19470 | 0.8415 | 0.7208 |
| 0.2077 | 34.0 | 20060 | 1.0351 | 0.7083 |
| 0.2202 | 35.0 | 20650 | 0.9620 | 0.7202 |
| 0.2224 | 36.0 | 21240 | 0.8594 | 0.7251 |
| 0.2133 | 37.0 | 21830 | 0.9035 | 0.7257 |
| 0.1881 | 38.0 | 22420 | 0.9327 | 0.7153 |
| 0.201 | 39.0 | 23010 | 0.9521 | 0.7220 |
| 0.197 | 40.0 | 23600 | 0.9997 | 0.7199 |
| 0.1949 | 41.0 | 24190 | 1.0048 | 0.7355 |
| 0.1739 | 42.0 | 24780 | 0.9031 | 0.7309 |
| 0.1781 | 43.0 | 25370 | 1.0229 | 0.7321 |
| 0.1726 | 44.0 | 25960 | 0.9823 | 0.7183 |
| 0.1472 | 45.0 | 26550 | 0.9605 | 0.7131 |
| 0.1628 | 46.0 | 27140 | 0.9855 | 0.7382 |
| 0.1658 | 47.0 | 27730 | 1.0724 | 0.7272 |
| 0.1563 | 48.0 | 28320 | 0.9809 | 0.7242 |
| 0.1682 | 49.0 | 28910 | 0.8878 | 0.7303 |
| 0.1432 | 50.0 | 29500 | 0.9983 | 0.7324 |
| 0.1437 | 51.0 | 30090 | 1.2073 | 0.6890 |
| 0.1431 | 52.0 | 30680 | 1.0315 | 0.7162 |
| 0.142 | 53.0 | 31270 | 1.0895 | 0.7370 |
| 0.1312 | 54.0 | 31860 | 0.9904 | 0.7355 |
| 0.1371 | 55.0 | 32450 | 0.9881 | 0.7159 |
| 0.1383 | 56.0 | 33040 | 0.9876 | 0.7443 |
| 0.128 | 57.0 | 33630 | 1.0126 | 0.7217 |
| 0.1256 | 58.0 | 34220 | 0.9730 | 0.7370 |
| 0.1283 | 59.0 | 34810 | 0.9943 | 0.7303 |
| 0.14 | 60.0 | 35400 | 0.9945 | 0.7278 |
| 0.126 | 61.0 | 35990 | 1.0015 | 0.7193 |
| 0.1232 | 62.0 | 36580 | 1.0385 | 0.7190 |
| 0.1163 | 63.0 | 37170 | 0.9850 | 0.7180 |
| 0.1204 | 64.0 | 37760 | 1.0085 | 0.7226 |
| 0.1157 | 65.0 | 38350 | 1.0784 | 0.7373 |
| 0.1154 | 66.0 | 38940 | 0.9773 | 0.7330 |
| 0.1101 | 67.0 | 39530 | 0.9884 | 0.7315 |
| 0.1138 | 68.0 | 40120 | 0.9496 | 0.7294 |
| 0.1064 | 69.0 | 40710 | 1.0320 | 0.7303 |
| 0.1031 | 70.0 | 41300 | 0.9621 | 0.7327 |
| 0.107 | 71.0 | 41890 | 0.9663 | 0.7349 |
| 0.107 | 72.0 | 42480 | 0.9714 | 0.7309 |
| 0.0958 | 73.0 | 43070 | 1.0255 | 0.7135 |
| 0.0973 | 74.0 | 43660 | 0.9705 | 0.7349 |
| 0.0989 | 75.0 | 44250 | 1.0003 | 0.7321 |
| 0.0968 | 76.0 | 44840 | 1.0130 | 0.7306 |
| 0.0947 | 77.0 | 45430 | 1.0245 | 0.7300 |
| 0.0976 | 78.0 | 46020 | 1.0305 | 0.7352 |
| 0.0916 | 79.0 | 46610 | 0.9644 | 0.7300 |
| 0.0913 | 80.0 | 47200 | 1.0130 | 0.7373 |
| 0.0911 | 81.0 | 47790 | 0.9241 | 0.7263 |
| 0.0985 | 82.0 | 48380 | 0.9843 | 0.7385 |
| 0.0876 | 83.0 | 48970 | 1.0069 | 0.7327 |
| 0.0865 | 84.0 | 49560 | 0.9806 | 0.7303 |
| 0.0872 | 85.0 | 50150 | 0.9590 | 0.7291 |
| 0.0818 | 86.0 | 50740 | 0.9917 | 0.7251 |
| 0.0828 | 87.0 | 51330 | 0.9569 | 0.7333 |
| 0.0813 | 88.0 | 51920 | 0.9769 | 0.7260 |
| 0.0763 | 89.0 | 52510 | 1.0162 | 0.7333 |
| 0.0795 | 90.0 | 53100 | 0.9829 | 0.7346 |
| 0.0788 | 91.0 | 53690 | 0.9755 | 0.7349 |
| 0.0769 | 92.0 | 54280 | 1.0030 | 0.7315 |
| 0.0739 | 93.0 | 54870 | 0.9772 | 0.7370 |
| 0.0782 | 94.0 | 55460 | 0.9850 | 0.7284 |
| 0.0746 | 95.0 | 56050 | 0.9688 | 0.7309 |
| 0.0749 | 96.0 | 56640 | 0.9492 | 0.7309 |
| 0.072 | 97.0 | 57230 | 0.9607 | 0.7303 |
| 0.0693 | 98.0 | 57820 | 0.9686 | 0.7318 |
| 0.0725 | 99.0 | 58410 | 0.9606 | 0.7312 |
| 0.0713 | 100.0 | 59000 | 0.9505 | 0.7318 |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
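
To reproduce this environment, the installed packages can be checked against the versions above at runtime; a minimal sketch, assuming the standard PyPI distribution names:

```python
# Sanity-check installed packages against the versions pinned in this card.
from importlib.metadata import version

pins = {
    "transformers": "4.30.0",
    "torch": "2.0.1+cu117",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}
for pkg, pinned in pins.items():
    print(f"{pkg}: installed {version(pkg)}, card pins {pinned}")
```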
