1_8e-3_1_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5223
  • Accuracy: 0.7101
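
The card does not say which SuperGLUE task this checkpoint targets or how its classification head is configured. As a minimal inference sketch, assuming a standard sequence-classification head and using a placeholder sentence pair:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumption: the checkpoint exposes a standard sequence-classification head.
model_id = "Onutoa/1_8e-3_1_0.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Placeholder sentence pair; the actual SuperGLUE task (and thus the
# expected input format) is not documented in this card.
inputs = tokenizer(
    "The city council approved the budget.",
    "Was the budget approved?",
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```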

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
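
For reference, SuperGLUE is distributed as per-task configurations on the Hugging Face Hub, and the card does not say which one was used. A minimal loading sketch with Datasets, where the boolq configuration is an illustrative assumption only:

```python
from datasets import load_dataset

# Assumption: "boolq" stands in for whichever SuperGLUE task was actually
# used for fine-tuning; the card does not identify it.
dataset = load_dataset("super_glue", "boolq")
print(dataset["train"][0])
```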

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.008
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
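
A hedged sketch of how these values map onto TrainingArguments in Transformers 4.30 (the output directory and the per-epoch evaluation strategy are assumptions inferred from the results table below; note that the Trainer's default optimizer is AdamW rather than plain Adam):

```python
from transformers import TrainingArguments

# Sketch under assumptions: output_dir is hypothetical, and
# evaluation_strategy="epoch" is inferred from the per-epoch results table.
training_args = TrainingArguments(
    output_dir="1_8e-3_1_0.5",
    learning_rate=8e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",
)
```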

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:---|:---|:---|:---|:---|
| 1.047 | 1.0 | 590 | 0.5930 | 0.6147 |
| 1.1566 | 2.0 | 1180 | 0.8138 | 0.3786 |
| 0.8071 | 3.0 | 1770 | 1.1906 | 0.6217 |
| 0.8515 | 4.0 | 2360 | 0.5963 | 0.5232 |
| 0.7727 | 5.0 | 2950 | 0.5584 | 0.6043 |
| 0.864 | 6.0 | 3540 | 1.9242 | 0.3783 |
| 0.7792 | 7.0 | 4130 | 0.7053 | 0.5116 |
| 0.768 | 8.0 | 4720 | 2.9011 | 0.3783 |
| 0.7931 | 9.0 | 5310 | 0.6747 | 0.3783 |
| 0.726 | 10.0 | 5900 | 5.3441 | 0.3783 |
| 0.7177 | 11.0 | 6490 | 0.7048 | 0.3783 |
| 0.6681 | 12.0 | 7080 | 0.6229 | 0.3783 |
| 0.6889 | 13.0 | 7670 | 1.0114 | 0.6205 |
| 0.6618 | 14.0 | 8260 | 2.8718 | 0.6217 |
| 0.6566 | 15.0 | 8850 | 1.5485 | 0.6217 |
| 0.6227 | 16.0 | 9440 | 0.7295 | 0.6220 |
| 0.6016 | 17.0 | 10030 | 0.6356 | 0.6217 |
| 0.5891 | 18.0 | 10620 | 0.9814 | 0.6266 |
| 0.5534 | 19.0 | 11210 | 1.4086 | 0.6205 |
| 0.5574 | 20.0 | 11800 | 1.9522 | 0.6211 |
| 0.5349 | 21.0 | 12390 | 0.5543 | 0.6355 |
| 0.5171 | 22.0 | 12980 | 0.5258 | 0.6780 |
| 0.5043 | 23.0 | 13570 | 0.7235 | 0.4746 |
| 0.4775 | 24.0 | 14160 | 0.5588 | 0.6428 |
| 0.4721 | 25.0 | 14750 | 0.5342 | 0.6731 |
| 0.461 | 26.0 | 15340 | 0.7023 | 0.5560 |
| 0.461 | 27.0 | 15930 | 1.0768 | 0.4144 |
| 0.4312 | 28.0 | 16520 | 0.5149 | 0.6798 |
| 0.4378 | 29.0 | 17110 | 0.8702 | 0.5226 |
| 0.4214 | 30.0 | 17700 | 0.8323 | 0.6514 |
| 0.4205 | 31.0 | 18290 | 0.4795 | 0.6869 |
| 0.3944 | 32.0 | 18880 | 0.4763 | 0.6969 |
| 0.3874 | 33.0 | 19470 | 1.5854 | 0.6248 |
| 0.3779 | 34.0 | 20060 | 0.5091 | 0.6914 |
| 0.3723 | 35.0 | 20650 | 0.7588 | 0.6541 |
| 0.3693 | 36.0 | 21240 | 0.7886 | 0.5128 |
| 0.3602 | 37.0 | 21830 | 1.4420 | 0.4719 |
| 0.3522 | 38.0 | 22420 | 0.9082 | 0.5073 |
| 0.3488 | 39.0 | 23010 | 0.6001 | 0.6853 |
| 0.3348 | 40.0 | 23600 | 0.6879 | 0.6492 |
| 0.3482 | 41.0 | 24190 | 1.7803 | 0.6315 |
| 0.3324 | 42.0 | 24780 | 0.5648 | 0.6997 |
| 0.3318 | 43.0 | 25370 | 0.9623 | 0.6618 |
| 0.336 | 44.0 | 25960 | 0.6179 | 0.6459 |
| 0.3167 | 45.0 | 26550 | 0.5041 | 0.6997 |
| 0.3069 | 46.0 | 27140 | 0.4954 | 0.7003 |
| 0.3078 | 47.0 | 27730 | 0.5356 | 0.7028 |
| 0.2981 | 48.0 | 28320 | 1.3955 | 0.6450 |
| 0.3037 | 49.0 | 28910 | 0.5689 | 0.6878 |
| 0.2887 | 50.0 | 29500 | 0.8592 | 0.5517 |
| 0.28 | 51.0 | 30090 | 0.5939 | 0.6838 |
| 0.2786 | 52.0 | 30680 | 0.6514 | 0.6765 |
| 0.2778 | 53.0 | 31270 | 1.8380 | 0.6339 |
| 0.2797 | 54.0 | 31860 | 1.1076 | 0.6440 |
| 0.2773 | 55.0 | 32450 | 0.4983 | 0.6972 |
| 0.2746 | 56.0 | 33040 | 1.5742 | 0.4483 |
| 0.2691 | 57.0 | 33630 | 0.8767 | 0.6498 |
| 0.2555 | 58.0 | 34220 | 0.6028 | 0.6113 |
| 0.2675 | 59.0 | 34810 | 0.7268 | 0.6664 |
| 0.2567 | 60.0 | 35400 | 0.5953 | 0.6593 |
| 0.2555 | 61.0 | 35990 | 0.5564 | 0.6795 |
| 0.2525 | 62.0 | 36580 | 0.7419 | 0.6009 |
| 0.2451 | 63.0 | 37170 | 0.5019 | 0.7043 |
| 0.2431 | 64.0 | 37760 | 0.5603 | 0.6997 |
| 0.2373 | 65.0 | 38350 | 0.5755 | 0.6612 |
| 0.2387 | 66.0 | 38940 | 0.6158 | 0.6254 |
| 0.2433 | 67.0 | 39530 | 0.5994 | 0.6150 |
| 0.2354 | 68.0 | 40120 | 0.5195 | 0.7101 |
| 0.2361 | 69.0 | 40710 | 0.5164 | 0.7076 |
| 0.234 | 70.0 | 41300 | 0.5001 | 0.6997 |
| 0.2341 | 71.0 | 41890 | 1.0352 | 0.4728 |
| 0.2245 | 72.0 | 42480 | 0.5045 | 0.7073 |
| 0.2219 | 73.0 | 43070 | 0.5208 | 0.7080 |
| 0.216 | 74.0 | 43660 | 0.5116 | 0.7061 |
| 0.2227 | 75.0 | 44250 | 0.5224 | 0.7089 |
| 0.2163 | 76.0 | 44840 | 0.6881 | 0.5960 |
| 0.217 | 77.0 | 45430 | 0.5131 | 0.7000 |
| 0.2209 | 78.0 | 46020 | 0.5344 | 0.7086 |
| 0.2094 | 79.0 | 46610 | 0.6909 | 0.6098 |
| 0.21 | 80.0 | 47200 | 0.7910 | 0.5829 |
| 0.2069 | 81.0 | 47790 | 0.7681 | 0.6575 |
| 0.2021 | 82.0 | 48380 | 0.5345 | 0.7083 |
| 0.2077 | 83.0 | 48970 | 0.5224 | 0.7043 |
| 0.2002 | 84.0 | 49560 | 0.5126 | 0.7015 |
| 0.2033 | 85.0 | 50150 | 0.5920 | 0.7003 |
| 0.2021 | 86.0 | 50740 | 0.5589 | 0.7040 |
| 0.1873 | 87.0 | 51330 | 0.5470 | 0.7101 |
| 0.1972 | 88.0 | 51920 | 0.5276 | 0.7040 |
| 0.1855 | 89.0 | 52510 | 0.5280 | 0.7049 |
| 0.1916 | 90.0 | 53100 | 0.5261 | 0.7046 |
| 0.1912 | 91.0 | 53690 | 0.5950 | 0.6569 |
| 0.1917 | 92.0 | 54280 | 0.5402 | 0.6850 |
| 0.1879 | 93.0 | 54870 | 0.5765 | 0.7037 |
| 0.1923 | 94.0 | 55460 | 0.5297 | 0.6991 |
| 0.1894 | 95.0 | 56050 | 0.5150 | 0.7083 |
| 0.1853 | 96.0 | 56640 | 0.5276 | 0.6976 |
| 0.1848 | 97.0 | 57230 | 0.5356 | 0.7113 |
| 0.1796 | 98.0 | 57820 | 0.5585 | 0.7086 |
| 0.1848 | 99.0 | 58410 | 0.5230 | 0.7101 |
| 0.1849 | 100.0 | 59000 | 0.5223 | 0.7101 |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
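
To approximate this environment, the pinned versions can be installed roughly as below; the CUDA 11.7 wheel index is an assumption about where the 2.0.1+cu117 PyTorch build came from:

```bash
pip install transformers==4.30.0 datasets==2.14.4 tokenizers==0.13.3
# Assumption: the +cu117 build is served from the PyTorch CUDA 11.7 index.
pip install torch==2.0.1 --index-url https://download.pytorch.org/whl/cu117
```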