1_8e-3_10_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9754
  • Accuracy: 0.7459
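
Below is a minimal inference sketch, not part of the original card: it assumes the checkpoint loads as a standard sequence-classification head and that the SuperGLUE task here is a binary sentence-pair task (the exact subtask is not stated above), so the example inputs and label meaning are illustrative only.

```python
# Hedged example: the model id comes from this card; the task and inputs are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Onutoa/1_8e-3_10_0.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Illustrative sentence pair; replace with inputs matching the actual SuperGLUE subtask.
inputs = tokenizer(
    "Is the sky blue?",
    "The sky appears blue in daylight.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```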

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.008
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
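
As a sketch of how these values map onto the `transformers` `TrainingArguments` API (the original training script is not included, so the model/dataset wiring is omitted; `output_dir` and per-epoch evaluation are assumptions inferred from the model name and the results table below):

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters listed above, assuming single-device
# training so the reported train_batch_size equals the per-device batch size.
training_args = TrainingArguments(
    output_dir="1_8e-3_10_0.5",   # assumed from the model name
    learning_rate=8e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",  # assumption: the table below reports one eval per epoch
)
```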

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 3.0295        | 1.0   | 590   | 5.2308          | 0.6217   |
| 3.1648        | 2.0   | 1180  | 2.6673          | 0.3908   |
| 2.5921        | 3.0   | 1770  | 5.0497          | 0.3761   |
| 2.9042        | 4.0   | 2360  | 2.2586          | 0.6291   |
| 2.4411        | 5.0   | 2950  | 6.5105          | 0.6217   |
| 2.3131        | 6.0   | 3540  | 2.7244          | 0.5183   |
| 2.0563        | 7.0   | 4130  | 4.6938          | 0.3783   |
| 1.9468        | 8.0   | 4720  | 1.5045          | 0.6862   |
| 1.9269        | 9.0   | 5310  | 1.7666          | 0.6734   |
| 1.9701        | 10.0  | 5900  | 1.8173          | 0.6780   |
| 1.8231        | 11.0  | 6490  | 1.6929          | 0.6752   |
| 1.7563        | 12.0  | 7080  | 1.3455          | 0.6862   |
| 1.726         | 13.0  | 7670  | 1.2870          | 0.6786   |
| 1.6706        | 14.0  | 8260  | 1.3862          | 0.6951   |
| 1.5876        | 15.0  | 8850  | 1.4384          | 0.6587   |
| 1.5067        | 16.0  | 9440  | 1.5336          | 0.6985   |
| 1.5777        | 17.0  | 10030 | 1.9860          | 0.5972   |
| 1.4323        | 18.0  | 10620 | 1.2068          | 0.7076   |
| 1.4228        | 19.0  | 11210 | 1.8071          | 0.6780   |
| 1.4335        | 20.0  | 11800 | 4.1127          | 0.6346   |
| 1.4549        | 21.0  | 12390 | 1.2302          | 0.7131   |
| 1.277         | 22.0  | 12980 | 1.2829          | 0.6771   |
| 1.2962        | 23.0  | 13570 | 1.2152          | 0.7070   |
| 1.4076        | 24.0  | 14160 | 1.5758          | 0.6529   |
| 1.3427        | 25.0  | 14750 | 1.1333          | 0.6997   |
| 1.1936        | 26.0  | 15340 | 1.1974          | 0.6917   |
| 1.1937        | 27.0  | 15930 | 1.2653          | 0.6948   |
| 1.2784        | 28.0  | 16520 | 1.0620          | 0.7242   |
| 1.1605        | 29.0  | 17110 | 2.7859          | 0.6734   |
| 1.1438        | 30.0  | 17700 | 1.8633          | 0.6428   |
| 1.1406        | 31.0  | 18290 | 1.6275          | 0.7098   |
| 1.0993        | 32.0  | 18880 | 1.2765          | 0.6969   |
| 1.158         | 33.0  | 19470 | 1.1218          | 0.7058   |
| 1.0432        | 34.0  | 20060 | 1.0562          | 0.7245   |
| 1.0295        | 35.0  | 20650 | 1.3146          | 0.7251   |
| 1.0041        | 36.0  | 21240 | 1.0308          | 0.7150   |
| 1.0104        | 37.0  | 21830 | 1.0149          | 0.7242   |
| 1.0096        | 38.0  | 22420 | 1.1232          | 0.7083   |
| 0.9661        | 39.0  | 23010 | 1.0316          | 0.7251   |
| 0.9183        | 40.0  | 23600 | 1.2166          | 0.7055   |
| 0.9298        | 41.0  | 24190 | 1.9118          | 0.7040   |
| 0.8799        | 42.0  | 24780 | 1.0190          | 0.7306   |
| 0.954         | 43.0  | 25370 | 1.0761          | 0.7263   |
| 0.853         | 44.0  | 25960 | 1.2006          | 0.7080   |
| 1.0647        | 45.0  | 26550 | 1.1605          | 0.7379   |
| 0.8562        | 46.0  | 27140 | 1.2208          | 0.7122   |
| 0.8421        | 47.0  | 27730 | 0.9974          | 0.7388   |
| 0.7865        | 48.0  | 28320 | 1.1207          | 0.7376   |
| 0.8998        | 49.0  | 28910 | 1.1221          | 0.7080   |
| 0.8044        | 50.0  | 29500 | 1.0191          | 0.7205   |
| 0.7771        | 51.0  | 30090 | 0.9921          | 0.7364   |
| 0.7886        | 52.0  | 30680 | 1.1379          | 0.7419   |
| 0.7756        | 53.0  | 31270 | 1.3039          | 0.7315   |
| 0.7232        | 54.0  | 31860 | 1.1143          | 0.7385   |
| 0.69          | 55.0  | 32450 | 1.1024          | 0.7239   |
| 0.7313        | 56.0  | 33040 | 1.3560          | 0.7370   |
| 0.7266        | 57.0  | 33630 | 0.9763          | 0.7431   |
| 0.7084        | 58.0  | 34220 | 1.4480          | 0.7291   |
| 0.7072        | 59.0  | 34810 | 1.4463          | 0.7336   |
| 0.6889        | 60.0  | 35400 | 1.2983          | 0.7330   |
| 0.6745        | 61.0  | 35990 | 0.9898          | 0.7413   |
| 0.6739        | 62.0  | 36580 | 0.9817          | 0.7373   |
| 0.6513        | 63.0  | 37170 | 0.9999          | 0.7391   |
| 0.6665        | 64.0  | 37760 | 0.9840          | 0.7367   |
| 0.6428        | 65.0  | 38350 | 1.0120          | 0.7284   |
| 0.6418        | 66.0  | 38940 | 1.0021          | 0.7401   |
| 0.6185        | 67.0  | 39530 | 1.0063          | 0.7327   |
| 0.6259        | 68.0  | 40120 | 1.0108          | 0.7339   |
| 0.6165        | 69.0  | 40710 | 1.0279          | 0.7440   |
| 0.6393        | 70.0  | 41300 | 1.1899          | 0.7183   |
| 0.5869        | 71.0  | 41890 | 0.9767          | 0.7333   |
| 0.605         | 72.0  | 42480 | 1.4097          | 0.7367   |
| 0.5906        | 73.0  | 43070 | 1.0036          | 0.7358   |
| 0.5704        | 74.0  | 43660 | 1.3105          | 0.7443   |
| 0.5872        | 75.0  | 44250 | 1.0241          | 0.7242   |
| 0.5755        | 76.0  | 44840 | 1.1519          | 0.7410   |
| 0.5967        | 77.0  | 45430 | 1.1481          | 0.7431   |
| 0.57          | 78.0  | 46020 | 1.0164          | 0.7398   |
| 0.5599        | 79.0  | 46610 | 1.1657          | 0.7391   |
| 0.5458        | 80.0  | 47200 | 1.1020          | 0.7422   |
| 0.5299        | 81.0  | 47790 | 1.0836          | 0.7437   |
| 0.5285        | 82.0  | 48380 | 0.9682          | 0.7391   |
| 0.538         | 83.0  | 48970 | 1.1895          | 0.7193   |
| 0.5277        | 84.0  | 49560 | 0.9778          | 0.7459   |
| 0.525         | 85.0  | 50150 | 0.9893          | 0.7364   |
| 0.5268        | 86.0  | 50740 | 0.9745          | 0.7434   |
| 0.518         | 87.0  | 51330 | 0.9654          | 0.7450   |
| 0.5212        | 88.0  | 51920 | 0.9665          | 0.7382   |
| 0.5132        | 89.0  | 52510 | 1.0605          | 0.7474   |
| 0.5155        | 90.0  | 53100 | 0.9605          | 0.7440   |
| 0.4986        | 91.0  | 53690 | 1.0163          | 0.7480   |
| 0.5004        | 92.0  | 54280 | 1.0187          | 0.7312   |
| 0.4846        | 93.0  | 54870 | 0.9721          | 0.7440   |
| 0.4963        | 94.0  | 55460 | 1.0295          | 0.7468   |
| 0.4759        | 95.0  | 56050 | 1.0004          | 0.7468   |
| 0.4905        | 96.0  | 56640 | 1.0361          | 0.7474   |
| 0.4994        | 97.0  | 57230 | 0.9591          | 0.7446   |
| 0.4673        | 98.0  | 57820 | 0.9604          | 0.7431   |
| 0.4734        | 99.0  | 58410 | 0.9771          | 0.7462   |
| 0.4588        | 100.0 | 59000 | 0.9754          | 0.7459   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
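
A quick environment check against the pinned versions above (a convenience sketch, not part of the original card; the expected version strings are simply the ones this card reports):

```python
# Prints installed versions; compare against the pins listed above.
import datasets
import tokenizers
import torch
import transformers

print("Transformers:", transformers.__version__)  # card reports 4.30.0
print("PyTorch:", torch.__version__)              # card reports 2.0.1+cu117
print("Datasets:", datasets.__version__)          # card reports 2.14.4
print("Tokenizers:", tokenizers.__version__)      # card reports 0.13.3
```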