1_6e-3_10_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9536
  • Accuracy: 0.7596
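
The card does not say which SuperGLUE task the model was fine-tuned on, so no exact preprocessing pipeline can be given. The following is a minimal usage sketch, assuming a sentence-pair classification head and the hub id Onutoa/1_6e-3_10_0.5 listed for this card; the example pair and the meaning of the predicted label are illustrative only.

```python
# Minimal sketch: load the checkpoint for sequence classification.
# Assumption: the fine-tuning task is a SuperGLUE sentence-pair
# classification task (the card does not say which one), so the
# input pair and label interpretation below are illustrative.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Onutoa/1_6e-3_10_0.5"  # hub id from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Encode a hypothetical sentence pair as a standard BERT pair input.
inputs = tokenizer(
    "The cat sat on the mat.",
    "A cat is on a mat.",
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
pred = logits.argmax(dim=-1).item()
print(pred)  # integer class index; its meaning depends on the (unstated) task
```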

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent TrainingArguments follows the list):

  • learning_rate: 0.006
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
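
For reference, these hyperparameters map onto Hugging Face TrainingArguments roughly as below. This is a sketch, not the original training script: settings the card does not record (warmup, weight decay, gradient accumulation, and so on) are left at the Transformers defaults and may differ from the actual run, and the output_dir is assumed from the model name.

```python
# Sketch of TrainingArguments matching the listed hyperparameters.
# Anything not listed in the card is left at the library defaults
# and may differ from the original run.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_6e-3_10_0.5",       # assumed from the model name
    learning_rate=6e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",      # matches the per-epoch rows in the results table
)
```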

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 2.948         | 1.0   | 590   | 2.2396          | 0.6214   |
| 2.5635        | 2.0   | 1180  | 2.2693          | 0.6275   |
| 2.5246        | 3.0   | 1770  | 1.9556          | 0.6141   |
| 2.329         | 4.0   | 2360  | 2.3951          | 0.4801   |
| 2.1726        | 5.0   | 2950  | 1.7234          | 0.6618   |
| 2.0265        | 6.0   | 3540  | 1.5347          | 0.6679   |
| 2.0227        | 7.0   | 4130  | 1.8508          | 0.6064   |
| 1.8725        | 8.0   | 4720  | 2.0863          | 0.6584   |
| 1.8575        | 9.0   | 5310  | 4.0052          | 0.4639   |
| 1.8071        | 10.0  | 5900  | 3.1552          | 0.6468   |
| 1.6655        | 11.0  | 6490  | 1.3147          | 0.7104   |
| 1.501         | 12.0  | 7080  | 1.3005          | 0.6844   |
| 1.538         | 13.0  | 7670  | 1.7051          | 0.6948   |
| 1.4114        | 14.0  | 8260  | 1.4922          | 0.7028   |
| 1.3916        | 15.0  | 8850  | 1.6514          | 0.7034   |
| 1.3373        | 16.0  | 9440  | 1.9420          | 0.5896   |
| 1.271         | 17.0  | 10030 | 2.9731          | 0.6624   |
| 1.3123        | 18.0  | 10620 | 1.4756          | 0.6609   |
| 1.2775        | 19.0  | 11210 | 1.4888          | 0.6612   |
| 1.2341        | 20.0  | 11800 | 1.4493          | 0.7159   |
| 1.1907        | 21.0  | 12390 | 1.7638          | 0.7110   |
| 1.2035        | 22.0  | 12980 | 1.0716          | 0.7291   |
| 1.0365        | 23.0  | 13570 | 1.2975          | 0.6853   |
| 1.1041        | 24.0  | 14160 | 1.0275          | 0.7220   |
| 1.1326        | 25.0  | 14750 | 1.0228          | 0.7385   |
| 1.0261        | 26.0  | 15340 | 1.1473          | 0.7076   |
| 1.0168        | 27.0  | 15930 | 1.0435          | 0.7205   |
| 1.0653        | 28.0  | 16520 | 1.0105          | 0.7358   |
| 0.9418        | 29.0  | 17110 | 1.0397          | 0.7232   |
| 1.0591        | 30.0  | 17700 | 1.3640          | 0.6917   |
| 0.9186        | 31.0  | 18290 | 0.9679          | 0.7459   |
| 0.8665        | 32.0  | 18880 | 1.0310          | 0.7303   |
| 0.9005        | 33.0  | 19470 | 1.0498          | 0.7235   |
| 0.8494        | 34.0  | 20060 | 0.9766          | 0.7358   |
| 0.8474        | 35.0  | 20650 | 1.0077          | 0.7465   |
| 0.7973        | 36.0  | 21240 | 1.0674          | 0.7428   |
| 0.8049        | 37.0  | 21830 | 1.0074          | 0.7398   |
| 0.8241        | 38.0  | 22420 | 0.9613          | 0.7453   |
| 0.7793        | 39.0  | 23010 | 0.9864          | 0.7398   |
| 0.7781        | 40.0  | 23600 | 1.0741          | 0.7456   |
| 0.7539        | 41.0  | 24190 | 0.9809          | 0.7550   |
| 0.7403        | 42.0  | 24780 | 0.9993          | 0.7339   |
| 0.7494        | 43.0  | 25370 | 0.9887          | 0.7477   |
| 0.7091        | 44.0  | 25960 | 1.1792          | 0.7125   |
| 0.7236        | 45.0  | 26550 | 0.9549          | 0.7443   |
| 0.6947        | 46.0  | 27140 | 1.3568          | 0.7440   |
| 0.6928        | 47.0  | 27730 | 1.0682          | 0.7517   |
| 0.6578        | 48.0  | 28320 | 1.0993          | 0.7486   |
| 0.7723        | 49.0  | 28910 | 1.0381          | 0.7260   |
| 0.7169        | 50.0  | 29500 | 0.9510          | 0.7486   |
| 0.6424        | 51.0  | 30090 | 1.0781          | 0.7281   |
| 0.6652        | 52.0  | 30680 | 0.9623          | 0.7541   |
| 0.6274        | 53.0  | 31270 | 0.9476          | 0.7498   |
| 0.6295        | 54.0  | 31860 | 0.9461          | 0.7474   |
| 0.6252        | 55.0  | 32450 | 1.0873          | 0.7278   |
| 0.632         | 56.0  | 33040 | 0.9470          | 0.7492   |
| 0.5865        | 57.0  | 33630 | 1.4737          | 0.7355   |
| 0.6029        | 58.0  | 34220 | 1.0871          | 0.7477   |
| 0.5935        | 59.0  | 34810 | 1.0781          | 0.7514   |
| 0.6023        | 60.0  | 35400 | 0.9968          | 0.7581   |
| 0.5849        | 61.0  | 35990 | 1.0700          | 0.7547   |
| 0.5813        | 62.0  | 36580 | 1.2525          | 0.7425   |
| 0.5557        | 63.0  | 37170 | 0.9643          | 0.7541   |
| 0.541         | 64.0  | 37760 | 1.0179          | 0.7547   |
| 0.5693        | 65.0  | 38350 | 1.0064          | 0.7401   |
| 0.5562        | 66.0  | 38940 | 1.2333          | 0.7367   |
| 0.5677        | 67.0  | 39530 | 0.9976          | 0.7388   |
| 0.5357        | 68.0  | 40120 | 0.9795          | 0.7413   |
| 0.5372        | 69.0  | 40710 | 1.1113          | 0.7462   |
| 0.5563        | 70.0  | 41300 | 1.1366          | 0.7492   |
| 0.5377        | 71.0  | 41890 | 0.9343          | 0.7502   |
| 0.5442        | 72.0  | 42480 | 1.1735          | 0.7465   |
| 0.5124        | 73.0  | 43070 | 0.9499          | 0.7514   |
| 0.5007        | 74.0  | 43660 | 1.2104          | 0.7456   |
| 0.5094        | 75.0  | 44250 | 0.9865          | 0.7474   |
| 0.5118        | 76.0  | 44840 | 1.0542          | 0.7474   |
| 0.5166        | 77.0  | 45430 | 0.9762          | 0.7615   |
| 0.5071        | 78.0  | 46020 | 0.9333          | 0.7581   |
| 0.4961        | 79.0  | 46610 | 1.0310          | 0.7535   |
| 0.4863        | 80.0  | 47200 | 1.0242          | 0.7492   |
| 0.4801        | 81.0  | 47790 | 1.0528          | 0.7535   |
| 0.4975        | 82.0  | 48380 | 1.0188          | 0.7554   |
| 0.4868        | 83.0  | 48970 | 0.9455          | 0.7596   |
| 0.4661        | 84.0  | 49560 | 0.9841          | 0.7557   |
| 0.4765        | 85.0  | 50150 | 0.9570          | 0.7538   |
| 0.4732        | 86.0  | 50740 | 1.0383          | 0.7535   |
| 0.4846        | 87.0  | 51330 | 0.9560          | 0.7587   |
| 0.4641        | 88.0  | 51920 | 0.9716          | 0.7578   |
| 0.477         | 89.0  | 52510 | 0.9581          | 0.7606   |
| 0.4567        | 90.0  | 53100 | 0.9674          | 0.7569   |
| 0.4567        | 91.0  | 53690 | 0.9718          | 0.7587   |
| 0.4676        | 92.0  | 54280 | 0.9535          | 0.7520   |
| 0.4532        | 93.0  | 54870 | 0.9593          | 0.7563   |
| 0.4727        | 94.0  | 55460 | 0.9611          | 0.7584   |
| 0.4535        | 95.0  | 56050 | 0.9539          | 0.7602   |
| 0.4569        | 96.0  | 56640 | 0.9506          | 0.7587   |
| 0.4417        | 97.0  | 57230 | 0.9616          | 0.7584   |
| 0.4314        | 98.0  | 57820 | 0.9488          | 0.7593   |
| 0.4318        | 99.0  | 58410 | 0.9439          | 0.7587   |
| 0.4415        | 100.0 | 59000 | 0.9536          | 0.7596   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
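
To recreate a compatible environment, one could pin these versions as below. Note the `+cu117` suffix on the PyTorch build means it was compiled against CUDA 11.7, so installing that exact build may require the matching PyTorch wheel index; the plain `torch==2.0.1` pin shown here is an assumption that a CPU or default CUDA build is acceptable.

```
transformers==4.30.0
torch==2.0.1
datasets==2.14.4
tokenizers==0.13.3
```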