Onutoa/1_7e-3_1_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset; the card does not state which SuperGLUE task was used. It achieves the following results on the evaluation set (a minimal loading example follows the metrics):

  • Loss: 0.4732
  • Accuracy: 0.7462
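
To load the checkpoint, a minimal sketch is shown below. It assumes the model carries a sequence-classification head, consistent with the accuracy metric above; the question/passage inputs are hypothetical placeholders, since the card does not document the task or input format.

```python
# Minimal loading sketch for this checkpoint (repo id taken from this card).
# The example inputs are hypothetical; adapt them to the actual SuperGLUE subset.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Onutoa/1_7e-3_1_0.5")
model = AutoModelForSequenceClassification.from_pretrained("Onutoa/1_7e-3_1_0.5")

inputs = tokenizer(
    "is the sky blue during the day",     # hypothetical question
    "The sky appears blue in daylight.",  # hypothetical passage
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class id
```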

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
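
The card names super_glue but not the task. For reference, the per-epoch step count in the results below (590 steps at train batch size 16, i.e. capacity for about 9,440 examples) is consistent with the BoolQ subset's 9,427 training examples, so the sketch below assumes BoolQ; treat that task choice as an inference, not a documented fact.

```python
# Sketch of loading the likely training data. "boolq" is an inference from
# the step counts in this card, not a documented choice.
from datasets import load_dataset

dataset = load_dataset("super_glue", "boolq")
print(dataset["train"].num_rows)       # 9427
print(dataset["validation"].num_rows)  # 3270
```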

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent TrainingArguments follows the list):

  • learning_rate: 0.007
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
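
The sketch below reconstructs these settings with the transformers Trainer API (4.30-era argument names). Only the values listed above are filled in; everything else stays at library defaults, and the output directory is hypothetical. Note that the Trainer's default optimizer is AdamW; the "Adam" line above reports its beta/epsilon settings.

```python
# Sketch of the hyperparameters above as transformers TrainingArguments.
# Unlisted settings remain at defaults; output_dir is hypothetical.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_7e-3_1_0.5",       # hypothetical path
    learning_rate=7e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,                  # "Adam with betas=(0.9,0.999)"
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",     # matches the per-epoch rows below
)
```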

Training results

Training Loss | Epoch | Step | Validation Loss | Accuracy
------------- | ----- | ---- | --------------- | --------
0.9787 | 1.0 | 590 | 0.7825 | 0.6217
1.0111 | 2.0 | 1180 | 0.7676 | 0.6021
0.9238 | 3.0 | 1770 | 0.6005 | 0.6217
0.8313 | 4.0 | 2360 | 0.6038 | 0.4321
0.7671 | 5.0 | 2950 | 0.9066 | 0.6217
0.7472 | 6.0 | 3540 | 0.6074 | 0.4560
0.7577 | 7.0 | 4130 | 0.6978 | 0.3807
0.6835 | 8.0 | 4720 | 0.6612 | 0.6217
0.6855 | 9.0 | 5310 | 0.7161 | 0.6217
0.6572 | 10.0 | 5900 | 0.5321 | 0.6370
0.6389 | 11.0 | 6490 | 0.5122 | 0.6621
0.5993 | 12.0 | 7080 | 0.5795 | 0.6612
0.587 | 13.0 | 7670 | 0.5287 | 0.6245
0.5662 | 14.0 | 8260 | 0.4982 | 0.6664
0.5474 | 15.0 | 8850 | 0.5174 | 0.6453
0.5533 | 16.0 | 9440 | 0.5125 | 0.6890
0.5201 | 17.0 | 10030 | 0.4753 | 0.6716
0.5055 | 18.0 | 10620 | 0.4841 | 0.6755
0.4886 | 19.0 | 11210 | 0.4682 | 0.7028
0.4806 | 20.0 | 11800 | 0.4591 | 0.6905
0.456 | 21.0 | 12390 | 0.4729 | 0.6896
0.4627 | 22.0 | 12980 | 0.4434 | 0.7003
0.4301 | 23.0 | 13570 | 0.4426 | 0.7092
0.4203 | 24.0 | 14160 | 0.4324 | 0.7092
0.4175 | 25.0 | 14750 | 0.4642 | 0.7275
0.3993 | 26.0 | 15340 | 0.5582 | 0.6459
0.3972 | 27.0 | 15930 | 0.4367 | 0.7076
0.3812 | 28.0 | 16520 | 0.4484 | 0.7278
0.3726 | 29.0 | 17110 | 0.4581 | 0.7202
0.3781 | 30.0 | 17700 | 0.4322 | 0.7275
0.3578 | 31.0 | 18290 | 0.4970 | 0.7217
0.3458 | 32.0 | 18880 | 0.6182 | 0.7095
0.3434 | 33.0 | 19470 | 0.4644 | 0.7095
0.3338 | 34.0 | 20060 | 0.4355 | 0.7199
0.3344 | 35.0 | 20650 | 0.4495 | 0.7223
0.3308 | 36.0 | 21240 | 0.4515 | 0.7330
0.3208 | 37.0 | 21830 | 0.4562 | 0.7373
0.3012 | 38.0 | 22420 | 0.4464 | 0.7211
0.3055 | 39.0 | 23010 | 0.4410 | 0.7382
0.306 | 40.0 | 23600 | 0.5016 | 0.7343
0.2894 | 41.0 | 24190 | 0.4726 | 0.7364
0.2834 | 42.0 | 24780 | 0.4714 | 0.7379
0.2789 | 43.0 | 25370 | 0.4379 | 0.7199
0.2759 | 44.0 | 25960 | 0.4570 | 0.7287
0.2667 | 45.0 | 26550 | 0.4500 | 0.7294
0.2564 | 46.0 | 27140 | 0.4628 | 0.7413
0.2541 | 47.0 | 27730 | 0.4643 | 0.7379
0.2498 | 48.0 | 28320 | 0.4406 | 0.7336
0.2571 | 49.0 | 28910 | 0.4427 | 0.7373
0.2423 | 50.0 | 29500 | 0.4658 | 0.7315
0.2374 | 51.0 | 30090 | 0.4744 | 0.7214
0.2415 | 52.0 | 30680 | 0.5416 | 0.7373
0.2309 | 53.0 | 31270 | 0.4830 | 0.7226
0.2282 | 54.0 | 31860 | 0.4758 | 0.7343
0.2307 | 55.0 | 32450 | 0.4698 | 0.7266
0.2213 | 56.0 | 33040 | 0.4458 | 0.7446
0.2193 | 57.0 | 33630 | 0.4778 | 0.7382
0.214 | 58.0 | 34220 | 0.4828 | 0.7456
0.207 | 59.0 | 34810 | 0.4818 | 0.7294
0.21 | 60.0 | 35400 | 0.4614 | 0.7508
0.2118 | 61.0 | 35990 | 0.4507 | 0.7480
0.2031 | 62.0 | 36580 | 0.4718 | 0.7416
0.1987 | 63.0 | 37170 | 0.4752 | 0.7324
0.2018 | 64.0 | 37760 | 0.4431 | 0.7388
0.1889 | 65.0 | 38350 | 0.4769 | 0.7385
0.1941 | 66.0 | 38940 | 0.4623 | 0.7443
0.1898 | 67.0 | 39530 | 0.4818 | 0.7355
0.1872 | 68.0 | 40120 | 0.4678 | 0.7446
0.1813 | 69.0 | 40710 | 0.4843 | 0.7529
0.1893 | 70.0 | 41300 | 0.4702 | 0.7459
0.1885 | 71.0 | 41890 | 0.4931 | 0.7193
0.1811 | 72.0 | 42480 | 0.4854 | 0.7477
0.1755 | 73.0 | 43070 | 0.4848 | 0.7373
0.1768 | 74.0 | 43660 | 0.4867 | 0.7520
0.1728 | 75.0 | 44250 | 0.5011 | 0.7477
0.1791 | 76.0 | 44840 | 0.4876 | 0.7416
0.1733 | 77.0 | 45430 | 0.4920 | 0.7486
0.1745 | 78.0 | 46020 | 0.4711 | 0.7492
0.1741 | 79.0 | 46610 | 0.4661 | 0.7401
0.1706 | 80.0 | 47200 | 0.4670 | 0.7422
0.165 | 81.0 | 47790 | 0.4736 | 0.7459
0.1612 | 82.0 | 48380 | 0.4660 | 0.7459
0.1722 | 83.0 | 48970 | 0.4772 | 0.7410
0.1638 | 84.0 | 49560 | 0.4767 | 0.7434
0.1613 | 85.0 | 50150 | 0.4641 | 0.7391
0.1649 | 86.0 | 50740 | 0.4783 | 0.7450
0.1609 | 87.0 | 51330 | 0.4734 | 0.7453
0.1588 | 88.0 | 51920 | 0.4919 | 0.7508
0.1601 | 89.0 | 52510 | 0.4698 | 0.7453
0.1573 | 90.0 | 53100 | 0.4765 | 0.7508
0.1584 | 91.0 | 53690 | 0.4754 | 0.7492
0.1587 | 92.0 | 54280 | 0.4704 | 0.7413
0.1521 | 93.0 | 54870 | 0.4865 | 0.7505
0.1546 | 94.0 | 55460 | 0.4777 | 0.7505
0.1539 | 95.0 | 56050 | 0.4791 | 0.7526
0.1545 | 96.0 | 56640 | 0.4721 | 0.7456
0.1533 | 97.0 | 57230 | 0.4725 | 0.7407
0.1476 | 98.0 | 57820 | 0.4709 | 0.7462
0.1489 | 99.0 | 58410 | 0.4731 | 0.7459
0.1501 | 100.0 | 59000 | 0.4732 | 0.7462
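
Note that the final checkpoint is not the best row in the table: the lowest validation loss (0.4322) occurs at epoch 30 and the highest accuracy (0.7529) at epoch 69. A small sketch for selecting a best epoch from a Trainer-style evaluation history follows; the records are hand-copied excerpts of the table above (with a real run they would come from trainer.state.log_history or a saved trainer_state.json, if present).

```python
# Sketch: pick the best epochs out of a Trainer-style eval history.
# These three records are excerpts of the table above, for illustration only.
history = [
    {"epoch": 30.0, "eval_loss": 0.4322, "eval_accuracy": 0.7275},
    {"epoch": 69.0, "eval_loss": 0.4843, "eval_accuracy": 0.7529},
    {"epoch": 100.0, "eval_loss": 0.4732, "eval_accuracy": 0.7462},
]

best_loss = min(history, key=lambda r: r["eval_loss"])
best_acc = max(history, key=lambda r: r["eval_accuracy"])
print(f"lowest eval loss {best_loss['eval_loss']} at epoch {best_loss['epoch']}")
print(f"best accuracy {best_acc['eval_accuracy']} at epoch {best_acc['epoch']}")
```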

Framework versions

  • Transformers 4.30.0
  • PyTorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3