1_6e-3_10_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9853
  • Accuracy: 0.7416
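
The card does not state which SuperGLUE task the checkpoint was fine-tuned on, so any loading code is necessarily a guess about the head and input format. A minimal sketch, assuming the checkpoint carries a sequence-classification head over sentence pairs (the most common SuperGLUE setup); the example texts are placeholders:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Assumption: the checkpoint exposes a sequence-classification head and
# expects a text pair, as in most SuperGLUE tasks. Adjust to the task used.
model_id = "Onutoa/1_6e-3_10_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer(
    "Is the sky blue?",               # placeholder first segment
    "The sky appears blue by day.",   # placeholder second segment
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())   # predicted class id
```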

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.006
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
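
For reference, these values map onto transformers `TrainingArguments` roughly as follows. This is a reconstruction, not the author's training script: `output_dir` is a placeholder, the reported batch sizes are assumed to be per-device values, and the listed Adam settings are the Trainer defaults, so they need no explicit arguments:

```python
from transformers import TrainingArguments

# Reconstruction of the reported configuration (not the original script).
training_args = TrainingArguments(
    output_dir="1_6e-3_10_0.1",        # placeholder
    learning_rate=6e-3,
    per_device_train_batch_size=16,    # assumes a single device
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    # adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-8 are the defaults
)
```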

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 1.4161 | 1.0 | 590 | 1.9327 | 0.6217 |
| 1.4964 | 2.0 | 1180 | 1.4733 | 0.6217 |
| 1.4294 | 3.0 | 1770 | 1.3770 | 0.6217 |
| 1.3196 | 4.0 | 2360 | 1.1956 | 0.4070 |
| 1.1661 | 5.0 | 2950 | 0.9866 | 0.6333 |
| 1.1565 | 6.0 | 3540 | 0.9164 | 0.6453 |
| 1.0435 | 7.0 | 4130 | 1.0146 | 0.5786 |
| 1.0861 | 8.0 | 4720 | 0.8707 | 0.6541 |
| 1.0246 | 9.0 | 5310 | 0.9747 | 0.6728 |
| 0.9761 | 10.0 | 5900 | 1.0055 | 0.6560 |
| 0.9672 | 11.0 | 6490 | 0.7808 | 0.6869 |
| 0.8746 | 12.0 | 7080 | 0.8158 | 0.6768 |
| 0.8883 | 13.0 | 7670 | 0.7982 | 0.6917 |
| 0.8257 | 14.0 | 8260 | 0.9875 | 0.6869 |
| 0.8053 | 15.0 | 8850 | 0.9210 | 0.7171 |
| 0.7995 | 16.0 | 9440 | 0.7910 | 0.7168 |
| 0.7376 | 17.0 | 10030 | 0.8382 | 0.7122 |
| 0.6743 | 18.0 | 10620 | 1.0620 | 0.6141 |
| 0.6343 | 19.0 | 11210 | 0.7421 | 0.7245 |
| 0.6499 | 20.0 | 11800 | 0.7841 | 0.7187 |
| 0.5897 | 21.0 | 12390 | 0.9551 | 0.6713 |
| 0.6163 | 22.0 | 12980 | 1.0281 | 0.7135 |
| 0.5617 | 23.0 | 13570 | 0.9252 | 0.7245 |
| 0.5282 | 24.0 | 14160 | 0.8599 | 0.7080 |
| 0.5402 | 25.0 | 14750 | 0.8381 | 0.7254 |
| 0.493 | 26.0 | 15340 | 1.0387 | 0.6657 |
| 0.474 | 27.0 | 15930 | 0.7978 | 0.7266 |
| 0.4658 | 28.0 | 16520 | 0.8697 | 0.7306 |
| 0.4624 | 29.0 | 17110 | 0.8746 | 0.7287 |
| 0.4333 | 30.0 | 17700 | 0.9256 | 0.7254 |
| 0.4324 | 31.0 | 18290 | 0.8635 | 0.7336 |
| 0.4352 | 32.0 | 18880 | 1.0482 | 0.7232 |
| 0.4144 | 33.0 | 19470 | 1.2383 | 0.6872 |
| 0.3822 | 34.0 | 20060 | 0.9361 | 0.7324 |
| 0.3549 | 35.0 | 20650 | 0.9758 | 0.7180 |
| 0.3597 | 36.0 | 21240 | 1.1784 | 0.7239 |
| 0.3598 | 37.0 | 21830 | 0.9757 | 0.7336 |
| 0.3421 | 38.0 | 22420 | 1.3951 | 0.7245 |
| 0.3309 | 39.0 | 23010 | 1.1202 | 0.7401 |
| 0.3209 | 40.0 | 23600 | 0.9882 | 0.7358 |
| 0.3214 | 41.0 | 24190 | 0.9997 | 0.7343 |
| 0.3101 | 42.0 | 24780 | 0.8871 | 0.7376 |
| 0.2913 | 43.0 | 25370 | 1.0116 | 0.7401 |
| 0.2884 | 44.0 | 25960 | 1.1248 | 0.7291 |
| 0.2761 | 45.0 | 26550 | 0.8363 | 0.7291 |
| 0.2761 | 46.0 | 27140 | 1.0666 | 0.7202 |
| 0.2674 | 47.0 | 27730 | 1.0285 | 0.7416 |
| 0.2647 | 48.0 | 28320 | 0.9575 | 0.7300 |
| 0.2662 | 49.0 | 28910 | 0.9258 | 0.7373 |
| 0.2726 | 50.0 | 29500 | 1.0936 | 0.7346 |
| 0.2461 | 51.0 | 30090 | 1.0192 | 0.7196 |
| 0.2485 | 52.0 | 30680 | 1.0543 | 0.7382 |
| 0.245 | 53.0 | 31270 | 0.9507 | 0.7336 |
| 0.2377 | 54.0 | 31860 | 0.8907 | 0.7361 |
| 0.2379 | 55.0 | 32450 | 0.9788 | 0.7327 |
| 0.2335 | 56.0 | 33040 | 1.0168 | 0.7413 |
| 0.2251 | 57.0 | 33630 | 1.0117 | 0.7346 |
| 0.2293 | 58.0 | 34220 | 0.9280 | 0.7336 |
| 0.2211 | 59.0 | 34810 | 0.9735 | 0.7401 |
| 0.2236 | 60.0 | 35400 | 0.9822 | 0.7404 |
| 0.2123 | 61.0 | 35990 | 1.0189 | 0.7346 |
| 0.207 | 62.0 | 36580 | 1.0436 | 0.7401 |
| 0.2059 | 63.0 | 37170 | 0.9571 | 0.7410 |
| 0.2052 | 64.0 | 37760 | 1.0027 | 0.7419 |
| 0.193 | 65.0 | 38350 | 0.9395 | 0.7413 |
| 0.2099 | 66.0 | 38940 | 1.0325 | 0.7358 |
| 0.1968 | 67.0 | 39530 | 1.0441 | 0.7398 |
| 0.1887 | 68.0 | 40120 | 1.1337 | 0.7413 |
| 0.1911 | 69.0 | 40710 | 1.0438 | 0.7382 |
| 0.1955 | 70.0 | 41300 | 1.0361 | 0.7394 |
| 0.1998 | 71.0 | 41890 | 1.0202 | 0.7349 |
| 0.1944 | 72.0 | 42480 | 1.0261 | 0.7407 |
| 0.1755 | 73.0 | 43070 | 1.0091 | 0.7422 |
| 0.1836 | 74.0 | 43660 | 0.9986 | 0.7425 |
| 0.1856 | 75.0 | 44250 | 0.9461 | 0.7404 |
| 0.187 | 76.0 | 44840 | 0.9383 | 0.7385 |
| 0.1873 | 77.0 | 45430 | 1.0445 | 0.7416 |
| 0.1763 | 78.0 | 46020 | 1.0263 | 0.7410 |
| 0.1749 | 79.0 | 46610 | 0.9650 | 0.7370 |
| 0.1728 | 80.0 | 47200 | 0.9903 | 0.7343 |
| 0.1668 | 81.0 | 47790 | 1.0391 | 0.7382 |
| 0.1693 | 82.0 | 48380 | 0.9794 | 0.7346 |
| 0.1665 | 83.0 | 48970 | 1.0463 | 0.7355 |
| 0.1609 | 84.0 | 49560 | 0.9976 | 0.7373 |
| 0.165 | 85.0 | 50150 | 1.0040 | 0.7404 |
| 0.1622 | 86.0 | 50740 | 1.0184 | 0.7419 |
| 0.1615 | 87.0 | 51330 | 0.9825 | 0.7336 |
| 0.1624 | 88.0 | 51920 | 0.9889 | 0.7394 |
| 0.1557 | 89.0 | 52510 | 0.9938 | 0.7370 |
| 0.1515 | 90.0 | 53100 | 1.0207 | 0.7385 |
| 0.1565 | 91.0 | 53690 | 1.0081 | 0.7401 |
| 0.1582 | 92.0 | 54280 | 0.9308 | 0.7364 |
| 0.1513 | 93.0 | 54870 | 0.9795 | 0.7398 |
| 0.1572 | 94.0 | 55460 | 0.9688 | 0.7382 |
| 0.1514 | 95.0 | 56050 | 1.0002 | 0.7410 |
| 0.1546 | 96.0 | 56640 | 0.9869 | 0.7401 |
| 0.1534 | 97.0 | 57230 | 0.9694 | 0.7370 |
| 0.1405 | 98.0 | 57820 | 0.9705 | 0.7404 |
| 0.149 | 99.0 | 58410 | 0.9859 | 0.7413 |
| 0.1456 | 100.0 | 59000 | 0.9853 | 0.7416 |
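
Validation accuracy peaks at 0.7425 around epoch 74 and drifts near 0.74 thereafter, while the final epoch ends at 0.7416. If retraining, one could retain the best checkpoint rather than the last one. A sketch using standard transformers options, not something the original run did:

```python
from transformers import TrainingArguments, EarlyStoppingCallback

# Sketch, not part of the original run: keep the checkpoint with the best
# eval accuracy and stop once it fails to improve for 10 evaluations.
args = TrainingArguments(
    output_dir="1_6e-3_10_0.1",        # placeholder
    evaluation_strategy="epoch",
    save_strategy="epoch",             # must match evaluation_strategy
    load_best_model_at_end=True,
    metric_for_best_model="accuracy",  # compute_metrics must report "accuracy"
    greater_is_better=True,
)
early_stopping = EarlyStoppingCallback(early_stopping_patience=10)
# Pass `args` and `callbacks=[early_stopping]` to a transformers.Trainer.
```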

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3