1_5e-3_10_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set (a loading sketch follows the list):

  • Loss: 0.9119
  • Accuracy: 0.7446
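For a quick check of the checkpoint, a minimal loading-and-inference sketch is below. It assumes the model is published as Onutoa/1_5e-3_10_0.5 and carries a sequence-classification head; the card does not name the SuperGLUE subtask or label mapping, so the input pair is purely illustrative.

```python
# Minimal inference sketch. Assumptions (not stated in this card):
# the checkpoint has a sequence-classification head, and the SuperGLUE
# subtask/labels are unknown, so the inputs below are illustrative.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/1_5e-3_10_0.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Encode a question/passage pair and take the argmax over the logits.
inputs = tokenizer(
    "Is the sky blue?",
    "The sky appears blue in daylight because of Rayleigh scattering.",
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```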

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
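A hedged sketch of how these values map onto transformers.TrainingArguments is below. Only the listed hyperparameters come from this card; the SuperGLUE subtask ("boolq" here), the question/passage preprocessing, and the per-epoch evaluation cadence are assumptions for illustration. The Adam betas and epsilon above match the Trainer defaults, so they need no explicit argument.

```python
# Sketch only: wires the hyperparameters listed above into Trainer.
# Assumptions (not stated in this card): SuperGLUE subtask "boolq"
# and the question/passage tokenization below.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-large-uncased")

raw = load_dataset("super_glue", "boolq")  # subtask assumed
encoded = raw.map(
    lambda ex: tokenizer(ex["question"], ex["passage"], truncation=True),
    batched=True,
)

args = TrainingArguments(
    output_dir="1_5e-3_10_0.5",
    learning_rate=5e-3,               # 0.005 above
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",      # the results table reports per-epoch eval
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
)
trainer.train()
```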

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 2.6814 | 1.0 | 590 | 2.2524 | 0.6128 |
| 2.6474 | 2.0 | 1180 | 2.2889 | 0.6217 |
| 2.7373 | 3.0 | 1770 | 3.8911 | 0.4401 |
| 2.7048 | 4.0 | 2360 | 2.6859 | 0.6214 |
| 2.3193 | 5.0 | 2950 | 3.0408 | 0.6217 |
| 2.0191 | 6.0 | 3540 | 2.0926 | 0.5706 |
| 1.9595 | 7.0 | 4130 | 1.7082 | 0.6908 |
| 1.833 | 8.0 | 4720 | 1.7816 | 0.6092 |
| 1.7395 | 9.0 | 5310 | 1.6251 | 0.6281 |
| 1.7038 | 10.0 | 5900 | 2.6889 | 0.6554 |
| 1.7975 | 11.0 | 6490 | 1.5326 | 0.6994 |
| 1.5534 | 12.0 | 7080 | 2.6513 | 0.5554 |
| 1.5833 | 13.0 | 7670 | 1.5617 | 0.6410 |
| 1.4585 | 14.0 | 8260 | 1.8289 | 0.6171 |
| 1.4375 | 15.0 | 8850 | 1.6306 | 0.6517 |
| 1.3418 | 16.0 | 9440 | 1.2628 | 0.7153 |
| 1.2576 | 17.0 | 10030 | 1.4116 | 0.7098 |
| 1.2068 | 18.0 | 10620 | 1.1643 | 0.7089 |
| 1.1781 | 19.0 | 11210 | 1.4702 | 0.7083 |
| 1.1497 | 20.0 | 11800 | 1.1550 | 0.6988 |
| 1.0552 | 21.0 | 12390 | 1.0861 | 0.7284 |
| 1.047 | 22.0 | 12980 | 1.0821 | 0.7205 |
| 1.0036 | 23.0 | 13570 | 1.1193 | 0.7193 |
| 0.9589 | 24.0 | 14160 | 1.3591 | 0.7135 |
| 0.9604 | 25.0 | 14750 | 1.0030 | 0.7229 |
| 0.9283 | 26.0 | 15340 | 1.1469 | 0.7031 |
| 0.9242 | 27.0 | 15930 | 1.0466 | 0.7318 |
| 0.8703 | 28.0 | 16520 | 1.0736 | 0.7343 |
| 0.858 | 29.0 | 17110 | 1.0357 | 0.7183 |
| 0.8267 | 30.0 | 17700 | 0.9936 | 0.7339 |
| 0.8148 | 31.0 | 18290 | 0.9989 | 0.7321 |
| 0.7981 | 32.0 | 18880 | 1.0559 | 0.7404 |
| 0.7956 | 33.0 | 19470 | 1.0207 | 0.7217 |
| 0.7817 | 34.0 | 20060 | 0.9636 | 0.7361 |
| 0.7545 | 35.0 | 20650 | 0.9415 | 0.7324 |
| 0.7372 | 36.0 | 21240 | 1.0793 | 0.7413 |
| 0.7317 | 37.0 | 21830 | 1.2911 | 0.7315 |
| 0.7411 | 38.0 | 22420 | 0.9517 | 0.7364 |
| 0.7093 | 39.0 | 23010 | 1.0133 | 0.7382 |
| 0.6838 | 40.0 | 23600 | 1.1835 | 0.7401 |
| 0.6773 | 41.0 | 24190 | 0.9180 | 0.7379 |
| 0.6776 | 42.0 | 24780 | 0.9410 | 0.7367 |
| 0.6486 | 43.0 | 25370 | 0.9836 | 0.7419 |
| 0.6527 | 44.0 | 25960 | 0.9721 | 0.7309 |
| 0.6465 | 45.0 | 26550 | 0.9508 | 0.7388 |
| 0.6245 | 46.0 | 27140 | 0.9273 | 0.7434 |
| 0.6258 | 47.0 | 27730 | 0.9763 | 0.7330 |
| 0.6086 | 48.0 | 28320 | 0.9135 | 0.7388 |
| 0.6417 | 49.0 | 28910 | 1.0037 | 0.7446 |
| 0.6064 | 50.0 | 29500 | 0.9751 | 0.7398 |
| 0.5938 | 51.0 | 30090 | 0.9801 | 0.7453 |
| 0.5951 | 52.0 | 30680 | 0.9515 | 0.7370 |
| 0.5718 | 53.0 | 31270 | 0.9160 | 0.7419 |
| 0.5751 | 54.0 | 31860 | 0.9263 | 0.7462 |
| 0.5839 | 55.0 | 32450 | 0.9170 | 0.7376 |
| 0.5707 | 56.0 | 33040 | 0.9787 | 0.7431 |
| 0.564 | 57.0 | 33630 | 0.9822 | 0.7431 |
| 0.5539 | 58.0 | 34220 | 0.9335 | 0.7407 |
| 0.5567 | 59.0 | 34810 | 1.0004 | 0.7370 |
| 0.5555 | 60.0 | 35400 | 0.9554 | 0.7446 |
| 0.5344 | 61.0 | 35990 | 0.9199 | 0.7483 |
| 0.5494 | 62.0 | 36580 | 0.9970 | 0.7456 |
| 0.5226 | 63.0 | 37170 | 0.9454 | 0.7434 |
| 0.5275 | 64.0 | 37760 | 0.9771 | 0.7361 |
| 0.5186 | 65.0 | 38350 | 1.0032 | 0.7517 |
| 0.52 | 66.0 | 38940 | 0.9263 | 0.7440 |
| 0.5209 | 67.0 | 39530 | 1.0130 | 0.7443 |
| 0.528 | 68.0 | 40120 | 0.9466 | 0.7422 |
| 0.5146 | 69.0 | 40710 | 0.9790 | 0.7456 |
| 0.5026 | 70.0 | 41300 | 0.9880 | 0.7489 |
| 0.5204 | 71.0 | 41890 | 0.9132 | 0.7373 |
| 0.5049 | 72.0 | 42480 | 0.9589 | 0.7480 |
| 0.4969 | 73.0 | 43070 | 0.9564 | 0.7446 |
| 0.4911 | 74.0 | 43660 | 0.9255 | 0.7336 |
| 0.4961 | 75.0 | 44250 | 0.9983 | 0.7502 |
| 0.4986 | 76.0 | 44840 | 0.9003 | 0.7376 |
| 0.4979 | 77.0 | 45430 | 0.8937 | 0.7385 |
| 0.4941 | 78.0 | 46020 | 0.9082 | 0.7422 |
| 0.487 | 79.0 | 46610 | 0.9231 | 0.7471 |
| 0.4773 | 80.0 | 47200 | 0.9673 | 0.7437 |
| 0.4665 | 81.0 | 47790 | 0.9598 | 0.7462 |
| 0.4824 | 82.0 | 48380 | 0.9110 | 0.7410 |
| 0.4795 | 83.0 | 48970 | 0.9222 | 0.7425 |
| 0.4654 | 84.0 | 49560 | 0.9369 | 0.7459 |
| 0.4605 | 85.0 | 50150 | 0.9379 | 0.7502 |
| 0.477 | 86.0 | 50740 | 0.8911 | 0.7437 |
| 0.4644 | 87.0 | 51330 | 0.9287 | 0.7434 |
| 0.4539 | 88.0 | 51920 | 0.9421 | 0.7422 |
| 0.4582 | 89.0 | 52510 | 0.9248 | 0.7437 |
| 0.4488 | 90.0 | 53100 | 0.9152 | 0.7425 |
| 0.4554 | 91.0 | 53690 | 0.9511 | 0.7471 |
| 0.4547 | 92.0 | 54280 | 0.9064 | 0.7419 |
| 0.4534 | 93.0 | 54870 | 0.9404 | 0.7471 |
| 0.463 | 94.0 | 55460 | 0.9346 | 0.7453 |
| 0.4482 | 95.0 | 56050 | 0.9191 | 0.7437 |
| 0.4518 | 96.0 | 56640 | 0.9154 | 0.7431 |
| 0.4326 | 97.0 | 57230 | 0.9055 | 0.7440 |
| 0.4291 | 98.0 | 57820 | 0.9072 | 0.7437 |
| 0.4278 | 99.0 | 58410 | 0.9101 | 0.7437 |
| 0.4397 | 100.0 | 59000 | 0.9119 | 0.7446 |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
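
If you want to match this environment before reproducing the run, a simple version check against the list above may help (a sketch; it only prints what is installed, and pinning or CUDA builds are left to your package manager):

```python
# Prints installed library versions; compare against the list above.
import datasets
import tokenizers
import torch
import transformers

print("Transformers:", transformers.__version__)  # expect 4.30.0
print("PyTorch:", torch.__version__)              # expect 2.0.1+cu117
print("Datasets:", datasets.__version__)          # expect 2.14.4
print("Tokenizers:", tokenizers.__version__)      # expect 0.13.3
```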
