
1_6e-3_1_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set (final-epoch checkpoint; see the training results table below):

  • Loss: 0.4885
  • Accuracy: 0.7401
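
The checkpoint can be loaded with the transformers Auto classes. A minimal inference sketch, assuming the repository id Onutoa/1_6e-3_1_0.5 and a sequence-classification head; the card does not document the expected input format or label mapping:

```python
# Minimal inference sketch. The question/passage input format is an
# assumption; the card does not document usage.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "Onutoa/1_6e-3_1_0.5"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

inputs = tokenizer(
    "is the sky blue during the day",  # hypothetical question
    "The sky appears blue in daytime because of Rayleigh scattering.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```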

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
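
The card does not state which SuperGLUE task was used. The step counts are consistent with BoolQ: 590 optimizer steps per epoch at train_batch_size 16 implies roughly 9,400 training examples, matching BoolQ's 9,427-example training split, and the accuracy plateau at 0.6217 in the early epochs (see the training results below) matches BoolQ's majority-class validation baseline. Treating BoolQ as an assumption, the data could be loaded as follows:

```python
# Sketch of loading the presumed training data. The "boolq" configuration
# is inferred from the step counts, not documented in this card.
from datasets import load_dataset

dataset = load_dataset("super_glue", "boolq")
print(dataset["train"].num_rows)       # 9427 -> ~590 steps/epoch at batch size 16
print(dataset["validation"].num_rows)  # 3270
```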

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 0.006
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
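
A minimal sketch of how these values map onto transformers' TrainingArguments; the actual training script is not included in the card, and evaluation_strategy="epoch" is an assumption based on the per-epoch metrics reported below:

```python
# Hypothetical reconstruction of the configuration listed above; the
# original training script is not part of this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_6e-3_1_0.5",
    learning_rate=6e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",  # assumed from the per-epoch results table
)
```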

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 0.9248        | 1.0   | 590   | 0.7400          | 0.3786   |
| 0.8836        | 2.0   | 1180  | 0.7971          | 0.3914   |
| 0.8513        | 3.0   | 1770  | 0.6664          | 0.6217   |
| 0.7488        | 4.0   | 2360  | 0.7384          | 0.6217   |
| 0.729         | 5.0   | 2950  | 1.0125          | 0.6217   |
| 0.7097        | 6.0   | 3540  | 0.7106          | 0.5046   |
| 0.6521        | 7.0   | 4130  | 0.5533          | 0.6098   |
| 0.6704        | 8.0   | 4720  | 0.4852          | 0.6587   |
| 0.6271        | 9.0   | 5310  | 0.5153          | 0.6850   |
| 0.6134        | 10.0  | 5900  | 0.4555          | 0.6948   |
| 0.5702        | 11.0  | 6490  | 0.4732          | 0.6716   |
| 0.5428        | 12.0  | 7080  | 0.4548          | 0.6963   |
| 0.5681        | 13.0  | 7670  | 0.4534          | 0.6859   |
| 0.5238        | 14.0  | 8260  | 0.6556          | 0.6725   |
| 0.5103        | 15.0  | 8850  | 0.5050          | 0.7110   |
| 0.5004        | 16.0  | 9440  | 0.4638          | 0.6813   |
| 0.4614        | 17.0  | 10030 | 0.4935          | 0.7113   |
| 0.4702        | 18.0  | 10620 | 0.4570          | 0.7040   |
| 0.4305        | 19.0  | 11210 | 0.4871          | 0.7190   |
| 0.4402        | 20.0  | 11800 | 0.5026          | 0.6722   |
| 0.4035        | 21.0  | 12390 | 0.4476          | 0.7208   |
| 0.3907        | 22.0  | 12980 | 0.6030          | 0.6367   |
| 0.3686        | 23.0  | 13570 | 0.4396          | 0.7131   |
| 0.3765        | 24.0  | 14160 | 0.4589          | 0.7180   |
| 0.3709        | 25.0  | 14750 | 0.4440          | 0.7107   |
| 0.3446        | 26.0  | 15340 | 1.0145          | 0.5728   |
| 0.3433        | 27.0  | 15930 | 0.6213          | 0.6627   |
| 0.331         | 28.0  | 16520 | 0.4566          | 0.7144   |
| 0.3373        | 29.0  | 17110 | 0.5484          | 0.7284   |
| 0.3117        | 30.0  | 17700 | 0.6371          | 0.6648   |
| 0.2988        | 31.0  | 18290 | 0.7013          | 0.7089   |
| 0.2928        | 32.0  | 18880 | 0.4553          | 0.7281   |
| 0.297         | 33.0  | 19470 | 0.5225          | 0.6976   |
| 0.2808        | 34.0  | 20060 | 0.4951          | 0.7343   |
| 0.2735        | 35.0  | 20650 | 0.5188          | 0.7095   |
| 0.2624        | 36.0  | 21240 | 0.4961          | 0.7367   |
| 0.2642        | 37.0  | 21830 | 0.4731          | 0.7254   |
| 0.2548        | 38.0  | 22420 | 0.4635          | 0.7260   |
| 0.2575        | 39.0  | 23010 | 0.4896          | 0.7073   |
| 0.244         | 40.0  | 23600 | 0.5605          | 0.7358   |
| 0.2472        | 41.0  | 24190 | 0.6450          | 0.7266   |
| 0.2433        | 42.0  | 24780 | 0.4922          | 0.7367   |
| 0.2312        | 43.0  | 25370 | 0.5115          | 0.7269   |
| 0.2355        | 44.0  | 25960 | 0.4879          | 0.7388   |
| 0.2204        | 45.0  | 26550 | 0.5023          | 0.7355   |
| 0.2223        | 46.0  | 27140 | 0.4976          | 0.7355   |
| 0.22          | 47.0  | 27730 | 0.5051          | 0.7364   |
| 0.2056        | 48.0  | 28320 | 0.4973          | 0.7205   |
| 0.2166        | 49.0  | 28910 | 0.5008          | 0.7180   |
| 0.2129        | 50.0  | 29500 | 0.5323          | 0.7382   |
| 0.1973        | 51.0  | 30090 | 0.5689          | 0.6908   |
| 0.2025        | 52.0  | 30680 | 0.4855          | 0.7367   |
| 0.1977        | 53.0  | 31270 | 0.5230          | 0.7211   |
| 0.1946        | 54.0  | 31860 | 0.5969          | 0.7333   |
| 0.2063        | 55.0  | 32450 | 0.5340          | 0.7098   |
| 0.1967        | 56.0  | 33040 | 0.5589          | 0.7361   |
| 0.1793        | 57.0  | 33630 | 0.5207          | 0.7358   |
| 0.1872        | 58.0  | 34220 | 0.4926          | 0.7394   |
| 0.1831        | 59.0  | 34810 | 0.5265          | 0.7434   |
| 0.1808        | 60.0  | 35400 | 0.5113          | 0.7407   |
| 0.1892        | 61.0  | 35990 | 0.4972          | 0.7416   |
| 0.1795        | 62.0  | 36580 | 0.5121          | 0.7391   |
| 0.172         | 63.0  | 37170 | 0.4857          | 0.7321   |
| 0.176         | 64.0  | 37760 | 0.5014          | 0.7232   |
| 0.1763        | 65.0  | 38350 | 0.5061          | 0.7370   |
| 0.1753        | 66.0  | 38940 | 0.4840          | 0.7358   |
| 0.1716        | 67.0  | 39530 | 0.5262          | 0.7361   |
| 0.1675        | 68.0  | 40120 | 0.4844          | 0.7324   |
| 0.1647        | 69.0  | 40710 | 0.5357          | 0.7440   |
| 0.1702        | 70.0  | 41300 | 0.4852          | 0.7394   |
| 0.1666        | 71.0  | 41890 | 0.4749          | 0.7391   |
| 0.162         | 72.0  | 42480 | 0.5616          | 0.7385   |
| 0.1546        | 73.0  | 43070 | 0.5089          | 0.7352   |
| 0.1525        | 74.0  | 43660 | 0.5315          | 0.7382   |
| 0.1595        | 75.0  | 44250 | 0.5300          | 0.7419   |
| 0.1555        | 76.0  | 44840 | 0.5664          | 0.7407   |
| 0.1604        | 77.0  | 45430 | 0.5057          | 0.7416   |
| 0.1584        | 78.0  | 46020 | 0.5008          | 0.7355   |
| 0.1574        | 79.0  | 46610 | 0.5206          | 0.7398   |
| 0.1552        | 80.0  | 47200 | 0.5176          | 0.7361   |
| 0.1501        | 81.0  | 47790 | 0.4955          | 0.7376   |
| 0.1492        | 82.0  | 48380 | 0.5001          | 0.7391   |
| 0.1508        | 83.0  | 48970 | 0.4963          | 0.7379   |
| 0.1463        | 84.0  | 49560 | 0.5148          | 0.7413   |
| 0.1449        | 85.0  | 50150 | 0.4868          | 0.7349   |
| 0.1489        | 86.0  | 50740 | 0.5012          | 0.7419   |
| 0.1415        | 87.0  | 51330 | 0.4963          | 0.7321   |
| 0.145         | 88.0  | 51920 | 0.5046          | 0.7291   |
| 0.1375        | 89.0  | 52510 | 0.5011          | 0.7416   |
| 0.1387        | 90.0  | 53100 | 0.5041          | 0.7440   |
| 0.1428        | 91.0  | 53690 | 0.4940          | 0.7425   |
| 0.1442        | 92.0  | 54280 | 0.4912          | 0.7401   |
| 0.139         | 93.0  | 54870 | 0.5014          | 0.7428   |
| 0.1406        | 94.0  | 55460 | 0.4919          | 0.7391   |
| 0.1387        | 95.0  | 56050 | 0.5063          | 0.7446   |
| 0.1368        | 96.0  | 56640 | 0.4902          | 0.7410   |
| 0.1391        | 97.0  | 57230 | 0.4947          | 0.7407   |
| 0.136         | 98.0  | 57820 | 0.4922          | 0.7413   |
| 0.133         | 99.0  | 58410 | 0.4926          | 0.7394   |
| 0.1379        | 100.0 | 59000 | 0.4885          | 0.7401   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
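
A quick way to check that a local environment matches these versions (a sketch; CUDA builds of PyTorch carry a local suffix such as +cu117 in their version string):

```python
# Environment check against the versions listed above.
import datasets
import tokenizers
import torch
import transformers

assert transformers.__version__ == "4.30.0"
assert torch.__version__.startswith("2.0.1")  # e.g. "2.0.1+cu117"
assert datasets.__version__ == "2.14.4"
assert tokenizers.__version__ == "0.13.3"
```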