1_9e-3_10_0.9

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9482
  • Accuracy: 0.7609
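
The checkpoint can be loaded like any Transformers sequence-classification model. Below is a minimal inference sketch; the card does not state which SuperGLUE subtask was used, so the question/passage input format follows BoolQ as an assumption, and the 0/1 label mapping is likewise unverified.

```python
# Minimal loading/inference sketch. Assumptions: a BoolQ-style input pair
# (question + passage, binary label); the label mapping is not documented
# on this card.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/1_9e-3_10_0.9"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

question = "is the sky blue on a clear day"
passage = "On a clear day, scattered sunlight makes the sky appear blue."
inputs = tokenizer(question, passage, truncation=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # 0 or 1; meaning depends on the training label order
```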

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
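
The card names super_glue but not the subtask. The step counts in the results table below (590 optimizer steps per epoch at train batch size 16, i.e. roughly 9,440 examples) are consistent with BoolQ's 9,427-example training split, so the sketch below loads that configuration as an assumption.

```python
# Hypothetical data-loading sketch: "boolq" is inferred from the per-epoch
# step count, not confirmed by the card.
from datasets import load_dataset

dataset = load_dataset("super_glue", "boolq")
print(dataset)               # train/validation/test splits
print(dataset["train"][0])   # fields: question, passage, idx, label
```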

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 0.009
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
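
These values map directly onto Transformers TrainingArguments. The sketch below is a hedged reproduction: only the listed values come from the card, while output_dir and the evaluation/logging strategies are illustrative assumptions.

```python
# Reproduction sketch of the listed hyperparameters (Transformers 4.30 API).
# output_dir, evaluation_strategy, and logging_strategy are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_9e-3_10_0.9",          # assumed; not stated on the card
    learning_rate=9e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,                      # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",         # matches the per-epoch results table
    logging_strategy="epoch",
)
```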

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 5.6057 | 1.0 | 590 | 3.5256 | 0.6080 |
| 4.0044 | 2.0 | 1180 | 5.6686 | 0.6217 |
| 3.6272 | 3.0 | 1770 | 5.5431 | 0.4936 |
| 3.6446 | 4.0 | 2360 | 2.6348 | 0.6174 |
| 3.6031 | 5.0 | 2950 | 2.9075 | 0.6569 |
| 3.7432 | 6.0 | 3540 | 3.3871 | 0.6269 |
| 3.6065 | 7.0 | 4130 | 3.3771 | 0.5746 |
| 3.58 | 8.0 | 4720 | 2.8957 | 0.6584 |
| 3.3121 | 9.0 | 5310 | 8.1931 | 0.6226 |
| 3.01 | 10.0 | 5900 | 2.7215 | 0.6749 |
| 2.6313 | 11.0 | 6490 | 2.0220 | 0.6716 |
| 2.2761 | 12.0 | 7080 | 1.9046 | 0.7021 |
| 2.3338 | 13.0 | 7670 | 1.7751 | 0.7049 |
| 2.0509 | 14.0 | 8260 | 1.7843 | 0.7012 |
| 2.0935 | 15.0 | 8850 | 1.8090 | 0.7220 |
| 1.973 | 16.0 | 9440 | 1.6557 | 0.7306 |
| 1.7967 | 17.0 | 10030 | 1.5804 | 0.7125 |
| 1.6871 | 18.0 | 10620 | 1.4815 | 0.7092 |
| 1.6925 | 19.0 | 11210 | 1.3803 | 0.7361 |
| 1.6661 | 20.0 | 11800 | 1.3541 | 0.7330 |
| 1.6682 | 21.0 | 12390 | 1.5167 | 0.7373 |
| 1.5716 | 22.0 | 12980 | 1.5059 | 0.7257 |
| 1.4106 | 23.0 | 13570 | 1.3936 | 0.7459 |
| 1.355 | 24.0 | 14160 | 1.2851 | 0.7456 |
| 1.3702 | 25.0 | 14750 | 2.6721 | 0.7196 |
| 1.3121 | 26.0 | 15340 | 1.2452 | 0.7434 |
| 1.344 | 27.0 | 15930 | 1.2171 | 0.7477 |
| 1.2869 | 28.0 | 16520 | 1.1749 | 0.7495 |
| 1.2021 | 29.0 | 17110 | 1.1665 | 0.7505 |
| 1.2202 | 30.0 | 17700 | 1.2806 | 0.7593 |
| 1.1704 | 31.0 | 18290 | 1.2779 | 0.7581 |
| 1.134 | 32.0 | 18880 | 1.2130 | 0.7428 |
| 1.1401 | 33.0 | 19470 | 1.2809 | 0.7388 |
| 1.056 | 34.0 | 20060 | 1.1413 | 0.7511 |
| 1.0769 | 35.0 | 20650 | 1.1549 | 0.7459 |
| 1.0196 | 36.0 | 21240 | 1.2625 | 0.7520 |
| 1.0422 | 37.0 | 21830 | 1.3321 | 0.7578 |
| 1.019 | 38.0 | 22420 | 1.2026 | 0.7315 |
| 1.0093 | 39.0 | 23010 | 1.1398 | 0.7480 |
| 1.0076 | 40.0 | 23600 | 1.1515 | 0.7394 |
| 0.943 | 41.0 | 24190 | 1.7801 | 0.7517 |
| 0.9469 | 42.0 | 24780 | 1.0641 | 0.7480 |
| 0.9195 | 43.0 | 25370 | 1.5110 | 0.7569 |
| 0.9116 | 44.0 | 25960 | 1.0370 | 0.7517 |
| 0.9411 | 45.0 | 26550 | 1.0807 | 0.7563 |
| 0.8813 | 46.0 | 27140 | 1.1756 | 0.7633 |
| 0.8775 | 47.0 | 27730 | 1.0464 | 0.7404 |
| 0.8544 | 48.0 | 28320 | 1.0554 | 0.7560 |
| 0.9264 | 49.0 | 28910 | 1.0575 | 0.7456 |
| 0.8571 | 50.0 | 29500 | 1.0595 | 0.7446 |
| 0.8352 | 51.0 | 30090 | 1.1723 | 0.7284 |
| 0.8424 | 52.0 | 30680 | 1.1615 | 0.7642 |
| 0.8264 | 53.0 | 31270 | 1.0357 | 0.7581 |
| 0.7982 | 54.0 | 31860 | 1.0384 | 0.7557 |
| 0.8085 | 55.0 | 32450 | 1.0679 | 0.7336 |
| 0.806 | 56.0 | 33040 | 1.0427 | 0.7645 |
| 0.7658 | 57.0 | 33630 | 1.1823 | 0.7618 |
| 0.7831 | 58.0 | 34220 | 1.1012 | 0.7719 |
| 0.7872 | 59.0 | 34810 | 1.0155 | 0.7636 |
| 0.7976 | 60.0 | 35400 | 0.9905 | 0.7560 |
| 0.7525 | 61.0 | 35990 | 1.1839 | 0.7654 |
| 0.8019 | 62.0 | 36580 | 1.0586 | 0.7697 |
| 0.7394 | 63.0 | 37170 | 1.0845 | 0.7330 |
| 0.7526 | 64.0 | 37760 | 0.9414 | 0.7593 |
| 0.7559 | 65.0 | 38350 | 0.9775 | 0.7483 |
| 0.7389 | 66.0 | 38940 | 0.9854 | 0.7667 |
| 0.7311 | 67.0 | 39530 | 0.9922 | 0.7471 |
| 0.7602 | 68.0 | 40120 | 0.9516 | 0.7523 |
| 0.7174 | 69.0 | 40710 | 0.9789 | 0.7664 |
| 0.7348 | 70.0 | 41300 | 1.1615 | 0.7645 |
| 0.7348 | 71.0 | 41890 | 0.9820 | 0.7535 |
| 0.7364 | 72.0 | 42480 | 0.9867 | 0.7639 |
| 0.6935 | 73.0 | 43070 | 0.9748 | 0.7627 |
| 0.7002 | 74.0 | 43660 | 1.1418 | 0.7691 |
| 0.6987 | 75.0 | 44250 | 1.0109 | 0.7627 |
| 0.6975 | 76.0 | 44840 | 0.9500 | 0.7550 |
| 0.7008 | 77.0 | 45430 | 0.9741 | 0.7621 |
| 0.6976 | 78.0 | 46020 | 1.0055 | 0.7462 |
| 0.7063 | 79.0 | 46610 | 1.0064 | 0.7654 |
| 0.6804 | 80.0 | 47200 | 0.9986 | 0.7661 |
| 0.6574 | 81.0 | 47790 | 0.9993 | 0.7676 |
| 0.6676 | 82.0 | 48380 | 0.9976 | 0.7664 |
| 0.6701 | 83.0 | 48970 | 1.0462 | 0.7709 |
| 0.6589 | 84.0 | 49560 | 1.0222 | 0.7679 |
| 0.6575 | 85.0 | 50150 | 0.9405 | 0.7532 |
| 0.6629 | 86.0 | 50740 | 0.9348 | 0.7529 |
| 0.6634 | 87.0 | 51330 | 1.0199 | 0.7709 |
| 0.6518 | 88.0 | 51920 | 0.9407 | 0.7599 |
| 0.6504 | 89.0 | 52510 | 0.9923 | 0.7661 |
| 0.6294 | 90.0 | 53100 | 0.9581 | 0.7618 |
| 0.6369 | 91.0 | 53690 | 0.9749 | 0.7609 |
| 0.6504 | 92.0 | 54280 | 0.9409 | 0.7550 |
| 0.628 | 93.0 | 54870 | 1.0022 | 0.7667 |
| 0.656 | 94.0 | 55460 | 0.9583 | 0.7606 |
| 0.6251 | 95.0 | 56050 | 0.9518 | 0.7609 |
| 0.6246 | 96.0 | 56640 | 0.9558 | 0.7612 |
| 0.6202 | 97.0 | 57230 | 0.9521 | 0.7602 |
| 0.6179 | 98.0 | 57820 | 0.9481 | 0.7596 |
| 0.6146 | 99.0 | 58410 | 0.9445 | 0.7618 |
| 0.6287 | 100.0 | 59000 | 0.9482 | 0.7609 |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
