1_7e-3_10_0.9

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. After the final training epoch (epoch 100), it achieves the following results on the evaluation set:

  • Loss: 1.0234
  • Accuracy: 0.7489

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.007
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
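
As a rough illustration of what a linear scheduler implies for these settings, the sketch below computes the learning rate at a given step. It assumes decay from the base rate to zero with no warmup (the card lists no warmup setting) and takes the total of 59000 steps from the final row of the training results table. The names `linear_lr`, `BASE_LR`, `STEPS_PER_EPOCH`, `NUM_EPOCHS`, and `TOTAL_STEPS` are illustrative and do not come from the training code.

```python
# Values taken from the card. The schedule shape (linear decay to zero,
# no warmup) is an assumption, since the card lists no warmup setting.
BASE_LR = 0.007          # learning_rate
STEPS_PER_EPOCH = 590    # step count reported after epoch 1
NUM_EPOCHS = 100         # num_epochs
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 59000, the final step in the table


def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate after `step` optimizer updates under linear decay to zero."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


if __name__ == "__main__":
    # Print the assumed learning rate at a few points during training.
    for epoch in (0, 25, 50, 75, 100):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:>3}  step {step:>5}  lr {linear_lr(step):.5f}")
```

Under these assumptions the learning rate is halved by epoch 50 and reaches zero at step 59000.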

Training results

Training Loss  Epoch  Step   Validation Loss  Accuracy
4.672          1.0    590    3.4123           0.6177
4.4311         2.0    1180   5.0929           0.6214
4.0038         3.0    1770   3.7952           0.5098
3.8065         4.0    2360   3.1004           0.6315
3.2263         5.0    2950   2.3845           0.6300
3.0946         6.0    3540   2.3928           0.6086
2.9117         7.0    4130   2.0231           0.6813
2.6775         8.0    4720   2.6161           0.6804
2.4062         9.0    5310   1.7137           0.6936
2.3951         10.0   5900   1.8127           0.6810
2.3536         11.0   6490   2.0987           0.6606
2.0221         12.0   7080   2.0433           0.7015
2.1095         13.0   7670   1.7162           0.7131
2.1267         14.0   8260   1.7356           0.6914
2.0222         15.0   8850   1.4617           0.7156
1.9885         16.0   9440   1.7415           0.6862
1.9673         17.0   10030  1.6016           0.7015
1.8456         18.0   10620  1.4015           0.7187
1.7587         19.0   11210  1.7362           0.7251
1.7592         20.0   11800  1.2864           0.7199
1.6778         21.0   12390  1.4120           0.7235
1.616          22.0   12980  1.6139           0.7006
1.5334         23.0   13570  1.2988           0.7232
1.5203         24.0   14160  1.2945           0.7327
1.3748         25.0   14750  1.2674           0.7330
1.3845         26.0   15340  1.8066           0.6801
1.3503         27.0   15930  1.2514           0.7382
1.2956         28.0   16520  1.6651           0.7318
1.2932         29.0   17110  1.2601           0.7226
1.2334         30.0   17700  1.5139           0.7138
1.2565         31.0   18290  2.0072           0.7147
1.2092         32.0   18880  1.1748           0.7306
1.186          33.0   19470  1.1183           0.7333
1.1183         34.0   20060  1.1140           0.7428
1.0961         35.0   20650  1.1484           0.7339
1.0555         36.0   21240  1.4733           0.7453
1.0585         37.0   21830  1.2398           0.7474
1.0807         38.0   22420  1.1234           0.7471
1.0645         39.0   23010  1.1409           0.7456
0.9934         40.0   23600  1.3541           0.7443
0.9803         41.0   24190  1.7554           0.7370
0.9898         42.0   24780  1.1386           0.7453
0.9501         43.0   25370  1.3676           0.7489
0.9705         44.0   25960  1.0755           0.7456
0.9334         45.0   26550  1.2745           0.7502
0.9007         46.0   27140  1.0855           0.7477
0.9046         47.0   27730  1.1565           0.7498
0.88           48.0   28320  1.0612           0.7437
0.926          49.0   28910  1.1183           0.7425
0.8811         50.0   29500  1.1039           0.7318
0.8709         51.0   30090  1.1250           0.7538
0.8556         52.0   30680  1.0592           0.7394
0.8512         53.0   31270  1.0862           0.7474
0.8362         54.0   31860  1.2152           0.7468
0.8768         55.0   32450  1.0738           0.7459
0.8381         56.0   33040  1.1042           0.7547
0.7802         57.0   33630  1.1756           0.7492
0.8208         58.0   34220  1.0352           0.7508
0.8051         59.0   34810  1.0881           0.7385
0.8036         60.0   35400  1.0601           0.7489
0.7717         61.0   35990  1.2949           0.7492
0.7779         62.0   36580  1.0747           0.7498
0.7819         63.0   37170  1.0378           0.7474
0.7523         64.0   37760  1.0439           0.7419
0.7646         65.0   38350  1.0399           0.7544
0.7356         66.0   38940  1.0281           0.7477
0.7655         67.0   39530  1.0816           0.7495
0.7513         68.0   40120  1.0422           0.7471
0.7481         69.0   40710  1.1219           0.7547
0.7421         70.0   41300  1.0517           0.7547
0.7581         71.0   41890  1.0355           0.7437
0.7517         72.0   42480  1.0571           0.7532
0.7226         73.0   43070  1.0588           0.7505
0.7241         74.0   43660  1.0405           0.7450
0.7439         75.0   44250  1.1031           0.7489
0.7295         76.0   44840  1.0134           0.7471
0.7208         77.0   45430  1.0672           0.7532
0.7239         78.0   46020  1.0190           0.7495
0.7338         79.0   46610  1.0243           0.7508
0.6933         80.0   47200  1.0926           0.7557
0.679          81.0   47790  1.1154           0.7529
0.6926         82.0   48380  1.0526           0.7502
0.6828         83.0   48970  1.0804           0.7529
0.671          84.0   49560  1.1215           0.7584
0.679          85.0   50150  1.0317           0.7498
0.6874         86.0   50740  1.0031           0.7456
0.6873         87.0   51330  1.1015           0.7532
0.6686         88.0   51920  1.0376           0.7474
0.6708         89.0   52510  1.1052           0.7544
0.6697         90.0   53100  1.0084           0.7514
0.6581         91.0   53690  1.0611           0.7538
0.6722         92.0   54280  1.0155           0.7446
0.6714         93.0   54870  1.0882           0.7514
0.6674         94.0   55460  1.0447           0.7502
0.6553         95.0   56050  1.0315           0.7486
0.6488         96.0   56640  1.0389           0.7508
0.6409         97.0   57230  1.0196           0.7486
0.6225         98.0   57820  1.0354           0.7492
0.6316         99.0   58410  1.0182           0.7495
0.6342         100.0  59000  1.0234           0.7489
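
The headline metrics match the final row of the table, not the best epoch: accuracy peaks at 0.7584 at epoch 84, and validation loss is lowest at 1.0031 at epoch 86. The card does not say whether any checkpoint selection was applied. The sketch below shows one way to pick the best epoch from a log like this one; `LOG_EXCERPT`, `best_by_accuracy`, and `best_by_loss` are illustrative names, and only a few rows from the table are included.

```python
# A few rows copied from the table above: (epoch, validation_loss, accuracy).
# This is an excerpt for illustration, not the full training log.
LOG_EXCERPT = [
    (80, 1.0926, 0.7557),
    (84, 1.1215, 0.7584),
    (86, 1.0031, 0.7456),
    (90, 1.0084, 0.7514),
    (100, 1.0234, 0.7489),
]


def best_by_accuracy(rows):
    """Return the row with the highest evaluation accuracy."""
    return max(rows, key=lambda row: row[2])


def best_by_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


if __name__ == "__main__":
    epoch, loss, acc = best_by_accuracy(LOG_EXCERPT)
    print(f"highest accuracy: epoch {epoch} (loss {loss}, accuracy {acc})")
    epoch, loss, acc = best_by_loss(LOG_EXCERPT)
    print(f"lowest validation loss: epoch {epoch} (loss {loss}, accuracy {acc})")
```

The two criteria select different epochs here, so which checkpoint counts as "best" depends on the metric chosen.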

Framework versions

  • Transformers 4.30.0
  • PyTorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3

Dataset used to train Onutoa/1_7e-3_10_0.9