1_1e-2_1_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set (a minimal usage sketch follows the list):

  • Loss: 0.9333
  • Accuracy: 0.7315
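
As a minimal usage sketch: the hub id `Onutoa/1_1e-2_1_0.1` is taken from this card, but the specific SuperGLUE task is not stated here, so the input format below (single sentence vs. sentence pair) and the label meaning are assumptions.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hub id taken from this card; the classification head is restored from the
# saved config, so num_labels does not need to be specified here.
model_id = "Onutoa/1_1e-2_1_0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Example input; for a sentence-pair SuperGLUE task, pass both segments,
# e.g. tokenizer(premise, hypothesis, return_tensors="pt").
inputs = tokenizer("The cat sat on the mat.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```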

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch after the list):

  • learning_rate: 0.01
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
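
A sketch of how these settings map onto `transformers.TrainingArguments`: all values come from the list above, while `evaluation_strategy="epoch"` is inferred from the per-epoch validation rows in the results table below, and the `output_dir` name is an assumption.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_1e-2_1_0.1",       # assumed; not stated in the card
    learning_rate=1e-2,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,                  # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",     # inferred from the per-epoch results
)
```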

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 1.34          | 1.0   | 590   | 0.8462          | 0.5199   |
| 1.1867        | 2.0   | 1180  | 0.6498          | 0.6220   |
| 0.9301        | 3.0   | 1770  | 1.2304          | 0.3780   |
| 0.9674        | 4.0   | 2360  | 1.3949          | 0.6217   |
| 1.0253        | 5.0   | 2950  | 0.6352          | 0.6502   |
| 0.8515        | 6.0   | 3540  | 1.6753          | 0.6217   |
| 0.7695        | 7.0   | 4130  | 1.0653          | 0.5021   |
| 0.737         | 8.0   | 4720  | 0.6902          | 0.6190   |
| 0.7016        | 9.0   | 5310  | 0.5830          | 0.7000   |
| 0.6402        | 10.0  | 5900  | 0.5490          | 0.7037   |
| 0.6369        | 11.0  | 6490  | 0.8935          | 0.6615   |
| 0.581         | 12.0  | 7080  | 0.5859          | 0.7089   |
| 0.5689        | 13.0  | 7670  | 0.5938          | 0.7116   |
| 0.516         | 14.0  | 8260  | 0.5614          | 0.7168   |
| 0.4991        | 15.0  | 8850  | 0.7467          | 0.6609   |
| 0.4822        | 16.0  | 9440  | 0.5836          | 0.7214   |
| 0.4744        | 17.0  | 10030 | 0.7603          | 0.6905   |
| 0.4437        | 18.0  | 10620 | 0.8842          | 0.6459   |
| 0.401         | 19.0  | 11210 | 0.6236          | 0.7257   |
| 0.3914        | 20.0  | 11800 | 0.8274          | 0.7205   |
| 0.371         | 21.0  | 12390 | 1.2395          | 0.6945   |
| 0.3668        | 22.0  | 12980 | 0.7150          | 0.7122   |
| 0.3137        | 23.0  | 13570 | 0.7551          | 0.7150   |
| 0.2999        | 24.0  | 14160 | 0.7089          | 0.7067   |
| 0.3049        | 25.0  | 14750 | 0.7955          | 0.7275   |
| 0.3005        | 26.0  | 15340 | 0.7884          | 0.7187   |
| 0.2951        | 27.0  | 15930 | 0.8277          | 0.7070   |
| 0.2577        | 28.0  | 16520 | 0.7660          | 0.7281   |
| 0.252         | 29.0  | 17110 | 0.7648          | 0.7269   |
| 0.2531        | 30.0  | 17700 | 0.8062          | 0.7251   |
| 0.2241        | 31.0  | 18290 | 0.9123          | 0.7177   |
| 0.2428        | 32.0  | 18880 | 1.4634          | 0.7110   |
| 0.2425        | 33.0  | 19470 | 0.8689          | 0.7211   |
| 0.2068        | 34.0  | 20060 | 0.8337          | 0.7119   |
| 0.2063        | 35.0  | 20650 | 0.9671          | 0.7245   |
| 0.2091        | 36.0  | 21240 | 0.8245          | 0.7245   |
| 0.2006        | 37.0  | 21830 | 0.9072          | 0.7291   |
| 0.1872        | 38.0  | 22420 | 0.8780          | 0.7202   |
| 0.1887        | 39.0  | 23010 | 0.9743          | 0.7147   |
| 0.1929        | 40.0  | 23600 | 1.1905          | 0.7275   |
| 0.1801        | 41.0  | 24190 | 0.9523          | 0.7281   |
| 0.1644        | 42.0  | 24780 | 0.9279          | 0.7162   |
| 0.1711        | 43.0  | 25370 | 0.9404          | 0.7245   |
| 0.1566        | 44.0  | 25960 | 0.9386          | 0.7284   |
| 0.1598        | 45.0  | 26550 | 0.9960          | 0.7104   |
| 0.1555        | 46.0  | 27140 | 1.0066          | 0.7122   |
| 0.1522        | 47.0  | 27730 | 0.9795          | 0.7052   |
| 0.1542        | 48.0  | 28320 | 0.9479          | 0.7226   |
| 0.1616        | 49.0  | 28910 | 0.9216          | 0.7232   |
| 0.146         | 50.0  | 29500 | 1.0475          | 0.7330   |
| 0.1328        | 51.0  | 30090 | 0.9752          | 0.7098   |
| 0.1334        | 52.0  | 30680 | 1.0264          | 0.7110   |
| 0.142         | 53.0  | 31270 | 0.9470          | 0.7327   |
| 0.1326        | 54.0  | 31860 | 0.9134          | 0.7333   |
| 0.1367        | 55.0  | 32450 | 0.9496          | 0.7217   |
| 0.1392        | 56.0  | 33040 | 0.9867          | 0.7306   |
| 0.118         | 57.0  | 33630 | 1.0509          | 0.7309   |
| 0.1222        | 58.0  | 34220 | 0.9824          | 0.7165   |
| 0.1162        | 59.0  | 34810 | 1.0020          | 0.7327   |
| 0.1275        | 60.0  | 35400 | 1.0136          | 0.7327   |
| 0.1233        | 61.0  | 35990 | 0.9981          | 0.7309   |
| 0.1167        | 62.0  | 36580 | 0.9955          | 0.7119   |
| 0.1113        | 63.0  | 37170 | 0.9447          | 0.7217   |
| 0.113         | 64.0  | 37760 | 1.0350          | 0.7275   |
| 0.1062        | 65.0  | 38350 | 0.9102          | 0.7367   |
| 0.1118        | 66.0  | 38940 | 1.0759          | 0.7070   |
| 0.0979        | 67.0  | 39530 | 0.9346          | 0.7324   |
| 0.1121        | 68.0  | 40120 | 1.0193          | 0.7229   |
| 0.0966        | 69.0  | 40710 | 1.0026          | 0.7263   |
| 0.0998        | 70.0  | 41300 | 1.0442          | 0.7297   |
| 0.0998        | 71.0  | 41890 | 0.9181          | 0.7266   |
| 0.0965        | 72.0  | 42480 | 0.9982          | 0.7144   |
| 0.0952        | 73.0  | 43070 | 0.9347          | 0.7183   |
| 0.0973        | 74.0  | 43660 | 1.0005          | 0.7242   |
| 0.0895        | 75.0  | 44250 | 1.0202          | 0.7376   |
| 0.0856        | 76.0  | 44840 | 0.9652          | 0.7312   |
| 0.0917        | 77.0  | 45430 | 1.0078          | 0.7330   |
| 0.091         | 78.0  | 46020 | 0.9855          | 0.7327   |
| 0.093         | 79.0  | 46610 | 0.9786          | 0.7370   |
| 0.0849        | 80.0  | 47200 | 0.9529          | 0.7407   |
| 0.0813        | 81.0  | 47790 | 0.9586          | 0.7303   |
| 0.0877        | 82.0  | 48380 | 0.9472          | 0.7349   |
| 0.0813        | 83.0  | 48970 | 0.9310          | 0.7303   |
| 0.0835        | 84.0  | 49560 | 0.9795          | 0.7361   |
| 0.0821        | 85.0  | 50150 | 0.9592          | 0.7346   |
| 0.0777        | 86.0  | 50740 | 0.9667          | 0.7303   |
| 0.0755        | 87.0  | 51330 | 0.9616          | 0.7343   |
| 0.0753        | 88.0  | 51920 | 0.9413          | 0.7336   |
| 0.0753        | 89.0  | 52510 | 0.9925          | 0.7284   |
| 0.0694        | 90.0  | 53100 | 0.9715          | 0.7358   |
| 0.0751        | 91.0  | 53690 | 0.9424          | 0.7300   |
| 0.072         | 92.0  | 54280 | 0.9396          | 0.7294   |
| 0.0715        | 93.0  | 54870 | 0.9579          | 0.7352   |
| 0.0735        | 94.0  | 55460 | 0.9577          | 0.7349   |
| 0.0694        | 95.0  | 56050 | 0.9331          | 0.7315   |
| 0.0665        | 96.0  | 56640 | 0.9441          | 0.7343   |
| 0.0655        | 97.0  | 57230 | 0.9610          | 0.7346   |
| 0.0649        | 98.0  | 57820 | 0.9345          | 0.7318   |
| 0.0689        | 99.0  | 58410 | 0.9403          | 0.7330   |
| 0.0669        | 100.0 | 59000 | 0.9333          | 0.7315   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3