
1_1e-2_5_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9066
  • Accuracy: 0.7523
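
The card does not state which SuperGLUE sub-task the model was tuned on, so the snippet below is only a minimal inference sketch: it assumes the checkpoint is published on the Hub under this card's name (Onutoa/1_1e-2_5_0.5) and that the head is a standard sentence-pair classifier.

```python
# Minimal inference sketch (assumptions: hub id "Onutoa/1_1e-2_5_0.5" and a
# standard sentence-pair classification head; the SuperGLUE sub-task and its
# label set are not stated on this card).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Onutoa/1_1e-2_5_0.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# SuperGLUE examples are text pairs, so encode two segments together.
inputs = tokenizer(
    "The cat sat on the mat.",
    "A cat is on a mat.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label index
```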

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a Trainer-style sketch of these settings follows the list):

  • learning_rate: 0.01
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
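
As a rough sketch, the settings above map onto a transformers Trainer configuration along these lines. Argument names are from Transformers 4.30; output_dir is a placeholder, the batch sizes are assumed to be per-device, and the per-epoch evaluation strategy is inferred from the validation log below rather than stated on the card.

```python
from transformers import TrainingArguments

# The reported hyperparameters expressed as Trainer arguments (Transformers
# 4.30). output_dir is a placeholder; batch sizes are assumed per-device;
# evaluation_strategy="epoch" is inferred from the per-epoch eval log.
training_args = TrainingArguments(
    output_dir="1_1e-2_5_0.5",
    learning_rate=1e-2,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",
)
```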

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 3.4802        | 1.0   | 590   | 2.2869          | 0.6217   |
| 2.6082        | 2.0   | 1180  | 3.4236          | 0.3817   |
| 2.4745        | 3.0   | 1770  | 2.2889          | 0.4798   |
| 2.1207        | 4.0   | 2360  | 3.1025          | 0.4153   |
| 2.3212        | 5.0   | 2950  | 1.6078          | 0.6645   |
| 1.7706        | 6.0   | 3540  | 1.2848          | 0.6639   |
| 1.5762        | 7.0   | 4130  | 1.6810          | 0.5804   |
| 1.5772        | 8.0   | 4720  | 1.3710          | 0.6541   |
| 1.6406        | 9.0   | 5310  | 2.9248          | 0.6446   |
| 1.4258        | 10.0  | 5900  | 1.6009          | 0.6083   |
| 1.3469        | 11.0  | 6490  | 1.0047          | 0.7202   |
| 1.1999        | 12.0  | 7080  | 1.0492          | 0.7291   |
| 1.2865        | 13.0  | 7670  | 1.1196          | 0.6865   |
| 1.0782        | 14.0  | 8260  | 1.4270          | 0.7122   |
| 1.1171        | 15.0  | 8850  | 1.5357          | 0.7107   |
| 0.9956        | 16.0  | 9440  | 1.0942          | 0.6951   |
| 0.9941        | 17.0  | 10030 | 1.0744          | 0.7235   |
| 0.945         | 18.0  | 10620 | 0.9784          | 0.7367   |
| 0.8967        | 19.0  | 11210 | 1.1091          | 0.6951   |
| 0.8866        | 20.0  | 11800 | 0.9345          | 0.7306   |
| 0.8433        | 21.0  | 12390 | 1.1888          | 0.7346   |
| 0.8198        | 22.0  | 12980 | 0.9674          | 0.7291   |
| 0.792         | 23.0  | 13570 | 0.9667          | 0.7294   |
| 0.7303        | 24.0  | 14160 | 0.9543          | 0.7312   |
| 0.7379        | 25.0  | 14750 | 0.9951          | 0.7171   |
| 0.7053        | 26.0  | 15340 | 1.3433          | 0.6749   |
| 0.7032        | 27.0  | 15930 | 1.2760          | 0.6869   |
| 0.653         | 28.0  | 16520 | 0.9698          | 0.7251   |
| 0.6478        | 29.0  | 17110 | 1.0363          | 0.7446   |
| 0.6667        | 30.0  | 17700 | 0.9896          | 0.7407   |
| 0.6252        | 31.0  | 18290 | 1.0043          | 0.7431   |
| 0.6048        | 32.0  | 18880 | 0.9395          | 0.7440   |
| 0.6068        | 33.0  | 19470 | 0.9364          | 0.7394   |
| 0.5967        | 34.0  | 20060 | 0.9955          | 0.7232   |
| 0.5795        | 35.0  | 20650 | 0.9356          | 0.7446   |
| 0.5526        | 36.0  | 21240 | 0.9026          | 0.7468   |
| 0.5425        | 37.0  | 21830 | 0.9295          | 0.7471   |
| 0.5541        | 38.0  | 22420 | 0.9112          | 0.7437   |
| 0.5452        | 39.0  | 23010 | 0.9948          | 0.7508   |
| 0.4935        | 40.0  | 23600 | 1.0044          | 0.7459   |
| 0.5334        | 41.0  | 24190 | 1.3228          | 0.7440   |
| 0.5209        | 42.0  | 24780 | 0.9103          | 0.7416   |
| 0.505         | 43.0  | 25370 | 0.9525          | 0.7367   |
| 0.4895        | 44.0  | 25960 | 0.9358          | 0.7382   |
| 0.4887        | 45.0  | 26550 | 0.9364          | 0.7450   |
| 0.4704        | 46.0  | 27140 | 0.9413          | 0.7489   |
| 0.49          | 47.0  | 27730 | 1.0712          | 0.7529   |
| 0.4686        | 48.0  | 28320 | 0.9148          | 0.7434   |
| 0.4843        | 49.0  | 28910 | 0.9347          | 0.7498   |
| 0.482         | 50.0  | 29500 | 0.9344          | 0.7456   |
| 0.4599        | 51.0  | 30090 | 1.0064          | 0.7297   |
| 0.4581        | 52.0  | 30680 | 0.9590          | 0.7572   |
| 0.4575        | 53.0  | 31270 | 0.9898          | 0.7508   |
| 0.4317        | 54.0  | 31860 | 1.0466          | 0.7502   |
| 0.4529        | 55.0  | 32450 | 0.9734          | 0.7489   |
| 0.4406        | 56.0  | 33040 | 0.8741          | 0.7465   |
| 0.4243        | 57.0  | 33630 | 0.9795          | 0.7373   |
| 0.4211        | 58.0  | 34220 | 1.1723          | 0.7526   |
| 0.4433        | 59.0  | 34810 | 0.9914          | 0.7599   |
| 0.4316        | 60.0  | 35400 | 0.9280          | 0.7431   |
| 0.4085        | 61.0  | 35990 | 0.8985          | 0.7456   |
| 0.4192        | 62.0  | 36580 | 0.9011          | 0.7419   |
| 0.4155        | 63.0  | 37170 | 0.9365          | 0.7486   |
| 0.4192        | 64.0  | 37760 | 0.9214          | 0.7428   |
| 0.4093        | 65.0  | 38350 | 0.8889          | 0.7440   |
| 0.393         | 66.0  | 38940 | 1.0415          | 0.7517   |
| 0.3994        | 67.0  | 39530 | 0.8926          | 0.7398   |
| 0.3993        | 68.0  | 40120 | 0.8709          | 0.7437   |
| 0.3902        | 69.0  | 40710 | 0.9109          | 0.7456   |
| 0.412         | 70.0  | 41300 | 1.0131          | 0.7550   |
| 0.4024        | 71.0  | 41890 | 0.8784          | 0.7520   |
| 0.3933        | 72.0  | 42480 | 1.0307          | 0.7517   |
| 0.3723        | 73.0  | 43070 | 0.9278          | 0.7514   |
| 0.372         | 74.0  | 43660 | 1.0847          | 0.7581   |
| 0.38          | 75.0  | 44250 | 0.8899          | 0.7450   |
| 0.3837        | 76.0  | 44840 | 0.9025          | 0.7450   |
| 0.3701        | 77.0  | 45430 | 0.9842          | 0.7538   |
| 0.3796        | 78.0  | 46020 | 0.9811          | 0.7529   |
| 0.3648        | 79.0  | 46610 | 0.9299          | 0.7520   |
| 0.3669        | 80.0  | 47200 | 0.9235          | 0.7514   |
| 0.3429        | 81.0  | 47790 | 0.8995          | 0.7437   |
| 0.3619        | 82.0  | 48380 | 1.0140          | 0.7523   |
| 0.3597        | 83.0  | 48970 | 0.8973          | 0.7471   |
| 0.3484        | 84.0  | 49560 | 0.9573          | 0.7538   |
| 0.3526        | 85.0  | 50150 | 0.8922          | 0.7477   |
| 0.3442        | 86.0  | 50740 | 0.9773          | 0.7541   |
| 0.3385        | 87.0  | 51330 | 1.0145          | 0.7511   |
| 0.3406        | 88.0  | 51920 | 0.9287          | 0.7514   |
| 0.3514        | 89.0  | 52510 | 1.0034          | 0.7532   |
| 0.3352        | 90.0  | 53100 | 0.9218          | 0.7514   |
| 0.341         | 91.0  | 53690 | 0.9334          | 0.7514   |
| 0.3441        | 92.0  | 54280 | 0.8896          | 0.7459   |
| 0.3415        | 93.0  | 54870 | 0.9309          | 0.7517   |
| 0.3391        | 94.0  | 55460 | 0.9353          | 0.7514   |
| 0.3369        | 95.0  | 56050 | 0.9214          | 0.7523   |
| 0.3293        | 96.0  | 56640 | 0.9147          | 0.7523   |
| 0.3377        | 97.0  | 57230 | 0.9083          | 0.7505   |
| 0.3211        | 98.0  | 57820 | 0.8986          | 0.7517   |
| 0.3216        | 99.0  | 58410 | 0.9027          | 0.7529   |
| 0.3276        | 100.0 | 59000 | 0.9066          | 0.7523   |
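
The accuracy column is plain argmax accuracy over the validation set. A minimal compute_metrics hook of the kind typically passed to Trainer (the exact implementation used for this run is not given on the card) looks like:

```python
import numpy as np

def compute_metrics(eval_pred):
    # eval_pred is a (predictions, labels) pair supplied by Trainer;
    # predictions are the raw classification logits.
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": float((preds == labels).mean())}
```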

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3