1_9e-3_10_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8950
  • Accuracy: 0.7544
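
As a quick illustration, the sketch below loads the checkpoint for inference. It assumes the model is published as Onutoa/1_9e-3_10_0.5 and carries a sequence-classification head for a SuperGLUE sentence-pair task; the exact task configuration is not stated in this card.

```python
# Minimal inference sketch. The checkpoint ID comes from this repository; the
# sentence-pair input format is an assumption (the SuperGLUE task is undocumented).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Onutoa/1_9e-3_10_0.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Hypothetical passage/question pair in the style of SuperGLUE BoolQ.
inputs = tokenizer(
    "The aurora is caused by charged particles from the solar wind.",
    "Is the aurora caused by the sun?",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```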

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

This model was fine-tuned and evaluated on the super_glue dataset, as noted in the summary above; the card does not document which SuperGLUE task configuration was used.
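
For illustration, a hedged sketch of loading the data with the datasets library; the "boolq" task config is an assumption, since the card does not name one:

```python
# Sketch of loading the training data. "boolq" is an assumed SuperGLUE config;
# substitute the actual task used for fine-tuning.
from datasets import load_dataset

dataset = load_dataset("super_glue", "boolq")
print(dataset["train"][0])
```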

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.009
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
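
These settings map onto transformers.TrainingArguments roughly as sketched below; output_dir and the per-epoch evaluation strategy are assumptions added to make the sketch self-contained.

```python
# Sketch mirroring the listed hyperparameters; output_dir and evaluation_strategy
# are assumed (the results table reports one evaluation per epoch).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_9e-3_10_0.5",   # assumed output path
    learning_rate=9e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",
)
```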

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 3.4984 | 1.0 | 590 | 2.4126 | 0.4138 |
| 2.7588 | 2.0 | 1180 | 3.3455 | 0.6217 |
| 2.6474 | 3.0 | 1770 | 2.9962 | 0.6232 |
| 2.5401 | 4.0 | 2360 | 2.3051 | 0.6171 |
| 2.4721 | 5.0 | 2950 | 1.9595 | 0.5939 |
| 2.0867 | 6.0 | 3540 | 1.8172 | 0.6780 |
| 1.9063 | 7.0 | 4130 | 1.6558 | 0.6281 |
| 1.7809 | 8.0 | 4720 | 1.4817 | 0.7009 |
| 1.6707 | 9.0 | 5310 | 1.3547 | 0.7012 |
| 1.5201 | 10.0 | 5900 | 1.7296 | 0.6813 |
| 1.4583 | 11.0 | 6490 | 1.4437 | 0.7107 |
| 1.2224 | 12.0 | 7080 | 1.0975 | 0.7168 |
| 1.202 | 13.0 | 7670 | 1.1965 | 0.6960 |
| 1.1209 | 14.0 | 8260 | 1.0158 | 0.7287 |
| 1.093 | 15.0 | 8850 | 1.0843 | 0.7413 |
| 1.0345 | 16.0 | 9440 | 1.0067 | 0.7410 |
| 0.9682 | 17.0 | 10030 | 1.0831 | 0.7382 |
| 0.9253 | 18.0 | 10620 | 1.0804 | 0.6985 |
| 0.8916 | 19.0 | 11210 | 0.9652 | 0.7321 |
| 0.869 | 20.0 | 11800 | 1.0800 | 0.7388 |
| 0.8274 | 21.0 | 12390 | 0.9542 | 0.7477 |
| 0.8245 | 22.0 | 12980 | 0.9244 | 0.7379 |
| 0.8061 | 23.0 | 13570 | 0.9591 | 0.7388 |
| 0.7804 | 24.0 | 14160 | 0.9471 | 0.7248 |
| 0.7538 | 25.0 | 14750 | 1.0707 | 0.7440 |
| 0.7006 | 26.0 | 15340 | 1.1137 | 0.7046 |
| 0.7006 | 27.0 | 15930 | 1.0523 | 0.7199 |
| 0.6916 | 28.0 | 16520 | 1.1686 | 0.7358 |
| 0.6904 | 29.0 | 17110 | 0.9189 | 0.7306 |
| 0.6631 | 30.0 | 17700 | 0.9745 | 0.7235 |
| 0.6323 | 31.0 | 18290 | 1.0716 | 0.7422 |
| 0.632 | 32.0 | 18880 | 1.0115 | 0.7465 |
| 0.6287 | 33.0 | 19470 | 0.9677 | 0.7217 |
| 0.6044 | 34.0 | 20060 | 0.9363 | 0.7419 |
| 0.5925 | 35.0 | 20650 | 0.9818 | 0.7297 |
| 0.5741 | 36.0 | 21240 | 1.3325 | 0.7361 |
| 0.5764 | 37.0 | 21830 | 0.9738 | 0.7471 |
| 0.5714 | 38.0 | 22420 | 0.9287 | 0.7480 |
| 0.5845 | 39.0 | 23010 | 1.2000 | 0.7450 |
| 0.5831 | 40.0 | 23600 | 0.9481 | 0.7462 |
| 0.5602 | 41.0 | 24190 | 1.0474 | 0.7440 |
| 0.5576 | 42.0 | 24780 | 0.9117 | 0.7434 |
| 0.5419 | 43.0 | 25370 | 0.9980 | 0.7495 |
| 0.5274 | 44.0 | 25960 | 0.9491 | 0.7462 |
| 0.5363 | 45.0 | 26550 | 0.9289 | 0.7468 |
| 0.5117 | 46.0 | 27140 | 0.9171 | 0.7498 |
| 0.5228 | 47.0 | 27730 | 0.9486 | 0.7535 |
| 0.4971 | 48.0 | 28320 | 0.9034 | 0.7498 |
| 0.5325 | 49.0 | 28910 | 0.9143 | 0.7495 |
| 0.4951 | 50.0 | 29500 | 0.9264 | 0.7373 |
| 0.5123 | 51.0 | 30090 | 0.9220 | 0.7471 |
| 0.52 | 52.0 | 30680 | 0.9128 | 0.7502 |
| 0.4884 | 53.0 | 31270 | 0.9343 | 0.7443 |
| 0.4878 | 54.0 | 31860 | 1.0356 | 0.7422 |
| 0.4967 | 55.0 | 32450 | 0.9414 | 0.7486 |
| 0.4823 | 56.0 | 33040 | 0.9257 | 0.7480 |
| 0.4789 | 57.0 | 33630 | 0.9966 | 0.7511 |
| 0.463 | 58.0 | 34220 | 0.9716 | 0.7495 |
| 0.4662 | 59.0 | 34810 | 0.9143 | 0.7425 |
| 0.4663 | 60.0 | 35400 | 0.9606 | 0.7526 |
| 0.4525 | 61.0 | 35990 | 0.9436 | 0.7541 |
| 0.4683 | 62.0 | 36580 | 0.9041 | 0.7480 |
| 0.4435 | 63.0 | 37170 | 0.9220 | 0.7443 |
| 0.4537 | 64.0 | 37760 | 0.9282 | 0.7394 |
| 0.4582 | 65.0 | 38350 | 0.9121 | 0.7398 |
| 0.4486 | 66.0 | 38940 | 0.9371 | 0.7477 |
| 0.4437 | 67.0 | 39530 | 0.9325 | 0.7431 |
| 0.4309 | 68.0 | 40120 | 0.9278 | 0.7492 |
| 0.4314 | 69.0 | 40710 | 1.0054 | 0.7489 |
| 0.4468 | 70.0 | 41300 | 0.9489 | 0.7532 |
| 0.4403 | 71.0 | 41890 | 0.9136 | 0.7446 |
| 0.4311 | 72.0 | 42480 | 0.9262 | 0.7489 |
| 0.4157 | 73.0 | 43070 | 0.9242 | 0.7511 |
| 0.4257 | 74.0 | 43660 | 0.9193 | 0.7508 |
| 0.4384 | 75.0 | 44250 | 1.0083 | 0.7566 |
| 0.4348 | 76.0 | 44840 | 0.9168 | 0.7382 |
| 0.4248 | 77.0 | 45430 | 0.9467 | 0.7550 |
| 0.4337 | 78.0 | 46020 | 0.9040 | 0.7477 |
| 0.4222 | 79.0 | 46610 | 0.9154 | 0.7526 |
| 0.4155 | 80.0 | 47200 | 0.9295 | 0.7508 |
| 0.4005 | 81.0 | 47790 | 0.9250 | 0.7505 |
| 0.4189 | 82.0 | 48380 | 1.0215 | 0.7560 |
| 0.4098 | 83.0 | 48970 | 0.9297 | 0.7557 |
| 0.4001 | 84.0 | 49560 | 0.9180 | 0.7547 |
| 0.408 | 85.0 | 50150 | 0.9020 | 0.7526 |
| 0.3984 | 86.0 | 50740 | 0.9035 | 0.7532 |
| 0.4118 | 87.0 | 51330 | 0.9474 | 0.7532 |
| 0.4011 | 88.0 | 51920 | 0.9016 | 0.7471 |
| 0.3934 | 89.0 | 52510 | 0.9447 | 0.7557 |
| 0.3867 | 90.0 | 53100 | 0.8979 | 0.7517 |
| 0.3999 | 91.0 | 53690 | 0.9156 | 0.7541 |
| 0.4011 | 92.0 | 54280 | 0.9033 | 0.7508 |
| 0.3826 | 93.0 | 54870 | 0.9343 | 0.7526 |
| 0.3912 | 94.0 | 55460 | 0.9071 | 0.7532 |
| 0.3803 | 95.0 | 56050 | 0.9248 | 0.7572 |
| 0.3842 | 96.0 | 56640 | 0.8987 | 0.7535 |
| 0.3846 | 97.0 | 57230 | 0.9031 | 0.7550 |
| 0.3743 | 98.0 | 57820 | 0.8933 | 0.7544 |
| 0.3782 | 99.0 | 58410 | 0.8965 | 0.7532 |
| 0.3939 | 100.0 | 59000 | 0.8950 | 0.7544 |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3