
1_9e-3_1_0.9

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set (a short inference sketch follows the list):

  • Loss: 0.2451
  • Accuracy: 0.7456
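The card does not say which SuperGLUE task the checkpoint targets, so the snippet below is only a minimal inference sketch under assumptions: the repository id Onutoa/1_9e-3_1_0.9 (taken from this page), a standard sequence-classification head, and a hypothetical sentence-pair input.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Repository id as listed on this page; the SuperGLUE subtask is not stated in the card.
model_id = "Onutoa/1_9e-3_1_0.9"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Hypothetical sentence pair; replace with inputs formatted for the actual task.
inputs = tokenizer(
    "Is the sky blue?",
    "The sky appears blue in daylight.",
    return_tensors="pt",
    truncation=True,
)

with torch.no_grad():
    logits = model(**inputs).logits

print(logits.softmax(dim=-1))  # per-class probabilities
```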

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.009
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
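As referenced above, here is a minimal sketch of how these values map onto transformers.TrainingArguments (Transformers 4.30); output_dir is a placeholder, and evaluation_strategy is inferred from the once-per-epoch evaluation log below rather than stated in the card.

```python
from transformers import TrainingArguments

# Values copied from the hyperparameter list above.
training_args = TrainingArguments(
    output_dir="1_9e-3_1_0.9",    # placeholder, not taken from the card
    learning_rate=9e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",  # inferred: the log shows one eval per epoch
)
```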

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 1.4019 | 1.0 | 590 | 0.7035 | 0.3902 |
| 0.8201 | 2.0 | 1180 | 0.6123 | 0.6205 |
| 0.8196 | 3.0 | 1770 | 0.5319 | 0.4587 |
| 0.8006 | 4.0 | 2360 | 0.8985 | 0.4006 |
| 0.6419 | 5.0 | 2950 | 0.4274 | 0.6352 |
| 0.5396 | 6.0 | 3540 | 0.4366 | 0.6015 |
| 0.5199 | 7.0 | 4130 | 0.4763 | 0.5951 |
| 0.4784 | 8.0 | 4720 | 0.3445 | 0.6890 |
| 0.442 | 9.0 | 5310 | 0.3407 | 0.6979 |
| 0.4292 | 10.0 | 5900 | 0.5351 | 0.6725 |
| 0.408 | 11.0 | 6490 | 0.3183 | 0.7110 |
| 0.3802 | 12.0 | 7080 | 0.4297 | 0.6199 |
| 0.3826 | 13.0 | 7670 | 0.3199 | 0.6807 |
| 0.3608 | 14.0 | 8260 | 0.2910 | 0.7214 |
| 0.3485 | 15.0 | 8850 | 0.3491 | 0.7220 |
| 0.3481 | 16.0 | 9440 | 0.3000 | 0.7223 |
| 0.3326 | 17.0 | 10030 | 0.2859 | 0.7147 |
| 0.328 | 18.0 | 10620 | 0.3148 | 0.6859 |
| 0.315 | 19.0 | 11210 | 0.2922 | 0.7352 |
| 0.3123 | 20.0 | 11800 | 0.3455 | 0.7358 |
| 0.2956 | 21.0 | 12390 | 0.2757 | 0.7211 |
| 0.2983 | 22.0 | 12980 | 0.2832 | 0.7278 |
| 0.2798 | 23.0 | 13570 | 0.2676 | 0.7260 |
| 0.2765 | 24.0 | 14160 | 0.2925 | 0.7021 |
| 0.2761 | 25.0 | 14750 | 0.3859 | 0.7336 |
| 0.2646 | 26.0 | 15340 | 0.3827 | 0.6517 |
| 0.2608 | 27.0 | 15930 | 0.2672 | 0.7284 |
| 0.2474 | 28.0 | 16520 | 0.2737 | 0.7456 |
| 0.2487 | 29.0 | 17110 | 0.2770 | 0.7468 |
| 0.2445 | 30.0 | 17700 | 0.2732 | 0.7174 |
| 0.2455 | 31.0 | 18290 | 0.2902 | 0.7410 |
| 0.236 | 32.0 | 18880 | 0.3913 | 0.7352 |
| 0.228 | 33.0 | 19470 | 0.2819 | 0.7410 |
| 0.2182 | 34.0 | 20060 | 0.2863 | 0.7453 |
| 0.2203 | 35.0 | 20650 | 0.3294 | 0.6988 |
| 0.2189 | 36.0 | 21240 | 0.2809 | 0.7398 |
| 0.2132 | 37.0 | 21830 | 0.2631 | 0.7413 |
| 0.2082 | 38.0 | 22420 | 0.2600 | 0.7315 |
| 0.2144 | 39.0 | 23010 | 0.2841 | 0.7489 |
| 0.2128 | 40.0 | 23600 | 0.2650 | 0.7321 |
| 0.1978 | 41.0 | 24190 | 0.2795 | 0.7440 |
| 0.2044 | 42.0 | 24780 | 0.2650 | 0.7349 |
| 0.1936 | 43.0 | 25370 | 0.2666 | 0.7385 |
| 0.1967 | 44.0 | 25960 | 0.2861 | 0.7440 |
| 0.1823 | 45.0 | 26550 | 0.2658 | 0.7358 |
| 0.1868 | 46.0 | 27140 | 0.2834 | 0.7477 |
| 0.1914 | 47.0 | 27730 | 0.2882 | 0.7407 |
| 0.1962 | 48.0 | 28320 | 0.2547 | 0.7425 |
| 0.1916 | 49.0 | 28910 | 0.2586 | 0.7407 |
| 0.181 | 50.0 | 29500 | 0.2629 | 0.7413 |
| 0.1817 | 51.0 | 30090 | 0.2637 | 0.7373 |
| 0.1795 | 52.0 | 30680 | 0.2719 | 0.7422 |
| 0.1682 | 53.0 | 31270 | 0.2583 | 0.7483 |
| 0.1736 | 54.0 | 31860 | 0.2547 | 0.7327 |
| 0.17 | 55.0 | 32450 | 0.2580 | 0.7349 |
| 0.1775 | 56.0 | 33040 | 0.2583 | 0.7459 |
| 0.1678 | 57.0 | 33630 | 0.2681 | 0.7431 |
| 0.1648 | 58.0 | 34220 | 0.2652 | 0.7404 |
| 0.1679 | 59.0 | 34810 | 0.2602 | 0.7333 |
| 0.1637 | 60.0 | 35400 | 0.2563 | 0.7407 |
| 0.1627 | 61.0 | 35990 | 0.2611 | 0.7416 |
| 0.1678 | 62.0 | 36580 | 0.2558 | 0.7425 |
| 0.1588 | 63.0 | 37170 | 0.2578 | 0.7275 |
| 0.1626 | 64.0 | 37760 | 0.2602 | 0.7352 |
| 0.164 | 65.0 | 38350 | 0.2562 | 0.7446 |
| 0.1584 | 66.0 | 38940 | 0.2556 | 0.7367 |
| 0.1514 | 67.0 | 39530 | 0.2784 | 0.7437 |
| 0.1567 | 68.0 | 40120 | 0.2643 | 0.7446 |
| 0.1528 | 69.0 | 40710 | 0.2715 | 0.7456 |
| 0.1593 | 70.0 | 41300 | 0.2611 | 0.7505 |
| 0.1578 | 71.0 | 41890 | 0.2539 | 0.7388 |
| 0.1515 | 72.0 | 42480 | 0.2850 | 0.7526 |
| 0.1485 | 73.0 | 43070 | 0.2831 | 0.7505 |
| 0.1509 | 74.0 | 43660 | 0.2723 | 0.7486 |
| 0.1532 | 75.0 | 44250 | 0.3408 | 0.7446 |
| 0.1512 | 76.0 | 44840 | 0.2522 | 0.7419 |
| 0.1463 | 77.0 | 45430 | 0.2491 | 0.7422 |
| 0.1481 | 78.0 | 46020 | 0.2477 | 0.7443 |
| 0.1467 | 79.0 | 46610 | 0.2524 | 0.7401 |
| 0.1457 | 80.0 | 47200 | 0.2688 | 0.7505 |
| 0.1393 | 81.0 | 47790 | 0.2564 | 0.7422 |
| 0.1454 | 82.0 | 48380 | 0.2520 | 0.7465 |
| 0.1409 | 83.0 | 48970 | 0.2517 | 0.7425 |
| 0.1395 | 84.0 | 49560 | 0.2479 | 0.7453 |
| 0.1382 | 85.0 | 50150 | 0.2524 | 0.7520 |
| 0.1394 | 86.0 | 50740 | 0.2546 | 0.7520 |
| 0.1353 | 87.0 | 51330 | 0.2693 | 0.7526 |
| 0.1343 | 88.0 | 51920 | 0.2503 | 0.7483 |
| 0.1337 | 89.0 | 52510 | 0.2480 | 0.7486 |
| 0.1357 | 90.0 | 53100 | 0.2605 | 0.7517 |
| 0.1365 | 91.0 | 53690 | 0.2481 | 0.7477 |
| 0.1359 | 92.0 | 54280 | 0.2440 | 0.7446 |
| 0.1354 | 93.0 | 54870 | 0.2572 | 0.7535 |
| 0.1357 | 94.0 | 55460 | 0.2521 | 0.7471 |
| 0.1329 | 95.0 | 56050 | 0.2558 | 0.7514 |
| 0.1325 | 96.0 | 56640 | 0.2475 | 0.7450 |
| 0.131 | 97.0 | 57230 | 0.2446 | 0.7446 |
| 0.1275 | 98.0 | 57820 | 0.2446 | 0.7434 |
| 0.1244 | 99.0 | 58410 | 0.2437 | 0.7425 |
| 0.1357 | 100.0 | 59000 | 0.2451 | 0.7456 |
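Validation accuracy peaks at 0.7535 (epoch 93, step 54870); the headline figures above (loss 0.2451, accuracy 0.7456) correspond to the final checkpoint at epoch 100.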

Framework versions

  • Transformers 4.30.0
  • PyTorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3