
1_5e-3_1_0.9

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2534
  • Accuracy: 0.7483
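
The checkpoint can be loaded with the standard `transformers` auto classes. Below is a minimal, hedged usage sketch: the Hub id `Onutoa/1_5e-3_1_0.9` is the one shown on this card, and the BoolQ-style question/passage pairing is an assumption, since the card does not name the SuperGLUE subset.

```python
# Minimal inference sketch for this checkpoint.
# Assumption: a BoolQ-style sentence-pair classification task; the card
# only says "super_glue" without naming the subset.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Onutoa/1_5e-3_1_0.9"  # Hub id shown on this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

question = "Is the sky blue?"
passage = "The sky appears blue because air scatters shorter wavelengths of sunlight."
inputs = tokenizer(question, passage, truncation=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(dim=-1).item()])
```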

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
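
For reference, a sketch of how these settings map onto `transformers.TrainingArguments`; the `output_dir` and `evaluation_strategy` values are assumptions (the per-epoch evaluation is inferred from the results table below), while everything else is taken from the list above.

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above; not the author's actual script.
training_args = TrainingArguments(
    output_dir="1_5e-3_1_0.9",       # assumption: placeholder run name
    learning_rate=5e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,                  # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",     # assumption: the table logs one eval per epoch
)
```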

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 0.959  | 1.0   | 590   | 0.8539 | 0.6217 |
| 0.7269 | 2.0   | 1180  | 0.5087 | 0.6211 |
| 0.7153 | 3.0   | 1770  | 0.6070 | 0.4086 |
| 0.6861 | 4.0   | 2360  | 0.7139 | 0.6217 |
| 0.6453 | 5.0   | 2950  | 0.4409 | 0.6584 |
| 0.5962 | 6.0   | 3540  | 0.4496 | 0.5887 |
| 0.5361 | 7.0   | 4130  | 0.5527 | 0.5122 |
| 0.5468 | 8.0   | 4720  | 0.3969 | 0.6850 |
| 0.4902 | 9.0   | 5310  | 0.3556 | 0.6878 |
| 0.4794 | 10.0  | 5900  | 0.4762 | 0.6657 |
| 0.4719 | 11.0  | 6490  | 0.3936 | 0.6450 |
| 0.4317 | 12.0  | 7080  | 0.3662 | 0.7037 |
| 0.4179 | 13.0  | 7670  | 0.3144 | 0.6884 |
| 0.3817 | 14.0  | 8260  | 0.3086 | 0.7061 |
| 0.3867 | 15.0  | 8850  | 0.3868 | 0.7131 |
| 0.3573 | 16.0  | 9440  | 0.3145 | 0.7156 |
| 0.3413 | 17.0  | 10030 | 0.3493 | 0.6667 |
| 0.3458 | 18.0  | 10620 | 0.3274 | 0.6758 |
| 0.3212 | 19.0  | 11210 | 0.2809 | 0.7211 |
| 0.3182 | 20.0  | 11800 | 0.3024 | 0.7294 |
| 0.2971 | 21.0  | 12390 | 0.2963 | 0.6991 |
| 0.297  | 22.0  | 12980 | 0.2757 | 0.7089 |
| 0.276  | 23.0  | 13570 | 0.2705 | 0.7245 |
| 0.2741 | 24.0  | 14160 | 0.2971 | 0.6924 |
| 0.2651 | 25.0  | 14750 | 0.3400 | 0.7327 |
| 0.2635 | 26.0  | 15340 | 0.3080 | 0.6859 |
| 0.2578 | 27.0  | 15930 | 0.2861 | 0.7083 |
| 0.2479 | 28.0  | 16520 | 0.2751 | 0.7398 |
| 0.2466 | 29.0  | 17110 | 0.2798 | 0.7385 |
| 0.2461 | 30.0  | 17700 | 0.2627 | 0.7266 |
| 0.2355 | 31.0  | 18290 | 0.3146 | 0.7309 |
| 0.2315 | 32.0  | 18880 | 0.4802 | 0.7159 |
| 0.2258 | 33.0  | 19470 | 0.2626 | 0.7327 |
| 0.2192 | 34.0  | 20060 | 0.2806 | 0.7385 |
| 0.2217 | 35.0  | 20650 | 0.2837 | 0.7040 |
| 0.2126 | 36.0  | 21240 | 0.2950 | 0.7434 |
| 0.21   | 37.0  | 21830 | 0.3081 | 0.7419 |
| 0.2086 | 38.0  | 22420 | 0.2490 | 0.7343 |
| 0.2071 | 39.0  | 23010 | 0.2674 | 0.7437 |
| 0.2052 | 40.0  | 23600 | 0.3063 | 0.7413 |
| 0.2027 | 41.0  | 24190 | 0.2926 | 0.7410 |
| 0.2035 | 42.0  | 24780 | 0.2712 | 0.7398 |
| 0.1945 | 43.0  | 25370 | 0.2639 | 0.7367 |
| 0.1988 | 44.0  | 25960 | 0.2570 | 0.7370 |
| 0.1909 | 45.0  | 26550 | 0.2635 | 0.7361 |
| 0.1891 | 46.0  | 27140 | 0.2565 | 0.7358 |
| 0.1878 | 47.0  | 27730 | 0.2588 | 0.7367 |
| 0.1861 | 48.0  | 28320 | 0.2511 | 0.7294 |
| 0.1932 | 49.0  | 28910 | 0.2632 | 0.7422 |
| 0.1835 | 50.0  | 29500 | 0.2599 | 0.7398 |
| 0.1803 | 51.0  | 30090 | 0.2641 | 0.7379 |
| 0.1808 | 52.0  | 30680 | 0.2586 | 0.7355 |
| 0.174  | 53.0  | 31270 | 0.2502 | 0.7394 |
| 0.1774 | 54.0  | 31860 | 0.2650 | 0.7361 |
| 0.1804 | 55.0  | 32450 | 0.2486 | 0.7330 |
| 0.1814 | 56.0  | 33040 | 0.2919 | 0.7422 |
| 0.1679 | 57.0  | 33630 | 0.2837 | 0.7398 |
| 0.1665 | 58.0  | 34220 | 0.2751 | 0.7391 |
| 0.1732 | 59.0  | 34810 | 0.2575 | 0.7315 |
| 0.1666 | 60.0  | 35400 | 0.2518 | 0.7349 |
| 0.1667 | 61.0  | 35990 | 0.2582 | 0.7407 |
| 0.172  | 62.0  | 36580 | 0.2512 | 0.7373 |
| 0.1657 | 63.0  | 37170 | 0.2500 | 0.7364 |
| 0.1687 | 64.0  | 37760 | 0.2589 | 0.7419 |
| 0.1605 | 65.0  | 38350 | 0.2833 | 0.7434 |
| 0.1635 | 66.0  | 38940 | 0.2536 | 0.7343 |
| 0.1583 | 67.0  | 39530 | 0.2554 | 0.7416 |
| 0.1638 | 68.0  | 40120 | 0.2598 | 0.7462 |
| 0.1615 | 69.0  | 40710 | 0.3022 | 0.7407 |
| 0.16   | 70.0  | 41300 | 0.2653 | 0.7459 |
| 0.1601 | 71.0  | 41890 | 0.2593 | 0.7456 |
| 0.1567 | 72.0  | 42480 | 0.2564 | 0.7446 |
| 0.1503 | 73.0  | 43070 | 0.2788 | 0.7465 |
| 0.1531 | 74.0  | 43660 | 0.2518 | 0.7446 |
| 0.1536 | 75.0  | 44250 | 0.3032 | 0.7440 |
| 0.1549 | 76.0  | 44840 | 0.2513 | 0.7370 |
| 0.1543 | 77.0  | 45430 | 0.2647 | 0.7486 |
| 0.1516 | 78.0  | 46020 | 0.2511 | 0.7471 |
| 0.1512 | 79.0  | 46610 | 0.2562 | 0.7431 |
| 0.1493 | 80.0  | 47200 | 0.2568 | 0.7474 |
| 0.1443 | 81.0  | 47790 | 0.2650 | 0.7492 |
| 0.1487 | 82.0  | 48380 | 0.2488 | 0.7492 |
| 0.1453 | 83.0  | 48970 | 0.2444 | 0.7431 |
| 0.1465 | 84.0  | 49560 | 0.2665 | 0.7443 |
| 0.1444 | 85.0  | 50150 | 0.2531 | 0.7456 |
| 0.1487 | 86.0  | 50740 | 0.2475 | 0.7431 |
| 0.1425 | 87.0  | 51330 | 0.2774 | 0.7453 |
| 0.145  | 88.0  | 51920 | 0.2636 | 0.7465 |
| 0.1399 | 89.0  | 52510 | 0.2552 | 0.7459 |
| 0.1429 | 90.0  | 53100 | 0.2611 | 0.7443 |
| 0.1453 | 91.0  | 53690 | 0.2558 | 0.7468 |
| 0.1473 | 92.0  | 54280 | 0.2467 | 0.7413 |
| 0.1433 | 93.0  | 54870 | 0.2712 | 0.7474 |
| 0.1445 | 94.0  | 55460 | 0.2591 | 0.7465 |
| 0.1432 | 95.0  | 56050 | 0.2604 | 0.7486 |
| 0.1397 | 96.0  | 56640 | 0.2618 | 0.7492 |
| 0.1412 | 97.0  | 57230 | 0.2550 | 0.7483 |
| 0.1327 | 98.0  | 57820 | 0.2512 | 0.7471 |
| 0.136  | 99.0  | 58410 | 0.2525 | 0.7489 |
| 0.145  | 100.0 | 59000 | 0.2534 | 0.7483 |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3