
20230822120608

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 19.9899
  • Accuracy: 0.5271
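
The snippet below is a minimal inference sketch, not part of the original card: it assumes the checkpoint uses a standard binary sequence-classification head with sentence-pair inputs, since the card does not state which SuperGLUE task it was trained on.

```python
# Minimal inference sketch. Assumptions: binary classification head and
# sentence-pair input; the exact SuperGLUE task is not stated in this card.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/20230822120608"  # repo id as listed on the model page
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id).eval()

inputs = tokenizer("A premise sentence.", "A hypothesis sentence.",
                   return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```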

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.01
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
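
As a hedged sketch (not part of the original card), the hyperparameters above map onto a `transformers` `TrainingArguments` configuration roughly as follows; the output directory and the per-epoch evaluation strategy are assumptions inferred from the results table below:

```python
# Sketch of the Trainer configuration implied by the hyperparameters above.
# output_dir and evaluation_strategy="epoch" are assumptions, not stated facts.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230822120608",
    learning_rate=0.01,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",
)
```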

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 27.0766         | 0.5271   |
| 29.447        | 2.0   | 624   | 24.0887         | 0.4729   |
| 29.447        | 3.0   | 936   | 23.9640         | 0.5271   |
| 27.7172       | 4.0   | 1248  | 22.0260         | 0.4729   |
| 26.4345       | 5.0   | 1560  | 22.0502         | 0.4729   |
| 26.4345       | 6.0   | 1872  | 22.9337         | 0.5271   |
| 27.0832       | 7.0   | 2184  | 21.2859         | 0.5271   |
| 27.0832       | 8.0   | 2496  | 21.4709         | 0.4729   |
| 25.6523       | 9.0   | 2808  | 20.3539         | 0.5271   |
| 25.5288       | 10.0  | 3120  | 21.2982         | 0.5271   |
| 25.5288       | 11.0  | 3432  | 22.0599         | 0.5271   |
| 25.9846       | 12.0  | 3744  | 22.1000         | 0.5271   |
| 26.609        | 13.0  | 4056  | 24.1133         | 0.4729   |
| 26.609        | 14.0  | 4368  | 22.4392         | 0.4729   |
| 26.7751       | 15.0  | 4680  | 22.0514         | 0.4729   |
| 26.7751       | 16.0  | 4992  | 21.4413         | 0.5271   |
| 25.8484       | 17.0  | 5304  | 21.6759         | 0.5271   |
| 25.7937       | 18.0  | 5616  | 21.2726         | 0.5271   |
| 25.7937       | 19.0  | 5928  | 21.2489         | 0.5271   |
| 25.6479       | 20.0  | 6240  | 21.1881         | 0.5271   |
| 25.6144       | 21.0  | 6552  | 21.0354         | 0.5271   |
| 25.6144       | 22.0  | 6864  | 21.0688         | 0.4729   |
| 25.4368       | 23.0  | 7176  | 21.2154         | 0.4729   |
| 25.4368       | 24.0  | 7488  | 21.2348         | 0.4729   |
| 25.5564       | 25.0  | 7800  | 21.1510         | 0.5271   |
| 25.5495       | 26.0  | 8112  | 21.3992         | 0.5271   |
| 25.5495       | 27.0  | 8424  | 21.4035         | 0.4729   |
| 25.4536       | 28.0  | 8736  | 20.9643         | 0.5271   |
| 25.3641       | 29.0  | 9048  | 20.7780         | 0.4729   |
| 25.3641       | 30.0  | 9360  | 21.4761         | 0.5271   |
| 25.4089       | 31.0  | 9672  | 21.1053         | 0.4729   |
| 25.4089       | 32.0  | 9984  | 21.1557         | 0.5271   |
| 25.6056       | 33.0  | 10296 | 21.0180         | 0.5271   |
| 25.5078       | 34.0  | 10608 | 21.1026         | 0.4729   |
| 25.5078       | 35.0  | 10920 | 21.3723         | 0.4729   |
| 25.6607       | 36.0  | 11232 | 21.4309         | 0.4729   |
| 25.9641       | 37.0  | 11544 | 21.4083         | 0.5271   |
| 25.9641       | 38.0  | 11856 | 21.2875         | 0.5271   |
| 25.6756       | 39.0  | 12168 | 21.4538         | 0.5271   |
| 25.6756       | 40.0  | 12480 | 21.1870         | 0.4729   |
| 25.4709       | 41.0  | 12792 | 21.0796         | 0.5271   |
| 25.2913       | 42.0  | 13104 | 20.9412         | 0.5271   |
| 25.2913       | 43.0  | 13416 | 20.8932         | 0.5271   |
| 25.1541       | 44.0  | 13728 | 20.9172         | 0.4729   |
| 25.0679       | 45.0  | 14040 | 20.6787         | 0.5271   |
| 25.0679       | 46.0  | 14352 | 20.6308         | 0.4729   |
| 24.965        | 47.0  | 14664 | 20.5240         | 0.5271   |
| 24.965        | 48.0  | 14976 | 20.6378         | 0.4729   |
| 24.8969       | 49.0  | 15288 | 20.5030         | 0.4729   |
| 24.8319       | 50.0  | 15600 | 20.3257         | 0.5271   |
| 24.8319       | 51.0  | 15912 | 20.2990         | 0.5271   |
| 24.7301       | 52.0  | 16224 | 20.3661         | 0.4729   |
| 24.6644       | 53.0  | 16536 | 20.2088         | 0.5271   |
| 24.6644       | 54.0  | 16848 | 20.1543         | 0.5271   |
| 24.5917       | 55.0  | 17160 | 20.0860         | 0.4729   |
| 24.5917       | 56.0  | 17472 | 20.0672         | 0.5271   |
| 24.5505       | 57.0  | 17784 | 20.0518         | 0.5271   |
| 24.5065       | 58.0  | 18096 | 20.0036         | 0.5271   |
| 24.5065       | 59.0  | 18408 | 19.9939         | 0.5271   |
| 24.4773       | 60.0  | 18720 | 19.9899         | 0.5271   |
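
Note that the validation accuracy only ever takes the values 0.5271 and 0.4729, which sum to 1.0; this is consistent with the classifier predicting a single class throughout training rather than learning the task. The sketch below shows how the final accuracy could be re-checked. The SuperGLUE subset (`"rte"`, suggested by the 312 steps per epoch at batch size 8, matching RTE's 2,490 training examples) and its column names are assumptions; the card does not name the task.

```python
# Hedged re-evaluation sketch. The subset "rte" and its column names
# ("premise", "hypothesis", "label") are assumptions, not stated in the card.
import torch
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/20230822120608"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id).eval()

dataset = load_dataset("super_glue", "rte", split="validation")
correct = 0
for example in dataset:
    inputs = tokenizer(example["premise"], example["hypothesis"],
                       truncation=True, return_tensors="pt")
    with torch.no_grad():
        pred = model(**inputs).logits.argmax(dim=-1).item()
    correct += int(pred == example["label"])
print(f"Accuracy: {correct / len(dataset):.4f}")
```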

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3