20230821213736

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. After the final training epoch (epoch 60, step 18720), it achieves the following results on the evaluation set:

  • Loss: 22.0684
  • Accuracy: 0.4801

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
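
The settings above can be written down as a small configuration sketch. The mapping below assumes the run used the `transformers` `Trainer` on a single device (so `train_batch_size` equals the per-device batch size) and no warmup, since none is listed; neither assumption is confirmed by this card. The helper names `to_training_kwargs`, `linear_lr`, and `build_training_args` are hypothetical and exist only for illustration.

```python
# Hyperparameters copied from the list above.
HYPERPARAMS = {
    "learning_rate": 1e-4,
    "train_batch_size": 8,
    "eval_batch_size": 8,
    "seed": 11,
    "adam_betas": (0.9, 0.999),
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_epochs": 60.0,
}


def to_training_kwargs(hp):
    """Map the card's names onto transformers.TrainingArguments keyword names."""
    beta1, beta2 = hp["adam_betas"]
    return {
        "learning_rate": hp["learning_rate"],
        # Assumption: one device, so total batch size == per-device batch size.
        "per_device_train_batch_size": hp["train_batch_size"],
        "per_device_eval_batch_size": hp["eval_batch_size"],
        "seed": hp["seed"],
        "adam_beta1": beta1,
        "adam_beta2": beta2,
        "adam_epsilon": hp["adam_epsilon"],
        "lr_scheduler_type": hp["lr_scheduler_type"],
        "num_train_epochs": hp["num_epochs"],
    }


def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero.

    Assumption: warmup_steps=0, because the card lists no warmup.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def build_training_args(output_dir="out"):
    """Build TrainingArguments; transformers is imported lazily so the rest
    of this sketch runs without it. output_dir is a placeholder."""
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **to_training_kwargs(HYPERPARAMS))
```

Under these assumptions, with 18720 total optimizer steps (the final step reported for epoch 60), `linear_lr(9360, 18720, 1e-4)` gives half the initial rate, and the rate reaches zero at the last step.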

Training results

Training Loss  Epoch  Step   Validation Loss  Accuracy
No log         1.0    312    34.4593          0.4729
34.6035        2.0    624    34.1903          0.4729
34.6035        3.0    936    33.9397          0.5343
34.2607        4.0    1248   33.6773          0.5343
33.8346        5.0    1560   33.3601          0.4729
33.8346        6.0    1872   32.9334          0.5235
33.2988        7.0    2184   32.4093          0.5451
33.2988        8.0    2496   31.6614          0.5343
32.523         9.0    2808   31.1242          0.5487
31.6421        10.0   3120   30.7433          0.5271
31.6421        11.0   3432   30.4265          0.4910
30.9414        12.0   3744   30.1340          0.4729
30.3998        13.0   4056   29.6940          0.4729
30.3998        14.0   4368   29.2574          0.4838
29.7765        15.0   4680   28.9204          0.4729
29.7765        16.0   4992   28.7916          0.4729
29.2672        17.0   5304   28.7245          0.5379
29.0545        18.0   5616   28.6656          0.4729
29.0545        19.0   5928   28.6131          0.4729
28.9469        20.0   6240   28.5471          0.5126
28.8473        21.0   6552   28.4760          0.5343
28.8473        22.0   6864   28.3978          0.4765
28.7322        23.0   7176   28.3073          0.5271
28.7322        24.0   7488   28.1897          0.4729
28.5992        25.0   7800   28.0411          0.4729
28.4123        26.0   8112   27.8587          0.4729
28.4123        27.0   8424   27.6169          0.4729
28.1552        28.0   8736   27.2253          0.5018
27.7135        29.0   9048   26.7643          0.4729
27.7135        30.0   9360   26.2981          0.4693
27.1493        31.0   9672   25.9554          0.4874
27.1493        32.0   9984   25.6574          0.5018
26.68          33.0   10296  25.3846          0.4729
26.3235        34.0   10608  25.0976          0.4729
26.3235        35.0   10920  24.8303          0.4874
25.9833        36.0   11232  24.5811          0.4729
25.6663        37.0   11544  24.3341          0.4874
25.6663        38.0   11856  24.1074          0.4729
25.3808        39.0   12168  23.9099          0.4874
25.3808        40.0   12480  23.7138          0.5343
25.12          41.0   12792  23.5439          0.4874
24.8956        42.0   13104  23.3745          0.4729
24.8956        43.0   13416  23.2148          0.5162
24.6833        44.0   13728  23.0665          0.4765
24.498         45.0   14040  22.9456          0.4729
24.498         46.0   14352  22.8208          0.4729
24.3449        47.0   14664  22.7087          0.4693
24.3449        48.0   14976  22.6159          0.4910
24.1996        49.0   15288  22.5243          0.4874
24.0892        50.0   15600  22.4457          0.4801
24.0892        51.0   15912  22.3728          0.4838
23.9876        52.0   16224  22.3081          0.4874
23.9068        53.0   16536  22.2526          0.4729
23.9068        54.0   16848  22.2029          0.4801
23.837         55.0   17160  22.1624          0.4874
23.837         56.0   17472  22.1289          0.4765
23.7911        57.0   17784  22.1029          0.4729
23.7521        58.0   18096  22.0854          0.4729
23.7521        59.0   18408  22.0726          0.4765
23.7328        60.0   18720  22.0684          0.4801
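
The log above is easier to query once its rows are parsed. The sketch below reads whitespace-separated rows in the same column order (training loss, epoch, step, validation loss, accuracy), picks the row with the highest accuracy, and derives the steps per epoch. `ROWS` holds only an excerpt copied from the table, chosen to include the first row, the final row, and the highest-accuracy row; the helper names are hypothetical. The training-set size bounds assume a single device, no gradient accumulation, and a kept final partial batch, none of which the card states.

```python
# Excerpt of rows copied from the table above (not the full log).
ROWS = """\
No log 1.0 312 34.4593 0.4729
34.6035 2.0 624 34.1903 0.4729
33.2988 7.0 2184 32.4093 0.5451
33.2988 8.0 2496 31.6614 0.5343
32.523 9.0 2808 31.1242 0.5487
31.6421 10.0 3120 30.7433 0.5271
23.7328 60.0 18720 22.0684 0.4801
"""


def parse_row(line):
    """Split one log row; the training loss may be the two-word 'No log'."""
    train_loss, epoch, step, val_loss, acc = line.strip().rsplit(maxsplit=4)
    train_loss = train_loss.strip()
    return {
        "train_loss": None if train_loss == "No log" else float(train_loss),
        "epoch": float(epoch),
        "step": int(step),
        "val_loss": float(val_loss),
        "accuracy": float(acc),
    }


rows = [parse_row(line) for line in ROWS.splitlines() if line.strip()]

# Row with the highest evaluation accuracy among the parsed rows.
best = max(rows, key=lambda r: r["accuracy"])
final = rows[-1]

# Optimizer steps per epoch, taken from the first logged row.
steps_per_epoch = int(rows[0]["step"] / rows[0]["epoch"])

# Assumption: batch size 8, one device, no gradient accumulation,
# last partial batch kept. Then ceil(n / 8) == steps_per_epoch bounds n.
BATCH_SIZE = 8
min_train_examples = (steps_per_epoch - 1) * BATCH_SIZE + 1
max_train_examples = steps_per_epoch * BATCH_SIZE

print(best["epoch"], best["accuracy"], final["accuracy"])
print(steps_per_epoch, min_train_examples, max_train_examples)
```

In the full table, the highest accuracy is 0.5487 at epoch 9, while the final checkpoint reports 0.4801. Validation loss falls at every epoch, but accuracy moves between 0.4693 and 0.5487 without a clear trend, so selecting a checkpoint by loss and by accuracy would give different answers here.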

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
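
To compare a local environment against the versions listed above, a small check with the standard library's `importlib.metadata` may help. This is a sketch: the distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names, and stripping the `+cu117` local build tag before comparing is an assumption about what counts as a match. The helper names are hypothetical.

```python
from importlib import metadata

# Versions copied from the list above, keyed by PyPI distribution name.
PINNED = {
    "transformers": "4.30.0",
    "torch": "2.0.1+cu117",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def base_version(version):
    """Drop a local build tag such as '+cu117'."""
    return version.split("+", 1)[0]


def check_versions(pinned=PINNED):
    """Report the installed version of each package, or None if it is absent."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = {
            "wanted": wanted,
            "installed": installed,
            "match": installed is not None
            and base_version(installed) == base_version(wanted),
        }
    return report


def requirement_lines(pinned=PINNED):
    """Render pins in requirements.txt style, without local build tags."""
    return [f"{name}=={base_version(version)}" for name, version in pinned.items()]


for name, info in check_versions().items():
    print(name, info)
```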

Dataset used to train Onutoa/20230821213736