
20230822011123

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a brief usage sketch follows the results):

  • Loss: 12.7559
  • Accuracy: 0.4729
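
As a hedged illustration of loading this checkpoint, the sketch below assumes the hub repo id Onutoa/20230822011123 (from this model page) and a standard sequence-classification head; since the card does not say which super_glue task the model was fine-tuned on, the sentence-pair input is illustrative only.

```python
# Minimal inference sketch (assumptions: repo id "Onutoa/20230822011123",
# a sequence-classification head, and a sentence-pair super_glue task).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/20230822011123"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Illustrative sentence pair; the real input format depends on the
# (undocumented) super_glue task the model was trained on.
inputs = tokenizer("Is the sky blue?", "The sky is blue on a clear day.",
                   return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # class probabilities
```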

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
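
The listed hyperparameters map onto the Transformers Trainer API (version 4.30, as listed under Framework versions) roughly as shown below. The output directory and the rest of the Trainer wiring are placeholders, not taken from the card.

```python
# Sketch of how the listed hyperparameters translate to TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",             # placeholder, not from the card
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,               # the card's Adam betas/epsilon map to these
    adam_beta2=0.999,             # fields (Trainer's default optimizer is AdamW)
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",  # assumption, consistent with the per-epoch table below
)
```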

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 312 | 33.1111 | 0.4693 |
| 33.5632 | 2.0 | 624 | 29.4330 | 0.4729 |
| 33.5632 | 3.0 | 936 | 28.6575 | 0.4729 |
| 29.5796 | 4.0 | 1248 | 27.5594 | 0.4946 |
| 27.7947 | 5.0 | 1560 | 24.0011 | 0.4729 |
| 27.7947 | 6.0 | 1872 | 21.8497 | 0.5307 |
| 24.4291 | 7.0 | 2184 | 18.9382 | 0.5271 |
| 24.4291 | 8.0 | 2496 | 17.0228 | 0.5271 |
| 21.7331 | 9.0 | 2808 | 16.2191 | 0.5271 |
| 20.2434 | 10.0 | 3120 | 15.6640 | 0.5271 |
| 20.2434 | 11.0 | 3432 | 15.3209 | 0.4729 |
| 19.5791 | 12.0 | 3744 | 15.0367 | 0.4729 |
| 19.1759 | 13.0 | 4056 | 14.7859 | 0.4729 |
| 19.1759 | 14.0 | 4368 | 14.5689 | 0.4729 |
| 18.9129 | 15.0 | 4680 | 14.4199 | 0.4729 |
| 18.9129 | 16.0 | 4992 | 14.3070 | 0.5271 |
| 18.725 | 17.0 | 5304 | 14.2007 | 0.5271 |
| 18.5733 | 18.0 | 5616 | 14.0996 | 0.4729 |
| 18.5733 | 19.0 | 5928 | 14.0560 | 0.4729 |
| 18.4591 | 20.0 | 6240 | 13.9476 | 0.5271 |
| 18.3533 | 21.0 | 6552 | 13.8532 | 0.5271 |
| 18.3533 | 22.0 | 6864 | 13.8091 | 0.5271 |
| 18.2596 | 23.0 | 7176 | 13.7278 | 0.5271 |
| 18.2596 | 24.0 | 7488 | 13.6616 | 0.4729 |
| 18.1857 | 25.0 | 7800 | 13.5820 | 0.4729 |
| 18.1091 | 26.0 | 8112 | 13.5658 | 0.4729 |
| 18.1091 | 27.0 | 8424 | 13.4950 | 0.4729 |
| 18.0388 | 28.0 | 8736 | 13.4109 | 0.4729 |
| 17.9676 | 29.0 | 9048 | 13.3571 | 0.4729 |
| 17.9676 | 30.0 | 9360 | 13.3096 | 0.4729 |
| 17.9109 | 31.0 | 9672 | 13.2689 | 0.5271 |
| 17.9109 | 32.0 | 9984 | 13.2199 | 0.4729 |
| 17.8555 | 33.0 | 10296 | 13.1702 | 0.5271 |
| 17.7959 | 34.0 | 10608 | 13.1315 | 0.4729 |
| 17.7959 | 35.0 | 10920 | 13.0977 | 0.5271 |
| 17.7567 | 36.0 | 11232 | 13.0718 | 0.4729 |
| 17.718 | 37.0 | 11544 | 13.0244 | 0.4729 |
| 17.718 | 38.0 | 11856 | 13.0061 | 0.5271 |
| 17.6743 | 39.0 | 12168 | 12.9777 | 0.5271 |
| 17.6743 | 40.0 | 12480 | 12.9545 | 0.4729 |
| 17.6411 | 41.0 | 12792 | 12.9362 | 0.4729 |
| 17.6197 | 42.0 | 13104 | 12.9564 | 0.4729 |
| 17.6197 | 43.0 | 13416 | 12.8934 | 0.4729 |
| 17.598 | 44.0 | 13728 | 12.8824 | 0.4729 |
| 17.5669 | 45.0 | 14040 | 12.8925 | 0.4729 |
| 17.5669 | 46.0 | 14352 | 12.8567 | 0.4729 |
| 17.5513 | 47.0 | 14664 | 12.8525 | 0.4729 |
| 17.5513 | 48.0 | 14976 | 12.8268 | 0.5271 |
| 17.5412 | 49.0 | 15288 | 12.8100 | 0.4729 |
| 17.5282 | 50.0 | 15600 | 12.8056 | 0.4729 |
| 17.5282 | 51.0 | 15912 | 12.7995 | 0.4729 |
| 17.51 | 52.0 | 16224 | 12.7996 | 0.4729 |
| 17.5032 | 53.0 | 16536 | 12.7793 | 0.4729 |
| 17.5032 | 54.0 | 16848 | 12.7732 | 0.4729 |
| 17.4893 | 55.0 | 17160 | 12.7682 | 0.4729 |
| 17.4893 | 56.0 | 17472 | 12.7625 | 0.4729 |
| 17.4874 | 57.0 | 17784 | 12.7641 | 0.4729 |
| 17.4805 | 58.0 | 18096 | 12.7570 | 0.4729 |
| 17.4805 | 59.0 | 18408 | 12.7564 | 0.4729 |
| 17.4784 | 60.0 | 18720 | 12.7559 | 0.4729 |
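
Across the run, validation loss falls steadily from 33.1111 to 12.7559 while accuracy only oscillates between 0.4729 and 0.5307, so the headline accuracy should be read with that in mind. In principle, the final numbers could be re-checked with a short script like the sketch below; since the card does not say which super_glue task was used, the task name ("boolq") and its input fields are assumptions for illustration only.

```python
# Hypothetical reproduction sketch: the super_glue task is not stated in the
# card, so "boolq" and its question/passage fields are assumed for illustration.
import torch
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/20230822011123"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id).eval()

ds = load_dataset("super_glue", "boolq", split="validation")  # task assumed

correct = 0
for ex in ds:
    enc = tokenizer(ex["question"], ex["passage"],
                    truncation=True, return_tensors="pt")
    with torch.no_grad():
        pred = model(**enc).logits.argmax(dim=-1).item()
    correct += int(pred == ex["label"])
print(f"accuracy = {correct / len(ds):.4f}")
```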

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3