
Onutoa/20230822011246

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 12.0925
  • Accuracy: 0.4729
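
The checkpoint can be loaded with the standard Transformers API. Below is a minimal sketch, assuming a sequence-classification head (the card does not state which SuperGLUE task or head was used, and the example inputs are placeholders):

    # Minimal loading sketch; AutoModelForSequenceClassification is an
    # assumption, since the model card does not state the task head.
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    model_id = "Onutoa/20230822011246"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)

    # Placeholder sentence pair; most SuperGLUE tasks take text pairs.
    inputs = tokenizer("Is the sky blue?", "The sky appears blue in daylight.",
                       return_tensors="pt")
    predicted_class = model(**inputs).logits.argmax(dim=-1).item()
    print(predicted_class)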

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent TrainingArguments follows the list):

  • learning_rate: 0.01
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
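
For reference, a hedged reconstruction of these settings as Transformers 4.30 TrainingArguments; the actual training script is not published, so anything not listed above is a placeholder:

    # Sketch of the hyperparameters above; output_dir and any other
    # unlisted arguments are placeholder assumptions.
    from transformers import TrainingArguments

    training_args = TrainingArguments(
        output_dir="bert-large-cased-super-glue",  # placeholder path
        learning_rate=0.01,
        per_device_train_batch_size=8,
        per_device_eval_batch_size=8,
        seed=11,
        lr_scheduler_type="linear",
        num_train_epochs=60.0,
        # Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the
        # Trainer's default AdamW optimizer, so no override is needed.
    )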

Training results

Training Loss | Epoch | Step  | Validation Loss | Accuracy
No log        | 1.0   | 312   | 23.5855         | 0.5271
27.3295       | 2.0   | 624   | 15.7672         | 0.4729
27.3295       | 3.0   | 936   | 14.1816         | 0.5271
19.6736       | 4.0   | 1248  | 13.5811         | 0.4729
18.8481       | 5.0   | 1560  | 13.3851         | 0.4729
18.8481       | 6.0   | 1872  | 13.0199         | 0.4729
18.5899       | 7.0   | 2184  | 12.9497         | 0.4838
18.5899       | 8.0   | 2496  | 12.9961         | 0.4729
18.473        | 9.0   | 2808  | 12.8275         | 0.4729
18.3073       | 10.0  | 3120  | 12.6992         | 0.4729
18.3073       | 11.0  | 3432  | 13.5160         | 0.5271
18.2739       | 12.0  | 3744  | 12.6731         | 0.5307
18.1236       | 13.0  | 4056  | 12.6066         | 0.4729
18.1236       | 14.0  | 4368  | 12.5802         | 0.4729
18.1096       | 15.0  | 4680  | 12.6447         | 0.5271
18.1096       | 16.0  | 4992  | 13.3094         | 0.4729
18.1134       | 17.0  | 5304  | 13.0970         | 0.5271
18.1098       | 18.0  | 5616  | 12.7293         | 0.5271
18.1098       | 19.0  | 5928  | 12.6166         | 0.5271
18.0277       | 20.0  | 6240  | 12.5606         | 0.4729
18.0289       | 21.0  | 6552  | 12.5322         | 0.4729
18.0289       | 22.0  | 6864  | 12.7341         | 0.5271
18.0223       | 23.0  | 7176  | 12.5497         | 0.4729
18.0223       | 24.0  | 7488  | 12.4199         | 0.5271
17.9317       | 25.0  | 7800  | 12.7868         | 0.5271
17.9107       | 26.0  | 8112  | 12.3295         | 0.4729
17.9107       | 27.0  | 8424  | 12.6038         | 0.4729
17.8944       | 28.0  | 8736  | 12.3329         | 0.5271
17.8667       | 29.0  | 9048  | 12.3034         | 0.5271
17.8667       | 30.0  | 9360  | 12.4605         | 0.5271
17.8228       | 31.0  | 9672  | 12.5110         | 0.4729
17.8228       | 32.0  | 9984  | 12.4227         | 0.5271
17.8006       | 33.0  | 10296 | 12.2972         | 0.4729
17.76         | 34.0  | 10608 | 12.3011         | 0.4729
17.76         | 35.0  | 10920 | 12.2179         | 0.4729
17.7564       | 36.0  | 11232 | 12.2381         | 0.4729
17.7084       | 37.0  | 11544 | 12.8747         | 0.4729
17.7084       | 38.0  | 11856 | 12.1945         | 0.4729
17.7035       | 39.0  | 12168 | 12.2180         | 0.4729
17.7035       | 40.0  | 12480 | 12.2830         | 0.4729
17.6668       | 41.0  | 12792 | 12.1857         | 0.4693
17.6396       | 42.0  | 13104 | 12.2239         | 0.5379
17.6396       | 43.0  | 13416 | 12.1584         | 0.5271
17.6452       | 44.0  | 13728 | 12.3185         | 0.4729
17.6074       | 45.0  | 14040 | 12.2421         | 0.5271
17.6074       | 46.0  | 14352 | 12.1912         | 0.4729
17.6167       | 47.0  | 14664 | 12.2022         | 0.5271
17.6167       | 48.0  | 14976 | 12.1326         | 0.4729
17.5782       | 49.0  | 15288 | 12.1550         | 0.4729
17.562        | 50.0  | 15600 | 12.2250         | 0.5271
17.562        | 51.0  | 15912 | 12.1190         | 0.4729
17.5409       | 52.0  | 16224 | 12.1505         | 0.5271
17.5211       | 53.0  | 16536 | 12.1046         | 0.4729
17.5211       | 54.0  | 16848 | 12.1132         | 0.5271
17.5043       | 55.0  | 17160 | 12.1159         | 0.4729
17.5043       | 56.0  | 17472 | 12.1085         | 0.5271
17.4952       | 57.0  | 17784 | 12.1024         | 0.4729
17.4731       | 58.0  | 18096 | 12.0955         | 0.4729
17.4731       | 59.0  | 18408 | 12.0981         | 0.5271
17.4654       | 60.0  | 18720 | 12.0925         | 0.4729
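
Note that accuracy oscillates between 0.4729 and 0.5271 for the entire run; the two values sum to 1.0, which suggests the model alternates between predicting each of the two classes of a binary task. A hedged sketch for re-running the evaluation follows; "boolq" is purely a placeholder assumption, since the card does not name the super_glue config:

    # Evaluation sketch. "boolq" is an assumption: the model card does
    # not say which super_glue config was used, and the field names
    # below (question/passage/label) are specific to BoolQ.
    import torch
    from datasets import load_dataset
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    model_id = "Onutoa/20230822011246"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    dataset = load_dataset("super_glue", "boolq", split="validation")
    correct = 0
    for example in dataset:
        encoded = tokenizer(example["question"], example["passage"],
                            truncation=True, return_tensors="pt")
        with torch.no_grad():
            prediction = model(**encoded).logits.argmax(dim=-1).item()
        correct += int(prediction == example["label"])
    print(f"accuracy: {correct / len(dataset):.4f}")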

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3