20230825070638

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3456
  • Accuracy: 0.7329
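
As a quick sanity check, the checkpoint can be loaded like any Hugging Face sequence-classification model. The sketch below is hedged: the repository id dkqjrm/20230825070638 comes from this card, but the specific SuperGLUE sub-task (and therefore the correct input format and label meaning) is not documented here, so the sentence pair is purely illustrative.

```python
# Minimal inference sketch. Assumes the checkpoint is a two-way
# sentence-pair classifier; the actual SuperGLUE sub-task is not
# documented in this card, so the inputs below are placeholders.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "dkqjrm/20230825070638"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)

# Hypothetical sentence pair; real inputs depend on the sub-task format.
inputs = tokenizer("The cat sat on the mat.", "A cat is sitting.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class id
```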

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
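
For reproducibility, here is a minimal sketch of how the values above map onto transformers.TrainingArguments. The output_dir is a placeholder, and the dataset/Trainer wiring is omitted, since neither is documented in this card.

```python
# Sketch only: reproduces the listed hyperparameters; everything else
# (output_dir, dataset, model wiring) is an assumption.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="out",                 # hypothetical path
    learning_rate=5e-3,               # 0.005, as listed above
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
)
```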

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 156   | 0.7894          | 0.5271   |
| No log        | 2.0   | 312   | 0.6658          | 0.5379   |
| No log        | 3.0   | 468   | 0.6408          | 0.5054   |
| 0.886         | 4.0   | 624   | 0.7134          | 0.4729   |
| 0.886         | 5.0   | 780   | 0.6234          | 0.5560   |
| 0.886         | 6.0   | 936   | 0.4782          | 0.6318   |
| 0.7765        | 7.0   | 1092  | 1.1394          | 0.5776   |
| 0.7765        | 8.0   | 1248  | 0.5214          | 0.6534   |
| 0.7765        | 9.0   | 1404  | 0.4206          | 0.6570   |
| 0.7206        | 10.0  | 1560  | 0.5019          | 0.6643   |
| 0.7206        | 11.0  | 1716  | 0.7680          | 0.5343   |
| 0.7206        | 12.0  | 1872  | 0.3433          | 0.7220   |
| 0.6543        | 13.0  | 2028  | 0.3834          | 0.7292   |
| 0.6543        | 14.0  | 2184  | 0.4588          | 0.6751   |
| 0.6543        | 15.0  | 2340  | 0.3413          | 0.7040   |
| 0.6543        | 16.0  | 2496  | 0.4874          | 0.6426   |
| 0.5973        | 17.0  | 2652  | 0.3283          | 0.7256   |
| 0.5973        | 18.0  | 2808  | 0.3605          | 0.7329   |
| 0.5973        | 19.0  | 2964  | 0.3314          | 0.7256   |
| 0.5433        | 20.0  | 3120  | 0.5998          | 0.6606   |
| 0.5433        | 21.0  | 3276  | 0.3489          | 0.6931   |
| 0.5433        | 22.0  | 3432  | 0.4316          | 0.6715   |
| 0.5373        | 23.0  | 3588  | 0.3328          | 0.7076   |
| 0.5373        | 24.0  | 3744  | 0.3379          | 0.7220   |
| 0.5373        | 25.0  | 3900  | 0.3580          | 0.7148   |
| 0.4923        | 26.0  | 4056  | 0.3141          | 0.7329   |
| 0.4923        | 27.0  | 4212  | 0.4341          | 0.7365   |
| 0.4923        | 28.0  | 4368  | 0.3386          | 0.7220   |
| 0.4513        | 29.0  | 4524  | 0.3038          | 0.7220   |
| 0.4513        | 30.0  | 4680  | 0.3775          | 0.7220   |
| 0.4513        | 31.0  | 4836  | 0.4197          | 0.7076   |
| 0.4513        | 32.0  | 4992  | 0.4666          | 0.7220   |
| 0.4041        | 33.0  | 5148  | 0.3355          | 0.7365   |
| 0.4041        | 34.0  | 5304  | 0.3147          | 0.7329   |
| 0.4041        | 35.0  | 5460  | 0.3810          | 0.7184   |
| 0.3705        | 36.0  | 5616  | 0.3184          | 0.7256   |
| 0.3705        | 37.0  | 5772  | 0.3668          | 0.7076   |
| 0.3705        | 38.0  | 5928  | 0.3859          | 0.7220   |
| 0.3556        | 39.0  | 6084  | 0.3010          | 0.7329   |
| 0.3556        | 40.0  | 6240  | 0.3201          | 0.7220   |
| 0.3556        | 41.0  | 6396  | 0.3304          | 0.7329   |
| 0.3089        | 42.0  | 6552  | 0.3634          | 0.7365   |
| 0.3089        | 43.0  | 6708  | 0.3844          | 0.7184   |
| 0.3089        | 44.0  | 6864  | 0.3320          | 0.7220   |
| 0.3015        | 45.0  | 7020  | 0.3696          | 0.7220   |
| 0.3015        | 46.0  | 7176  | 0.3665          | 0.7220   |
| 0.3015        | 47.0  | 7332  | 0.3355          | 0.7256   |
| 0.3015        | 48.0  | 7488  | 0.3568          | 0.7292   |
| 0.2709        | 49.0  | 7644  | 0.3450          | 0.7329   |
| 0.2709        | 50.0  | 7800  | 0.3790          | 0.7148   |
| 0.2709        | 51.0  | 7956  | 0.3516          | 0.7112   |
| 0.2681        | 52.0  | 8112  | 0.3741          | 0.7329   |
| 0.2681        | 53.0  | 8268  | 0.3615          | 0.7220   |
| 0.2681        | 54.0  | 8424  | 0.3479          | 0.7292   |
| 0.2477        | 55.0  | 8580  | 0.3401          | 0.7184   |
| 0.2477        | 56.0  | 8736  | 0.3766          | 0.7329   |
| 0.2477        | 57.0  | 8892  | 0.3562          | 0.7148   |
| 0.2344        | 58.0  | 9048  | 0.3412          | 0.7220   |
| 0.2344        | 59.0  | 9204  | 0.3782          | 0.7437   |
| 0.2344        | 60.0  | 9360  | 0.3723          | 0.7040   |
| 0.2126        | 61.0  | 9516  | 0.3852          | 0.7292   |
| 0.2126        | 62.0  | 9672  | 0.3901          | 0.7256   |
| 0.2126        | 63.0  | 9828  | 0.3698          | 0.7112   |
| 0.2126        | 64.0  | 9984  | 0.3249          | 0.7220   |
| 0.2127        | 65.0  | 10140 | 0.3979          | 0.7004   |
| 0.2127        | 66.0  | 10296 | 0.3705          | 0.7365   |
| 0.2127        | 67.0  | 10452 | 0.3317          | 0.7220   |
| 0.199         | 68.0  | 10608 | 0.3322          | 0.7329   |
| 0.199         | 69.0  | 10764 | 0.3706          | 0.7220   |
| 0.199         | 70.0  | 10920 | 0.3628          | 0.7148   |
| 0.1959        | 71.0  | 11076 | 0.3600          | 0.7437   |
| 0.1959        | 72.0  | 11232 | 0.3349          | 0.7437   |
| 0.1959        | 73.0  | 11388 | 0.3650          | 0.7184   |
| 0.184         | 74.0  | 11544 | 0.3337          | 0.7365   |
| 0.184         | 75.0  | 11700 | 0.3309          | 0.7329   |
| 0.184         | 76.0  | 11856 | 0.3237          | 0.7365   |
| 0.183         | 77.0  | 12012 | 0.3430          | 0.7256   |
| 0.183         | 78.0  | 12168 | 0.3567          | 0.7329   |
| 0.183         | 79.0  | 12324 | 0.3541          | 0.7329   |
| 0.183         | 80.0  | 12480 | 0.3456          | 0.7329   |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
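
To approximate the training environment, the versions above can be pinned at install time. This is a sketch that assumes the CUDA 11.8 PyTorch wheel index, matching the 2.0.1+cu118 build listed above.

```
pip install transformers==4.26.1 datasets==2.12.0 tokenizers==0.13.3
pip install torch==2.0.1 --index-url https://download.pytorch.org/whl/cu118
```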