
20230824083011

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3090
  • Accuracy: 0.7401

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
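As a minimal sketch of how the `linear` scheduler above behaves (assuming zero warmup steps, since the card lists none), the learning rate decays linearly from 0.003 to 0 over the 37,380 total training steps shown in the results table:

```python
def linear_lr(step, total_steps=37380, base_lr=0.003):
    """Learning rate at a given step under a warmup-free linear decay schedule.

    total_steps and base_lr are taken from this card; the zero-warmup
    assumption matches the default when no warmup is configured.
    """
    return base_lr * max(0.0, (total_steps - step) / total_steps)

# At step 0 the rate is the full 0.003; at the midpoint it has halved;
# at the final step it reaches 0.
```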

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 0.7501        | 1.0   | 623   | 0.9859          | 0.4729   |
| 0.6252        | 2.0   | 1246  | 0.4891          | 0.4801   |
| 0.5769        | 3.0   | 1869  | 1.1271          | 0.4729   |
| 0.5672        | 4.0   | 2492  | 0.4257          | 0.5632   |
| 0.5439        | 5.0   | 3115  | 0.5883          | 0.5415   |
| 0.5426        | 6.0   | 3738  | 0.3734          | 0.6245   |
| 0.61          | 7.0   | 4361  | 0.4410          | 0.5848   |
| 0.4937        | 8.0   | 4984  | 0.4091          | 0.5632   |
| 0.4293        | 9.0   | 5607  | 0.3712          | 0.6282   |
| 0.3897        | 10.0  | 6230  | 0.3441          | 0.6931   |
| 0.3759        | 11.0  | 6853  | 0.3400          | 0.7004   |
| 0.379         | 12.0  | 7476  | 0.3802          | 0.6787   |
| 0.3661        | 13.0  | 8099  | 0.3456          | 0.7184   |
| 0.374         | 14.0  | 8722  | 0.3545          | 0.6859   |
| 0.3441        | 15.0  | 9345  | 0.3219          | 0.7112   |
| 0.3339        | 16.0  | 9968  | 0.3192          | 0.7184   |
| 0.3324        | 17.0  | 10591 | 0.3290          | 0.7184   |
| 0.324         | 18.0  | 11214 | 0.3284          | 0.7112   |
| 0.3641        | 19.0  | 11837 | 0.3100          | 0.7292   |
| 0.3138        | 20.0  | 12460 | 0.3102          | 0.7365   |
| 0.3099        | 21.0  | 13083 | 0.3887          | 0.7076   |
| 0.3095        | 22.0  | 13706 | 0.3443          | 0.7004   |
| 0.3039        | 23.0  | 14329 | 0.3937          | 0.6895   |
| 0.287         | 24.0  | 14952 | 0.3071          | 0.7473   |
| 0.2718        | 25.0  | 15575 | 0.3097          | 0.7184   |
| 0.2711        | 26.0  | 16198 | 0.2888          | 0.7329   |
| 0.2738        | 27.0  | 16821 | 0.2920          | 0.7220   |
| 0.2697        | 28.0  | 17444 | 0.2986          | 0.7329   |
| 0.2589        | 29.0  | 18067 | 0.3092          | 0.7437   |
| 0.2536        | 30.0  | 18690 | 0.3141          | 0.7292   |
| 0.2564        | 31.0  | 19313 | 0.3134          | 0.7401   |
| 0.2493        | 32.0  | 19936 | 0.2962          | 0.7365   |
| 0.2428        | 33.0  | 20559 | 0.3358          | 0.7256   |
| 0.2425        | 34.0  | 21182 | 0.3155          | 0.7148   |
| 0.2342        | 35.0  | 21805 | 0.3000          | 0.7220   |
| 0.2394        | 36.0  | 22428 | 0.2955          | 0.7329   |
| 0.2257        | 37.0  | 23051 | 0.3070          | 0.7509   |
| 0.2272        | 38.0  | 23674 | 0.2959          | 0.7365   |
| 0.2197        | 39.0  | 24297 | 0.3100          | 0.7401   |
| 0.2144        | 40.0  | 24920 | 0.3009          | 0.7365   |
| 0.2164        | 41.0  | 25543 | 0.2957          | 0.7256   |
| 0.2129        | 42.0  | 26166 | 0.3133          | 0.7292   |
| 0.2106        | 43.0  | 26789 | 0.3110          | 0.7329   |
| 0.2069        | 44.0  | 27412 | 0.3072          | 0.7329   |
| 0.2051        | 45.0  | 28035 | 0.3300          | 0.7292   |
| 0.2064        | 46.0  | 28658 | 0.3106          | 0.7256   |
| 0.2039        | 47.0  | 29281 | 0.3114          | 0.7292   |
| 0.2106        | 48.0  | 29904 | 0.3180          | 0.7365   |
| 0.2008        | 49.0  | 30527 | 0.3099          | 0.7329   |
| 0.1945        | 50.0  | 31150 | 0.3066          | 0.7329   |
| 0.1958        | 51.0  | 31773 | 0.3124          | 0.7401   |
| 0.1939        | 52.0  | 32396 | 0.3230          | 0.7401   |
| 0.1942        | 53.0  | 33019 | 0.3105          | 0.7365   |
| 0.1887        | 54.0  | 33642 | 0.3014          | 0.7256   |
| 0.185         | 55.0  | 34265 | 0.3052          | 0.7365   |
| 0.1868        | 56.0  | 34888 | 0.3155          | 0.7365   |
| 0.1888        | 57.0  | 35511 | 0.3056          | 0.7256   |
| 0.1885        | 58.0  | 36134 | 0.3069          | 0.7329   |
| 0.192         | 59.0  | 36757 | 0.3076          | 0.7329   |
| 0.1807        | 60.0  | 37380 | 0.3090          | 0.7401   |
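The step counts in the table are consistent with the hyperparameters above. A quick sanity check (the inferred training-set size is an estimate; the card does not name the SuperGLUE subset used, though ~2,490 examples matches RTE's training split):

```python
total_steps = 37380        # final Step value in the table
num_epochs = 60            # from the hyperparameters
train_batch_size = 4       # from the hyperparameters

steps_per_epoch = total_steps // num_epochs
# Upper bound on training examples: the last batch of an epoch may be partial.
approx_train_examples = steps_per_epoch * train_batch_size
```

This gives 623 steps per epoch and at most 2,492 training examples.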

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3