
20230824023615

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0725
  • Accuracy: 0.7365

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
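With `lr_scheduler_type: linear`, the learning rate decays linearly from `learning_rate` toward zero over the full run. A minimal sketch of that schedule, assuming no warmup (the card does not state a warmup setting) and using the 312 optimizer steps per epoch visible in the results table below:

```python
def linear_lr(step, base_lr=0.003, total_steps=60 * 312):
    """Learning rate under a linear decay schedule with no warmup (assumption).

    total_steps = num_epochs * steps_per_epoch = 60 * 312 = 18720,
    matching the final step count in the training-results table.
    """
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))      # 0.003 at the start of training
print(linear_lr(9360))   # 0.0015 at the halfway point (end of epoch 30)
print(linear_lr(18720))  # 0.0 at the final step
```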

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.6124          | 0.5271   |
| 0.3459        | 2.0   | 624   | 0.2937          | 0.4729   |
| 0.3459        | 3.0   | 936   | 0.4930          | 0.4693   |
| 0.2482        | 4.0   | 1248  | 0.1965          | 0.4693   |
| 0.2242        | 5.0   | 1560  | 0.2537          | 0.4693   |
| 0.2242        | 6.0   | 1872  | 0.1661          | 0.5632   |
| 0.2359        | 7.0   | 2184  | 0.1414          | 0.6570   |
| 0.2359        | 8.0   | 2496  | 0.1893          | 0.5018   |
| 0.2404        | 9.0   | 2808  | 0.1265          | 0.6173   |
| 0.2198        | 10.0  | 3120  | 0.1214          | 0.6679   |
| 0.2198        | 11.0  | 3432  | 0.1352          | 0.6029   |
| 0.1657        | 12.0  | 3744  | 0.1030          | 0.7040   |
| 0.1472        | 13.0  | 4056  | 0.1043          | 0.6931   |
| 0.1472        | 14.0  | 4368  | 0.1011          | 0.7004   |
| 0.1408        | 15.0  | 4680  | 0.1111          | 0.7148   |
| 0.1408        | 16.0  | 4992  | 0.1046          | 0.6931   |
| 0.1321        | 17.0  | 5304  | 0.0964          | 0.7004   |
| 0.1285        | 18.0  | 5616  | 0.1019          | 0.7220   |
| 0.1285        | 19.0  | 5928  | 0.0927          | 0.7256   |
| 0.1244        | 20.0  | 6240  | 0.0972          | 0.7004   |
| 0.1191        | 21.0  | 6552  | 0.0947          | 0.7076   |
| 0.1191        | 22.0  | 6864  | 0.0983          | 0.7184   |
| 0.1129        | 23.0  | 7176  | 0.1029          | 0.7040   |
| 0.1129        | 24.0  | 7488  | 0.0993          | 0.7112   |
| 0.1115        | 25.0  | 7800  | 0.0933          | 0.7076   |
| 0.1079        | 26.0  | 8112  | 0.1092          | 0.6931   |
| 0.1079        | 27.0  | 8424  | 0.0837          | 0.7437   |
| 0.105         | 28.0  | 8736  | 0.0825          | 0.7256   |
| 0.1049        | 29.0  | 9048  | 0.0809          | 0.7148   |
| 0.1049        | 30.0  | 9360  | 0.0924          | 0.7256   |
| 0.1021        | 31.0  | 9672  | 0.0820          | 0.7292   |
| 0.1021        | 32.0  | 9984  | 0.0793          | 0.7256   |
| 0.099         | 33.0  | 10296 | 0.0820          | 0.7365   |
| 0.0966        | 34.0  | 10608 | 0.0831          | 0.7184   |
| 0.0966        | 35.0  | 10920 | 0.0796          | 0.7256   |
| 0.0928        | 36.0  | 11232 | 0.0790          | 0.7292   |
| 0.0888        | 37.0  | 11544 | 0.0953          | 0.7256   |
| 0.0888        | 38.0  | 11856 | 0.0791          | 0.7437   |
| 0.0905        | 39.0  | 12168 | 0.0849          | 0.7473   |
| 0.0905        | 40.0  | 12480 | 0.0782          | 0.7401   |
| 0.0872        | 41.0  | 12792 | 0.0754          | 0.7292   |
| 0.0853        | 42.0  | 13104 | 0.0770          | 0.7365   |
| 0.0853        | 43.0  | 13416 | 0.0742          | 0.7473   |
| 0.0843        | 44.0  | 13728 | 0.0764          | 0.7220   |
| 0.0826        | 45.0  | 14040 | 0.0765          | 0.7256   |
| 0.0826        | 46.0  | 14352 | 0.0746          | 0.7365   |
| 0.0811        | 47.0  | 14664 | 0.0736          | 0.7292   |
| 0.0811        | 48.0  | 14976 | 0.0824          | 0.7292   |
| 0.079         | 49.0  | 15288 | 0.0749          | 0.7401   |
| 0.0783        | 50.0  | 15600 | 0.0734          | 0.7401   |
| 0.0783        | 51.0  | 15912 | 0.0740          | 0.7401   |
| 0.0806        | 52.0  | 16224 | 0.0749          | 0.7365   |
| 0.078         | 53.0  | 16536 | 0.0729          | 0.7365   |
| 0.078         | 54.0  | 16848 | 0.0728          | 0.7401   |
| 0.0764        | 55.0  | 17160 | 0.0722          | 0.7437   |
| 0.0764        | 56.0  | 17472 | 0.0745          | 0.7365   |
| 0.0766        | 57.0  | 17784 | 0.0730          | 0.7329   |
| 0.0751        | 58.0  | 18096 | 0.0725          | 0.7401   |
| 0.0751        | 59.0  | 18408 | 0.0730          | 0.7365   |
| 0.0765        | 60.0  | 18720 | 0.0725          | 0.7365   |
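The step counts in the table are internally consistent: each epoch adds 312 optimizer steps, which at `train_batch_size: 8` implies roughly 2496 training examples (an upper bound, since the last batch of an epoch may be partial). A quick sanity check:

```python
steps_per_epoch = 312        # from the table: step 312 at epoch 1.0
train_batch_size = 8         # from the hyperparameters above
num_epochs = 60

approx_train_examples = steps_per_epoch * train_batch_size
total_steps = steps_per_epoch * num_epochs

print(approx_train_examples)  # 2496
print(total_steps)            # 18720, matching the final row of the table
```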

Framework versions

  • Transformers 4.26.1
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3