
20230824002458

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0768
  • Accuracy: 0.7112
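The reported accuracy is the fraction of evaluation examples whose predicted label matches the gold label. A minimal, self-contained sketch of that metric (the example label lists below are illustrative, not taken from the actual super_glue evaluation set):

```python
def accuracy(predictions, references):
    """Fraction of predictions that equal the corresponding reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)

# Illustrative labels only: 3 of 4 predictions match the references.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.75
```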

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
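With `lr_scheduler_type: linear`, the learning rate decays linearly from its initial value toward zero over the course of training. A minimal sketch of that schedule, assuming no warmup and using the 37,380 total optimizer steps shown in the results table (60 epochs × 623 steps per epoch):

```python
BASE_LR = 0.003       # learning_rate above
TOTAL_STEPS = 37380   # final step count from the training results table

def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS):
    """Linearly decay the learning rate from base_lr to 0 (no warmup assumed)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))            # 0.003 at the start of training
print(linear_lr(TOTAL_STEPS))  # 0.0 at the final step
```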

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|---------------|-------|-------|-----------------|----------|
| 0.4042        | 1.0   | 623   | 0.3862          | 0.5271   |
| 0.3203        | 2.0   | 1246  | 1.0958          | 0.4729   |
| 0.3087        | 3.0   | 1869  | 0.5979          | 0.4729   |
| 0.2723        | 4.0   | 2492  | 0.1618          | 0.5271   |
| 0.2635        | 5.0   | 3115  | 0.2704          | 0.5343   |
| 0.2826        | 6.0   | 3738  | 0.3245          | 0.4729   |
| 0.2663        | 7.0   | 4361  | 0.2230          | 0.5957   |
| 0.2562        | 8.0   | 4984  | 0.1453          | 0.6390   |
| 0.2259        | 9.0   | 5607  | 0.1312          | 0.6282   |
| 0.1806        | 10.0  | 6230  | 0.1118          | 0.7148   |
| 0.1525        | 11.0  | 6853  | 0.1076          | 0.6787   |
| 0.1509        | 12.0  | 7476  | 0.1241          | 0.6643   |
| 0.149         | 13.0  | 8099  | 0.1158          | 0.6931   |
| 0.1509        | 14.0  | 8722  | 0.1154          | 0.7040   |
| 0.1397        | 15.0  | 9345  | 0.1096          | 0.6823   |
| 0.1311        | 16.0  | 9968  | 0.0999          | 0.6751   |
| 0.13          | 17.0  | 10591 | 0.0986          | 0.6968   |
| 0.1244        | 18.0  | 11214 | 0.1063          | 0.6895   |
| 0.1278        | 19.0  | 11837 | 0.1229          | 0.6931   |
| 0.1228        | 20.0  | 12460 | 0.0905          | 0.7112   |
| 0.1153        | 21.0  | 13083 | 0.0916          | 0.7004   |
| 0.1171        | 22.0  | 13706 | 0.1085          | 0.7148   |
| 0.1179        | 23.0  | 14329 | 0.1101          | 0.7256   |
| 0.1069        | 24.0  | 14952 | 0.0917          | 0.6895   |
| 0.1019        | 25.0  | 15575 | 0.0837          | 0.7112   |
| 0.1017        | 26.0  | 16198 | 0.0832          | 0.7148   |
| 0.1034        | 27.0  | 16821 | 0.0847          | 0.7220   |
| 0.0989        | 28.0  | 17444 | 0.0830          | 0.7256   |
| 0.0969        | 29.0  | 18067 | 0.0817          | 0.7148   |
| 0.0964        | 30.0  | 18690 | 0.0835          | 0.7112   |
| 0.0957        | 31.0  | 19313 | 0.0846          | 0.7148   |
| 0.0937        | 32.0  | 19936 | 0.0827          | 0.7112   |
| 0.0895        | 33.0  | 20559 | 0.0860          | 0.7220   |
| 0.0905        | 34.0  | 21182 | 0.0830          | 0.7220   |
| 0.0875        | 35.0  | 21805 | 0.0796          | 0.7184   |
| 0.0895        | 36.0  | 22428 | 0.0811          | 0.7076   |
| 0.0861        | 37.0  | 23051 | 0.0805          | 0.7112   |
| 0.0868        | 38.0  | 23674 | 0.0786          | 0.7040   |
| 0.0798        | 39.0  | 24297 | 0.0787          | 0.7148   |
| 0.0827        | 40.0  | 24920 | 0.0815          | 0.7112   |
| 0.0798        | 41.0  | 25543 | 0.0790          | 0.7184   |
| 0.079         | 42.0  | 26166 | 0.0813          | 0.7220   |
| 0.0794        | 43.0  | 26789 | 0.0802          | 0.7112   |
| 0.0766        | 44.0  | 27412 | 0.0796          | 0.7076   |
| 0.0766        | 45.0  | 28035 | 0.0813          | 0.7329   |
| 0.0765        | 46.0  | 28658 | 0.0810          | 0.7112   |
| 0.0744        | 47.0  | 29281 | 0.0781          | 0.7148   |
| 0.076         | 48.0  | 29904 | 0.0794          | 0.7148   |
| 0.0728        | 49.0  | 30527 | 0.0780          | 0.7112   |
| 0.0745        | 50.0  | 31150 | 0.0767          | 0.7256   |
| 0.0711        | 51.0  | 31773 | 0.0771          | 0.7220   |
| 0.0726        | 52.0  | 32396 | 0.0772          | 0.7256   |
| 0.0747        | 53.0  | 33019 | 0.0772          | 0.7184   |
| 0.0711        | 54.0  | 33642 | 0.0772          | 0.7256   |
| 0.0676        | 55.0  | 34265 | 0.0767          | 0.7329   |
| 0.0697        | 56.0  | 34888 | 0.0783          | 0.7220   |
| 0.0692        | 57.0  | 35511 | 0.0766          | 0.7184   |
| 0.067         | 58.0  | 36134 | 0.0773          | 0.7148   |
| 0.0676        | 59.0  | 36757 | 0.0774          | 0.7112   |
| 0.0678        | 60.0  | 37380 | 0.0768          | 0.7112   |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
