20230824043649

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0771
  • Accuracy: 0.7365

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
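To make the schedule concrete, here is a minimal sketch of how the learning rate would decay under a linear scheduler. It assumes no warmup steps (the card lists none) and a total of 37380 optimizer steps, which is the 623 steps per epoch from the training log times 60 epochs. The function name `linear_lr` is illustrative and is not part of Transformers.

```python
BASE_LR = 0.003          # learning_rate from the card
STEPS_PER_EPOCH = 623    # step count per epoch, from the training log
NUM_EPOCHS = 60          # num_epochs from the card
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 37380

def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate after `step` optimizer updates under linear warmup and decay.

    warmup_steps=0 is an assumption; the card does not report a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

if __name__ == "__main__":
    for epoch in (0, 15, 30, 45, 60):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.6f}")
```

Under these assumptions the rate starts at 0.003, is halved by the end of epoch 30, and reaches zero at the final step.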

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 0.4513 | 1.0 | 623 | 0.4036 | 0.4729 |
| 0.321 | 2.0 | 1246 | 0.3454 | 0.4729 |
| 0.339 | 3.0 | 1869 | 0.1727 | 0.5271 |
| 0.3594 | 4.0 | 2492 | 0.4321 | 0.4729 |
| 0.3103 | 5.0 | 3115 | 0.2311 | 0.5415 |
| 0.3042 | 6.0 | 3738 | 0.1428 | 0.6679 |
| 0.2996 | 7.0 | 4361 | 0.2423 | 0.5668 |
| 0.274 | 8.0 | 4984 | 0.1331 | 0.6895 |
| 0.2824 | 9.0 | 5607 | 0.1173 | 0.6931 |
| 0.2458 | 10.0 | 6230 | 0.1350 | 0.6968 |
| 0.2005 | 11.0 | 6853 | 0.1456 | 0.5884 |
| 0.1689 | 12.0 | 7476 | 0.1289 | 0.6787 |
| 0.1644 | 13.0 | 8099 | 0.1109 | 0.6931 |
| 0.1578 | 14.0 | 8722 | 0.1143 | 0.7040 |
| 0.1502 | 15.0 | 9345 | 0.1178 | 0.6968 |
| 0.141 | 16.0 | 9968 | 0.0974 | 0.6968 |
| 0.1365 | 17.0 | 10591 | 0.0980 | 0.6787 |
| 0.1327 | 18.0 | 11214 | 0.1128 | 0.6931 |
| 0.1352 | 19.0 | 11837 | 0.1543 | 0.6390 |
| 0.1324 | 20.0 | 12460 | 0.0938 | 0.7184 |
| 0.1274 | 21.0 | 13083 | 0.0907 | 0.7112 |
| 0.1244 | 22.0 | 13706 | 0.1093 | 0.7112 |
| 0.1227 | 23.0 | 14329 | 0.1061 | 0.7076 |
| 0.1142 | 24.0 | 14952 | 0.0972 | 0.7112 |
| 0.1094 | 25.0 | 15575 | 0.0872 | 0.7184 |
| 0.1099 | 26.0 | 16198 | 0.0904 | 0.7292 |
| 0.1086 | 27.0 | 16821 | 0.0912 | 0.7040 |
| 0.1083 | 28.0 | 17444 | 0.0850 | 0.7148 |
| 0.1061 | 29.0 | 18067 | 0.0832 | 0.7184 |
| 0.1008 | 30.0 | 18690 | 0.0951 | 0.7292 |
| 0.1036 | 31.0 | 19313 | 0.0879 | 0.7220 |
| 0.1024 | 32.0 | 19936 | 0.0850 | 0.7220 |
| 0.0945 | 33.0 | 20559 | 0.0828 | 0.7220 |
| 0.0961 | 34.0 | 21182 | 0.0838 | 0.7329 |
| 0.0935 | 35.0 | 21805 | 0.0814 | 0.7256 |
| 0.097 | 36.0 | 22428 | 0.0812 | 0.7329 |
| 0.0925 | 37.0 | 23051 | 0.0810 | 0.7292 |
| 0.0911 | 38.0 | 23674 | 0.0826 | 0.7256 |
| 0.0855 | 39.0 | 24297 | 0.0815 | 0.7329 |
| 0.0895 | 40.0 | 24920 | 0.0826 | 0.7329 |
| 0.0847 | 41.0 | 25543 | 0.0821 | 0.7292 |
| 0.0864 | 42.0 | 26166 | 0.0797 | 0.7292 |
| 0.0848 | 43.0 | 26789 | 0.0823 | 0.7256 |
| 0.0817 | 44.0 | 27412 | 0.0791 | 0.7329 |
| 0.0829 | 45.0 | 28035 | 0.0795 | 0.7220 |
| 0.0826 | 46.0 | 28658 | 0.0789 | 0.7365 |
| 0.0816 | 47.0 | 29281 | 0.0783 | 0.7220 |
| 0.0821 | 48.0 | 29904 | 0.0796 | 0.7437 |
| 0.0798 | 49.0 | 30527 | 0.0800 | 0.7220 |
| 0.0782 | 50.0 | 31150 | 0.0784 | 0.7437 |
| 0.079 | 51.0 | 31773 | 0.0784 | 0.7401 |
| 0.0797 | 52.0 | 32396 | 0.0795 | 0.7329 |
| 0.0804 | 53.0 | 33019 | 0.0784 | 0.7365 |
| 0.0762 | 54.0 | 33642 | 0.0770 | 0.7329 |
| 0.0727 | 55.0 | 34265 | 0.0777 | 0.7365 |
| 0.0749 | 56.0 | 34888 | 0.0786 | 0.7329 |
| 0.0737 | 57.0 | 35511 | 0.0773 | 0.7292 |
| 0.0734 | 58.0 | 36134 | 0.0776 | 0.7292 |
| 0.0737 | 59.0 | 36757 | 0.0777 | 0.7365 |
| 0.0736 | 60.0 | 37380 | 0.0771 | 0.7365 |
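
The final epoch is not the best one by every metric in this log. The sketch below copies only the last fifteen rows (epochs 46 to 60) into a hypothetical `LOG_TAIL` list and picks out the epochs with the highest accuracy and the lowest validation loss. Epochs before 46 never exceed 0.7329 accuracy or drop below 0.0791 validation loss, so restricting the search to this tail does not change the answer.

```python
# (epoch, validation_loss, accuracy) copied from the last 15 rows of the log.
LOG_TAIL = [
    (46, 0.0789, 0.7365), (47, 0.0783, 0.7220), (48, 0.0796, 0.7437),
    (49, 0.0800, 0.7220), (50, 0.0784, 0.7437), (51, 0.0784, 0.7401),
    (52, 0.0795, 0.7329), (53, 0.0784, 0.7365), (54, 0.0770, 0.7329),
    (55, 0.0777, 0.7365), (56, 0.0786, 0.7329), (57, 0.0773, 0.7292),
    (58, 0.0776, 0.7292), (59, 0.0777, 0.7365), (60, 0.0771, 0.7365),
]

def best_epochs(rows):
    """Return (best accuracy, its epochs, lowest validation loss, its epochs)."""
    best_acc = max(acc for _, _, acc in rows)
    best_loss = min(loss for _, loss, _ in rows)
    acc_epochs = [epoch for epoch, _, acc in rows if acc == best_acc]
    loss_epochs = [epoch for epoch, loss, _ in rows if loss == best_loss]
    return best_acc, acc_epochs, best_loss, loss_epochs

if __name__ == "__main__":
    acc, acc_at, loss, loss_at = best_epochs(LOG_TAIL)
    print(f"highest accuracy {acc:.4f} at epochs {acc_at}")
    print(f"lowest validation loss {loss:.4f} at epochs {loss_at}")
```

Accuracy peaks at 0.7437 in epochs 48 and 50, and validation loss bottoms out at 0.0770 in epoch 54, while the reported final checkpoint sits at 0.7365 and 0.0771. The card does not say whether intermediate checkpoints were saved or whether best-checkpoint selection was used.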

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
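
When trying to reproduce these results, it can help to check the local environment against the versions above. The following is a minimal sketch using only the standard library. The distribution names in `PINNED` (in particular `torch` for Pytorch) are assumptions about how the packages are published, and `find_mismatches` is an illustrative helper, not part of any of the listed libraries.

```python
from importlib import metadata

# Versions from the card; the distribution names are assumed.
PINNED = {
    "transformers": "4.26.1",
    "torch": "2.0.1+cu118",
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}

def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None

def find_mismatches(pinned=PINNED, lookup=installed_version):
    """Map each package whose version differs to a (wanted, found) pair."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches

if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches().items():
        print(f"{name}: card lists {wanted}, found {found}")
```

An exact string comparison is deliberately strict; a different CUDA build suffix on the Pytorch version, for example, is reported as a mismatch even if it would not affect results.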

Dataset used to train dkqjrm/20230824043649