
20230826130711

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2867
  • Accuracy: 0.62
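
The pipeline type is not recorded for this checkpoint, but the accuracy metric suggests a standard sequence-classification fine-tuning setup. A minimal inference sketch under that assumption (the example sentence pair is illustrative only):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumption: the checkpoint carries a sequence-classification head;
# the pipeline type is not recorded on this model card.
model_name = "dkqjrm/20230826130711"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Illustrative sentence pair; the actual SuperGLUE task format is not recorded.
inputs = tokenizer("Example premise.", "Example hypothesis.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```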

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
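
The card does not record which SuperGLUE task was used. For illustration only, a sketch of loading one SuperGLUE config with the datasets library; the config name ("copa" below) is a placeholder, not a confirmed detail of this model:

```python
from datasets import load_dataset

# Placeholder config -- the actual SuperGLUE task used for this model is not recorded.
raw = load_dataset("super_glue", "copa")
print(raw)  # DatasetDict with train/validation/test splits
```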

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
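
A minimal sketch of how these hyperparameters map onto Transformers TrainingArguments. The listed Adam betas and epsilon are the optimizer defaults, so they need no explicit arguments; output_dir and the per-epoch evaluation setting are assumptions (the latter inferred from the per-epoch validation results below), not recorded details:

```python
from transformers import TrainingArguments

# Sketch only: reconstructs the hyperparameters reported above.
training_args = TrainingArguments(
    output_dir="./results",          # placeholder, not recorded on the card
    learning_rate=1e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",     # assumed; matches the per-epoch results table
)
```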

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 25   | 0.2952          | 0.64     |
| No log        | 2.0   | 50   | 0.2895          | 0.57     |
| No log        | 3.0   | 75   | 0.2922          | 0.61     |
| No log        | 4.0   | 100  | 0.2938          | 0.64     |
| No log        | 5.0   | 125  | 0.2885          | 0.63     |
| No log        | 6.0   | 150  | 0.2945          | 0.48     |
| No log        | 7.0   | 175  | 0.2860          | 0.67     |
| No log        | 8.0   | 200  | 0.2888          | 0.66     |
| No log        | 9.0   | 225  | 0.2894          | 0.51     |
| No log        | 10.0  | 250  | 0.2903          | 0.56     |
| No log        | 11.0  | 275  | 0.2868          | 0.66     |
| No log        | 12.0  | 300  | 0.2880          | 0.66     |
| No log        | 13.0  | 325  | 0.2947          | 0.54     |
| No log        | 14.0  | 350  | 0.2957          | 0.64     |
| No log        | 15.0  | 375  | 0.2877          | 0.66     |
| No log        | 16.0  | 400  | 0.2865          | 0.68     |
| No log        | 17.0  | 425  | 0.2850          | 0.69     |
| No log        | 18.0  | 450  | 0.2846          | 0.66     |
| No log        | 19.0  | 475  | 0.2911          | 0.59     |
| 0.4684        | 20.0  | 500  | 0.2961          | 0.64     |
| 0.4684        | 21.0  | 525  | 0.2872          | 0.63     |
| 0.4684        | 22.0  | 550  | 0.2880          | 0.64     |
| 0.4684        | 23.0  | 575  | 0.2951          | 0.51     |
| 0.4684        | 24.0  | 600  | 0.2897          | 0.64     |
| 0.4684        | 25.0  | 625  | 0.2884          | 0.64     |
| 0.4684        | 26.0  | 650  | 0.2895          | 0.64     |
| 0.4684        | 27.0  | 675  | 0.2872          | 0.61     |
| 0.4684        | 28.0  | 700  | 0.2890          | 0.64     |
| 0.4684        | 29.0  | 725  | 0.2887          | 0.66     |
| 0.4684        | 30.0  | 750  | 0.2886          | 0.63     |
| 0.4684        | 31.0  | 775  | 0.2875          | 0.6      |
| 0.4684        | 32.0  | 800  | 0.2882          | 0.65     |
| 0.4684        | 33.0  | 825  | 0.2886          | 0.58     |
| 0.4684        | 34.0  | 850  | 0.2970          | 0.64     |
| 0.4684        | 35.0  | 875  | 0.2875          | 0.59     |
| 0.4684        | 36.0  | 900  | 0.2888          | 0.63     |
| 0.4684        | 37.0  | 925  | 0.2868          | 0.63     |
| 0.4684        | 38.0  | 950  | 0.2863          | 0.64     |
| 0.4684        | 39.0  | 975  | 0.2911          | 0.63     |
| 0.4634        | 40.0  | 1000 | 0.2867          | 0.63     |
| 0.4634        | 41.0  | 1025 | 0.2936          | 0.54     |
| 0.4634        | 42.0  | 1050 | 0.2965          | 0.6      |
| 0.4634        | 43.0  | 1075 | 0.2872          | 0.62     |
| 0.4634        | 44.0  | 1100 | 0.2862          | 0.65     |
| 0.4634        | 45.0  | 1125 | 0.2871          | 0.65     |
| 0.4634        | 46.0  | 1150 | 0.2914          | 0.63     |
| 0.4634        | 47.0  | 1175 | 0.2925          | 0.64     |
| 0.4634        | 48.0  | 1200 | 0.2883          | 0.64     |
| 0.4634        | 49.0  | 1225 | 0.2896          | 0.65     |
| 0.4634        | 50.0  | 1250 | 0.2866          | 0.64     |
| 0.4634        | 51.0  | 1275 | 0.2857          | 0.64     |
| 0.4634        | 52.0  | 1300 | 0.2892          | 0.64     |
| 0.4634        | 53.0  | 1325 | 0.2861          | 0.65     |
| 0.4634        | 54.0  | 1350 | 0.2861          | 0.63     |
| 0.4634        | 55.0  | 1375 | 0.2872          | 0.65     |
| 0.4634        | 56.0  | 1400 | 0.2861          | 0.64     |
| 0.4634        | 57.0  | 1425 | 0.2865          | 0.65     |
| 0.4634        | 58.0  | 1450 | 0.2880          | 0.63     |
| 0.4634        | 59.0  | 1475 | 0.2898          | 0.63     |
| 0.4583        | 60.0  | 1500 | 0.2900          | 0.63     |
| 0.4583        | 61.0  | 1525 | 0.2896          | 0.64     |
| 0.4583        | 62.0  | 1550 | 0.2886          | 0.63     |
| 0.4583        | 63.0  | 1575 | 0.2888          | 0.63     |
| 0.4583        | 64.0  | 1600 | 0.2891          | 0.64     |
| 0.4583        | 65.0  | 1625 | 0.2874          | 0.63     |
| 0.4583        | 66.0  | 1650 | 0.2875          | 0.62     |
| 0.4583        | 67.0  | 1675 | 0.2882          | 0.62     |
| 0.4583        | 68.0  | 1700 | 0.2863          | 0.62     |
| 0.4583        | 69.0  | 1725 | 0.2867          | 0.63     |
| 0.4583        | 70.0  | 1750 | 0.2865          | 0.64     |
| 0.4583        | 71.0  | 1775 | 0.2863          | 0.64     |
| 0.4583        | 72.0  | 1800 | 0.2862          | 0.64     |
| 0.4583        | 73.0  | 1825 | 0.2864          | 0.64     |
| 0.4583        | 74.0  | 1850 | 0.2862          | 0.64     |
| 0.4583        | 75.0  | 1875 | 0.2866          | 0.64     |
| 0.4583        | 76.0  | 1900 | 0.2868          | 0.63     |
| 0.4583        | 77.0  | 1925 | 0.2866          | 0.63     |
| 0.4583        | 78.0  | 1950 | 0.2867          | 0.63     |
| 0.4583        | 79.0  | 1975 | 0.2867          | 0.62     |
| 0.4597        | 80.0  | 2000 | 0.2867          | 0.62     |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
