
20230826100309

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2920
  • Accuracy: 0.4
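
The pipeline type of this checkpoint is not declared, so the snippet below is a minimal inference sketch under assumptions: it assumes the checkpoint carries a sequence-classification head (the accuracy metric suggests a SuperGLUE classification task), and the example sentence pair is a placeholder, not a real SuperGLUE input. The repo id dkqjrm/20230826100309 is taken from this card.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumption: the checkpoint has a sequence-classification head; the
# input pair below is a placeholder, not a real SuperGLUE example.
model_id = "dkqjrm/20230826100309"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer(
    "An example premise.",
    "An example hypothesis.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # class probabilities
```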

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
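
For reproducibility, the snippet below re-expresses these hyperparameters as transformers TrainingArguments. This is a hedged sketch: the card does not state that the standard Trainer was used, the output_dir is hypothetical, and dataset loading and preprocessing are omitted. The per-epoch evaluation setting is inferred from the results table below.

```python
from transformers import TrainingArguments

# Sketch only: output_dir is hypothetical; use of the standard Trainer
# and per-epoch evaluation are assumptions inferred from this card.
training_args = TrainingArguments(
    output_dir="20230826100309",
    learning_rate=0.05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",
)
```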

Training results

("No log" in the Training Loss column means the training loss had not yet been logged at that point; the Trainer logs it every 500 steps.)

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 25   | 0.3608          | 0.44     |
| No log        | 2.0   | 50   | 0.2890          | 0.57     |
| No log        | 3.0   | 75   | 0.2961          | 0.58     |
| No log        | 4.0   | 100  | 0.2865          | 0.65     |
| No log        | 5.0   | 125  | 0.2901          | 0.58     |
| No log        | 6.0   | 150  | 0.2933          | 0.46     |
| No log        | 7.0   | 175  | 0.3291          | 0.64     |
| No log        | 8.0   | 200  | 0.2864          | 0.62     |
| No log        | 9.0   | 225  | 0.2979          | 0.42     |
| No log        | 10.0  | 250  | 0.3035          | 0.63     |
| No log        | 11.0  | 275  | 0.2902          | 0.59     |
| No log        | 12.0  | 300  | 0.2917          | 0.5      |
| No log        | 13.0  | 325  | 0.2935          | 0.44     |
| No log        | 14.0  | 350  | 0.3057          | 0.44     |
| No log        | 15.0  | 375  | 0.2980          | 0.45     |
| No log        | 16.0  | 400  | 0.2947          | 0.47     |
| No log        | 17.0  | 425  | 0.2945          | 0.5      |
| No log        | 18.0  | 450  | 0.2924          | 0.49     |
| No log        | 19.0  | 475  | 0.2922          | 0.55     |
| 1.1902        | 20.0  | 500  | 0.2923          | 0.45     |
| 1.1902        | 21.0  | 525  | 0.2864          | 0.55     |
| 1.1902        | 22.0  | 550  | 0.2925          | 0.42     |
| 1.1902        | 23.0  | 575  | 0.2910          | 0.58     |
| 1.1902        | 24.0  | 600  | 0.2895          | 0.58     |
| 1.1902        | 25.0  | 625  | 0.2918          | 0.62     |
| 1.1902        | 26.0  | 650  | 0.2921          | 0.42     |
| 1.1902        | 27.0  | 675  | 0.2918          | 0.58     |
| 1.1902        | 28.0  | 700  | 0.2910          | 0.6      |
| 1.1902        | 29.0  | 725  | 0.2919          | 0.57     |
| 1.1902        | 30.0  | 750  | 0.2920          | 0.48     |
| 1.1902        | 31.0  | 775  | 0.2922          | 0.41     |
| 1.1902        | 32.0  | 800  | 0.2920          | 0.53     |
| 1.1902        | 33.0  | 825  | 0.2920          | 0.51     |
| 1.1902        | 34.0  | 850  | 0.2919          | 0.54     |
| 1.1902        | 35.0  | 875  | 0.2920          | 0.52     |
| 1.1902        | 36.0  | 900  | 0.2921          | 0.39     |
| 1.1902        | 37.0  | 925  | 0.2920          | 0.53     |
| 1.1902        | 38.0  | 950  | 0.2920          | 0.49     |
| 1.1902        | 39.0  | 975  | 0.2922          | 0.4      |
| 0.8276        | 40.0  | 1000 | 0.2919          | 0.58     |
| 0.8276        | 41.0  | 1025 | 0.2918          | 0.62     |
| 0.8276        | 42.0  | 1050 | 0.2918          | 0.61     |
| 0.8276        | 43.0  | 1075 | 0.2922          | 0.42     |
| 0.8276        | 44.0  | 1100 | 0.2921          | 0.43     |
| 0.8276        | 45.0  | 1125 | 0.2920          | 0.42     |
| 0.8276        | 46.0  | 1150 | 0.2920          | 0.42     |
| 0.8276        | 47.0  | 1175 | 0.2920          | 0.35     |
| 0.8276        | 48.0  | 1200 | 0.2920          | 0.54     |
| 0.8276        | 49.0  | 1225 | 0.2920          | 0.6      |
| 0.8276        | 50.0  | 1250 | 0.2920          | 0.52     |
| 0.8276        | 51.0  | 1275 | 0.2920          | 0.37     |
| 0.8276        | 52.0  | 1300 | 0.2920          | 0.45     |
| 0.8276        | 53.0  | 1325 | 0.2920          | 0.44     |
| 0.8276        | 54.0  | 1350 | 0.2920          | 0.59     |
| 0.8276        | 55.0  | 1375 | 0.2920          | 0.44     |
| 0.8276        | 56.0  | 1400 | 0.2920          | 0.58     |
| 0.8276        | 57.0  | 1425 | 0.2920          | 0.57     |
| 0.8276        | 58.0  | 1450 | 0.2920          | 0.46     |
| 0.8276        | 59.0  | 1475 | 0.2920          | 0.42     |
| 0.6389        | 60.0  | 1500 | 0.2920          | 0.37     |
| 0.6389        | 61.0  | 1525 | 0.2919          | 0.6      |
| 0.6389        | 62.0  | 1550 | 0.2919          | 0.6      |
| 0.6389        | 63.0  | 1575 | 0.2920          | 0.55     |
| 0.6389        | 64.0  | 1600 | 0.2920          | 0.52     |
| 0.6389        | 65.0  | 1625 | 0.2920          | 0.5      |
| 0.6389        | 66.0  | 1650 | 0.2920          | 0.36     |
| 0.6389        | 67.0  | 1675 | 0.2920          | 0.58     |
| 0.6389        | 68.0  | 1700 | 0.2920          | 0.38     |
| 0.6389        | 69.0  | 1725 | 0.2920          | 0.58     |
| 0.6389        | 70.0  | 1750 | 0.2920          | 0.53     |
| 0.6389        | 71.0  | 1775 | 0.2920          | 0.37     |
| 0.6389        | 72.0  | 1800 | 0.2920          | 0.39     |
| 0.6389        | 73.0  | 1825 | 0.2920          | 0.36     |
| 0.6389        | 74.0  | 1850 | 0.2920          | 0.43     |
| 0.6389        | 75.0  | 1875 | 0.2920          | 0.38     |
| 0.6389        | 76.0  | 1900 | 0.2920          | 0.43     |
| 0.6389        | 77.0  | 1925 | 0.2920          | 0.37     |
| 0.6389        | 78.0  | 1950 | 0.2920          | 0.37     |
| 0.6389        | 79.0  | 1975 | 0.2920          | 0.38     |
| 0.5225        | 80.0  | 2000 | 0.2920          | 0.4      |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
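
To compare a local environment against the versions listed above, here is a small verification snippet (a convenience sketch, not part of the original card):

```python
# Prints installed versions for comparison with the list above.
import transformers, torch, datasets, tokenizers

print("Transformers:", transformers.__version__)  # expected 4.26.1
print("PyTorch:", torch.__version__)              # expected 2.0.1+cu118
print("Datasets:", datasets.__version__)          # expected 2.12.0
print("Tokenizers:", tokenizers.__version__)      # expected 0.13.3
```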
