
20230826083203

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2932
  • Accuracy: 0.6
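
Since the card does not state the pipeline type, the snippet below is only a minimal usage sketch: it assumes the checkpoint exposes a sequence-classification head (plausible, given that accuracy is the reported metric), and takes the hub id dkqjrm/20230826083203 from the dataset listing at the end of this card.

```python
# Hedged usage sketch -- assumes a sequence-classification head;
# the card does not confirm the pipeline type.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230826083203"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("Example input text", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1))  # predicted class index
```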

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch reconstructing this configuration follows the list):

  • learning_rate: 0.05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
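
As a hedged reconstruction, these hyperparameters map onto a transformers Trainer configuration roughly as below. The SuperGLUE task name ("boolq") and the tokenization field names are placeholders, since the card does not say which SuperGLUE task was used; the optimizer settings listed above correspond to the Trainer's default AdamW parameters, made explicit here.

```python
# Sketch of the training setup implied by the hyperparameters above.
# Assumptions: the SuperGLUE task ("boolq") and its field names are
# placeholders -- the card does not specify the task.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

dataset = load_dataset("super_glue", "boolq")  # task name is a placeholder
tokenizer = AutoTokenizer.from_pretrained("bert-large-cased")
model = AutoModelForSequenceClassification.from_pretrained("bert-large-cased")

def tokenize(batch):
    # Field names depend on the SuperGLUE task; "question"/"passage" fit BoolQ.
    return tokenizer(batch["question"], batch["passage"], truncation=True)

dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="out",
    learning_rate=0.05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    num_train_epochs=80,
    lr_scheduler_type="linear",
    adam_beta1=0.9,      # Trainer's AdamW defaults, stated explicitly
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",  # matches the per-epoch results table below
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    tokenizer=tokenizer,  # enables dynamic padding via DataCollatorWithPadding
)
trainer.train()
```

Note that the default logging interval of 500 steps is consistent with the "No log" entries in the training-loss column below, which only fill in every 500 steps.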

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 25 | 0.5569 | 0.62 |
| No log | 2.0 | 50 | 0.3272 | 0.4 |
| No log | 3.0 | 75 | 0.2999 | 0.48 |
| No log | 4.0 | 100 | 0.3037 | 0.58 |
| No log | 5.0 | 125 | 0.3092 | 0.39 |
| No log | 6.0 | 150 | 0.3147 | 0.37 |
| No log | 7.0 | 175 | 0.2872 | 0.61 |
| No log | 8.0 | 200 | 0.2897 | 0.68 |
| No log | 9.0 | 225 | 0.2950 | 0.41 |
| No log | 10.0 | 250 | 0.2779 | 0.63 |
| No log | 11.0 | 275 | 0.2977 | 0.41 |
| No log | 12.0 | 300 | 0.2909 | 0.59 |
| No log | 13.0 | 325 | 0.2940 | 0.49 |
| No log | 14.0 | 350 | 0.2929 | 0.49 |
| No log | 15.0 | 375 | 0.2948 | 0.49 |
| No log | 16.0 | 400 | 0.2935 | 0.57 |
| No log | 17.0 | 425 | 0.2949 | 0.43 |
| No log | 18.0 | 450 | 0.2925 | 0.59 |
| No log | 19.0 | 475 | 0.2927 | 0.57 |
| 1.2287 | 20.0 | 500 | 0.2934 | 0.58 |
| 1.2287 | 21.0 | 525 | 0.2947 | 0.44 |
| 1.2287 | 22.0 | 550 | 0.2934 | 0.6 |
| 1.2287 | 23.0 | 575 | 0.2930 | 0.6 |
| 1.2287 | 24.0 | 600 | 0.2944 | 0.4 |
| 1.2287 | 25.0 | 625 | 0.2970 | 0.39 |
| 1.2287 | 26.0 | 650 | 0.2949 | 0.39 |
| 1.2287 | 27.0 | 675 | 0.2942 | 0.43 |
| 1.2287 | 28.0 | 700 | 0.2940 | 0.43 |
| 1.2287 | 29.0 | 725 | 0.2933 | 0.58 |
| 1.2287 | 30.0 | 750 | 0.2930 | 0.62 |
| 1.2287 | 31.0 | 775 | 0.2934 | 0.6 |
| 1.2287 | 32.0 | 800 | 0.2934 | 0.57 |
| 1.2287 | 33.0 | 825 | 0.2932 | 0.54 |
| 1.2287 | 34.0 | 850 | 0.2921 | 0.54 |
| 1.2287 | 35.0 | 875 | 0.2950 | 0.44 |
| 1.2287 | 36.0 | 900 | 0.2944 | 0.41 |
| 1.2287 | 37.0 | 925 | 0.2941 | 0.43 |
| 1.2287 | 38.0 | 950 | 0.2930 | 0.55 |
| 1.2287 | 39.0 | 975 | 0.2932 | 0.57 |
| 0.8805 | 40.0 | 1000 | 0.2923 | 0.57 |
| 0.8805 | 41.0 | 1025 | 0.2932 | 0.61 |
| 0.8805 | 42.0 | 1050 | 0.2936 | 0.46 |
| 0.8805 | 43.0 | 1075 | 0.2924 | 0.55 |
| 0.8805 | 44.0 | 1100 | 0.2937 | 0.44 |
| 0.8805 | 45.0 | 1125 | 0.2927 | 0.55 |
| 0.8805 | 46.0 | 1150 | 0.2923 | 0.56 |
| 0.8805 | 47.0 | 1175 | 0.2930 | 0.6 |
| 0.8805 | 48.0 | 1200 | 0.2936 | 0.43 |
| 0.8805 | 49.0 | 1225 | 0.2935 | 0.56 |
| 0.8805 | 50.0 | 1250 | 0.2937 | 0.46 |
| 0.8805 | 51.0 | 1275 | 0.2929 | 0.59 |
| 0.8805 | 52.0 | 1300 | 0.2932 | 0.55 |
| 0.8805 | 53.0 | 1325 | 0.2940 | 0.48 |
| 0.8805 | 54.0 | 1350 | 0.2933 | 0.53 |
| 0.8805 | 55.0 | 1375 | 0.2934 | 0.55 |
| 0.8805 | 56.0 | 1400 | 0.2936 | 0.49 |
| 0.8805 | 57.0 | 1425 | 0.2928 | 0.59 |
| 0.8805 | 58.0 | 1450 | 0.2927 | 0.53 |
| 0.8805 | 59.0 | 1475 | 0.2930 | 0.6 |
| 0.6612 | 60.0 | 1500 | 0.2936 | 0.47 |
| 0.6612 | 61.0 | 1525 | 0.2933 | 0.53 |
| 0.6612 | 62.0 | 1550 | 0.2932 | 0.62 |
| 0.6612 | 63.0 | 1575 | 0.2937 | 0.41 |
| 0.6612 | 64.0 | 1600 | 0.2932 | 0.54 |
| 0.6612 | 65.0 | 1625 | 0.2940 | 0.42 |
| 0.6612 | 66.0 | 1650 | 0.2931 | 0.56 |
| 0.6612 | 67.0 | 1675 | 0.2937 | 0.36 |
| 0.6612 | 68.0 | 1700 | 0.2930 | 0.63 |
| 0.6612 | 69.0 | 1725 | 0.2934 | 0.63 |
| 0.6612 | 70.0 | 1750 | 0.2937 | 0.36 |
| 0.6612 | 71.0 | 1775 | 0.2930 | 0.63 |
| 0.6612 | 72.0 | 1800 | 0.2932 | 0.63 |
| 0.6612 | 73.0 | 1825 | 0.2930 | 0.61 |
| 0.6612 | 74.0 | 1850 | 0.2932 | 0.53 |
| 0.6612 | 75.0 | 1875 | 0.2932 | 0.58 |
| 0.6612 | 76.0 | 1900 | 0.2935 | 0.53 |
| 0.6612 | 77.0 | 1925 | 0.2931 | 0.62 |
| 0.6612 | 78.0 | 1950 | 0.2933 | 0.54 |
| 0.6612 | 79.0 | 1975 | 0.2932 | 0.61 |
| 0.5295 | 80.0 | 2000 | 0.2932 | 0.6 |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3

Dataset used to train dkqjrm/20230826083203

  • super_glue