20230826051154

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a loading sketch follows the list):

  • Loss: 0.2897
  • Accuracy: 0.7
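
A minimal sketch of loading this checkpoint for inference. The card does not declare a pipeline type, so the sequence-classification head is an assumption based on the super_glue fine-tuning and the accuracy metric above:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Model ID from this card. The sequence-classification head is an
# assumption; the card does not state the pipeline type.
model_id = "dkqjrm/20230826051154"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("Example sentence to classify.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```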

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 0.02
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
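
A sketch of how the values above map onto transformers.TrainingArguments. The output directory is a placeholder, and anything the card does not list is left at its default:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./20230826051154",  # placeholder; not stated in the card
    learning_rate=0.02,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,                 # Adam betas from the card
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",    # assumption, inferred from the per-epoch eval log below
)
```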

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:---:|:---:|:---:|:---:|:---:|
| No log | 1.0 | 25 | 0.4805 | 0.57 |
| No log | 2.0 | 50 | 0.3152 | 0.59 |
| No log | 3.0 | 75 | 0.3020 | 0.62 |
| No log | 4.0 | 100 | 0.2893 | 0.59 |
| No log | 5.0 | 125 | 0.2988 | 0.4 |
| No log | 6.0 | 150 | 0.2916 | 0.57 |
| No log | 7.0 | 175 | 0.2947 | 0.62 |
| No log | 8.0 | 200 | 0.2888 | 0.61 |
| No log | 9.0 | 225 | 0.2915 | 0.53 |
| No log | 10.0 | 250 | 0.2938 | 0.63 |
| No log | 11.0 | 275 | 0.2985 | 0.36 |
| No log | 12.0 | 300 | 0.2854 | 0.65 |
| No log | 13.0 | 325 | 0.2870 | 0.49 |
| No log | 14.0 | 350 | 0.2802 | 0.64 |
| No log | 15.0 | 375 | 0.2801 | 0.61 |
| No log | 16.0 | 400 | 0.2806 | 0.63 |
| No log | 17.0 | 425 | 0.2810 | 0.6 |
| No log | 18.0 | 450 | 0.2888 | 0.66 |
| No log | 19.0 | 475 | 0.2780 | 0.63 |
| 0.6923 | 20.0 | 500 | 0.2803 | 0.6 |
| 0.6923 | 21.0 | 525 | 0.2768 | 0.65 |
| 0.6923 | 22.0 | 550 | 0.2744 | 0.65 |
| 0.6923 | 23.0 | 575 | 0.2831 | 0.66 |
| 0.6923 | 24.0 | 600 | 0.2743 | 0.67 |
| 0.6923 | 25.0 | 625 | 0.2847 | 0.69 |
| 0.6923 | 26.0 | 650 | 0.2737 | 0.71 |
| 0.6923 | 27.0 | 675 | 0.2817 | 0.65 |
| 0.6923 | 28.0 | 700 | 0.2770 | 0.68 |
| 0.6923 | 29.0 | 725 | 0.2887 | 0.67 |
| 0.6923 | 30.0 | 750 | 0.2780 | 0.64 |
| 0.6923 | 31.0 | 775 | 0.2707 | 0.66 |
| 0.6923 | 32.0 | 800 | 0.2889 | 0.7 |
| 0.6923 | 33.0 | 825 | 0.2821 | 0.68 |
| 0.6923 | 34.0 | 850 | 0.2735 | 0.7 |
| 0.6923 | 35.0 | 875 | 0.2772 | 0.66 |
| 0.6923 | 36.0 | 900 | 0.2766 | 0.67 |
| 0.6923 | 37.0 | 925 | 0.2862 | 0.68 |
| 0.6923 | 38.0 | 950 | 0.2745 | 0.65 |
| 0.6923 | 39.0 | 975 | 0.2828 | 0.66 |
| 0.5864 | 40.0 | 1000 | 0.3264 | 0.68 |
| 0.5864 | 41.0 | 1025 | 0.2750 | 0.68 |
| 0.5864 | 42.0 | 1050 | 0.2831 | 0.67 |
| 0.5864 | 43.0 | 1075 | 0.2725 | 0.67 |
| 0.5864 | 44.0 | 1100 | 0.2909 | 0.68 |
| 0.5864 | 45.0 | 1125 | 0.2841 | 0.69 |
| 0.5864 | 46.0 | 1150 | 0.3126 | 0.69 |
| 0.5864 | 47.0 | 1175 | 0.2892 | 0.72 |
| 0.5864 | 48.0 | 1200 | 0.2887 | 0.7 |
| 0.5864 | 49.0 | 1225 | 0.2834 | 0.7 |
| 0.5864 | 50.0 | 1250 | 0.2731 | 0.66 |
| 0.5864 | 51.0 | 1275 | 0.2888 | 0.68 |
| 0.5864 | 52.0 | 1300 | 0.3080 | 0.67 |
| 0.5864 | 53.0 | 1325 | 0.2862 | 0.67 |
| 0.5864 | 54.0 | 1350 | 0.2772 | 0.67 |
| 0.5864 | 55.0 | 1375 | 0.2791 | 0.67 |
| 0.5864 | 56.0 | 1400 | 0.2930 | 0.68 |
| 0.5864 | 57.0 | 1425 | 0.2783 | 0.66 |
| 0.5864 | 58.0 | 1450 | 0.2855 | 0.67 |
| 0.5864 | 59.0 | 1475 | 0.2850 | 0.69 |
| 0.4926 | 60.0 | 1500 | 0.2899 | 0.69 |
| 0.4926 | 61.0 | 1525 | 0.2797 | 0.67 |
| 0.4926 | 62.0 | 1550 | 0.3322 | 0.69 |
| 0.4926 | 63.0 | 1575 | 0.2762 | 0.69 |
| 0.4926 | 64.0 | 1600 | 0.2816 | 0.7 |
| 0.4926 | 65.0 | 1625 | 0.2952 | 0.68 |
| 0.4926 | 66.0 | 1650 | 0.2794 | 0.68 |
| 0.4926 | 67.0 | 1675 | 0.2873 | 0.69 |
| 0.4926 | 68.0 | 1700 | 0.2835 | 0.69 |
| 0.4926 | 69.0 | 1725 | 0.2908 | 0.68 |
| 0.4926 | 70.0 | 1750 | 0.3008 | 0.68 |
| 0.4926 | 71.0 | 1775 | 0.2893 | 0.68 |
| 0.4926 | 72.0 | 1800 | 0.2826 | 0.68 |
| 0.4926 | 73.0 | 1825 | 0.2919 | 0.68 |
| 0.4926 | 74.0 | 1850 | 0.2832 | 0.7 |
| 0.4926 | 75.0 | 1875 | 0.2830 | 0.7 |
| 0.4926 | 76.0 | 1900 | 0.2809 | 0.69 |
| 0.4926 | 77.0 | 1925 | 0.2822 | 0.69 |
| 0.4926 | 78.0 | 1950 | 0.2884 | 0.69 |
| 0.4926 | 79.0 | 1975 | 0.2910 | 0.7 |
| 0.4369 | 80.0 | 2000 | 0.2897 | 0.7 |
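
The rows above are a standard per-epoch Trainer evaluation log (training loss is logged only every 500 steps, hence "No log" in the early rows). The training script is not published; a minimal accuracy hook of the kind that would produce the Accuracy column is sketched below as an illustrative reconstruction:

```python
import numpy as np

def compute_metrics(eval_pred):
    # Illustrative reconstruction, not the author's code: turn logits into
    # class predictions and report plain accuracy, as in the table above.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"accuracy": float((predictions == labels).mean())}
```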

Framework versions

  • Transformers 4.26.1
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3