dkqjrm/20230826123019

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5900
  • Accuracy: 0.65
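
The card does not record the task or pipeline type. A minimal usage sketch, assuming the checkpoint exposes a standard sequence-classification head loadable through the transformers auto classes (the input pair below is illustrative, since the exact SuperGLUE subtask is not stated):

```python
# Hedged inference sketch. Assumptions: the repo id is dkqjrm/20230826123019
# and the checkpoint carries a sequence-classification head; the actual
# SuperGLUE subtask, and therefore the expected input format, is not
# documented in this card.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "dkqjrm/20230826123019"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id).eval()

# Illustrative sentence pair; most SuperGLUE tasks pair two text segments.
inputs = tokenizer("The cat sat on the mat.", "A cat is on a mat.",
                   return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)
print(probs)
```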

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
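
These values map directly onto transformers TrainingArguments. A hedged sketch of an equivalent configuration; only the listed values come from the card, while the output directory and evaluation cadence are assumptions:

```python
# Sketch of the training configuration implied by the list above.
# Only the hyperparameter values are taken from the card; output_dir and
# evaluation_strategy are assumptions (per-epoch eval matches the results table).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230826123019",  # assumed name, not stated in the card
    learning_rate=1e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",
)
```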

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 25 | 0.6011 | 0.66 |
| No log | 2.0 | 50 | 0.5991 | 0.65 |
| No log | 3.0 | 75 | 0.5983 | 0.65 |
| No log | 4.0 | 100 | 0.6063 | 0.65 |
| No log | 5.0 | 125 | 0.5973 | 0.65 |
| No log | 6.0 | 150 | 0.6049 | 0.65 |
| No log | 7.0 | 175 | 0.6031 | 0.65 |
| No log | 8.0 | 200 | 0.6001 | 0.65 |
| No log | 9.0 | 225 | 0.5969 | 0.64 |
| No log | 10.0 | 250 | 0.6007 | 0.65 |
| No log | 11.0 | 275 | 0.6016 | 0.65 |
| No log | 12.0 | 300 | 0.5992 | 0.65 |
| No log | 13.0 | 325 | 0.5968 | 0.65 |
| No log | 14.0 | 350 | 0.5968 | 0.65 |
| No log | 15.0 | 375 | 0.6000 | 0.65 |
| No log | 16.0 | 400 | 0.6000 | 0.65 |
| No log | 17.0 | 425 | 0.5883 | 0.66 |
| No log | 18.0 | 450 | 0.5920 | 0.65 |
| No log | 19.0 | 475 | 0.6035 | 0.62 |
| 0.6519 | 20.0 | 500 | 0.6075 | 0.64 |
| 0.6519 | 21.0 | 525 | 0.5919 | 0.65 |
| 0.6519 | 22.0 | 550 | 0.5951 | 0.63 |
| 0.6519 | 23.0 | 575 | 0.6037 | 0.61 |
| 0.6519 | 24.0 | 600 | 0.6058 | 0.62 |
| 0.6519 | 25.0 | 625 | 0.5944 | 0.65 |
| 0.6519 | 26.0 | 650 | 0.5938 | 0.65 |
| 0.6519 | 27.0 | 675 | 0.5909 | 0.66 |
| 0.6519 | 28.0 | 700 | 0.5914 | 0.65 |
| 0.6519 | 29.0 | 725 | 0.5902 | 0.66 |
| 0.6519 | 30.0 | 750 | 0.5906 | 0.66 |
| 0.6519 | 31.0 | 775 | 0.5936 | 0.65 |
| 0.6519 | 32.0 | 800 | 0.5960 | 0.66 |
| 0.6519 | 33.0 | 825 | 0.5953 | 0.65 |
| 0.6519 | 34.0 | 850 | 0.5970 | 0.65 |
| 0.6519 | 35.0 | 875 | 0.5937 | 0.65 |
| 0.6519 | 36.0 | 900 | 0.5954 | 0.64 |
| 0.6519 | 37.0 | 925 | 0.5993 | 0.63 |
| 0.6519 | 38.0 | 950 | 0.5905 | 0.65 |
| 0.6519 | 39.0 | 975 | 0.5898 | 0.65 |
| 0.6395 | 40.0 | 1000 | 0.5947 | 0.65 |
| 0.6395 | 41.0 | 1025 | 0.5966 | 0.64 |
| 0.6395 | 42.0 | 1050 | 0.5953 | 0.65 |
| 0.6395 | 43.0 | 1075 | 0.5968 | 0.64 |
| 0.6395 | 44.0 | 1100 | 0.5934 | 0.65 |
| 0.6395 | 45.0 | 1125 | 0.5948 | 0.66 |
| 0.6395 | 46.0 | 1150 | 0.5958 | 0.65 |
| 0.6395 | 47.0 | 1175 | 0.5928 | 0.65 |
| 0.6395 | 48.0 | 1200 | 0.5922 | 0.65 |
| 0.6395 | 49.0 | 1225 | 0.5929 | 0.65 |
| 0.6395 | 50.0 | 1250 | 0.5967 | 0.64 |
| 0.6395 | 51.0 | 1275 | 0.5908 | 0.65 |
| 0.6395 | 52.0 | 1300 | 0.5930 | 0.66 |
| 0.6395 | 53.0 | 1325 | 0.5910 | 0.65 |
| 0.6395 | 54.0 | 1350 | 0.5931 | 0.65 |
| 0.6395 | 55.0 | 1375 | 0.5900 | 0.66 |
| 0.6395 | 56.0 | 1400 | 0.5925 | 0.65 |
| 0.6395 | 57.0 | 1425 | 0.5938 | 0.66 |
| 0.6395 | 58.0 | 1450 | 0.5963 | 0.65 |
| 0.6395 | 59.0 | 1475 | 0.5955 | 0.64 |
| 0.6331 | 60.0 | 1500 | 0.5935 | 0.65 |
| 0.6331 | 61.0 | 1525 | 0.5937 | 0.66 |
| 0.6331 | 62.0 | 1550 | 0.5924 | 0.65 |
| 0.6331 | 63.0 | 1575 | 0.5909 | 0.65 |
| 0.6331 | 64.0 | 1600 | 0.5891 | 0.65 |
| 0.6331 | 65.0 | 1625 | 0.5881 | 0.65 |
| 0.6331 | 66.0 | 1650 | 0.5884 | 0.65 |
| 0.6331 | 67.0 | 1675 | 0.5893 | 0.65 |
| 0.6331 | 68.0 | 1700 | 0.5900 | 0.65 |
| 0.6331 | 69.0 | 1725 | 0.5908 | 0.65 |
| 0.6331 | 70.0 | 1750 | 0.5912 | 0.65 |
| 0.6331 | 71.0 | 1775 | 0.5914 | 0.65 |
| 0.6331 | 72.0 | 1800 | 0.5901 | 0.65 |
| 0.6331 | 73.0 | 1825 | 0.5898 | 0.65 |
| 0.6331 | 74.0 | 1850 | 0.5896 | 0.65 |
| 0.6331 | 75.0 | 1875 | 0.5905 | 0.65 |
| 0.6331 | 76.0 | 1900 | 0.5901 | 0.65 |
| 0.6331 | 77.0 | 1925 | 0.5901 | 0.65 |
| 0.6331 | 78.0 | 1950 | 0.5900 | 0.65 |
| 0.6331 | 79.0 | 1975 | 0.5900 | 0.65 |
| 0.6276 | 80.0 | 2000 | 0.5900 | 0.65 |
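
Validation accuracy stays near 0.65 across all 80 epochs; 25 steps per epoch at batch size 16 implies roughly 400 training examples, but the SuperGLUE subtask itself is not recorded. A hedged sketch of how the reported accuracy could be recomputed, with the subtask name left as an explicit placeholder:

```python
# Evaluation sketch. The SuperGLUE subtask is NOT documented in this card;
# "rte" is a hypothetical placeholder chosen only because its
# premise/hypothesis format is easy to show.
import torch
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "dkqjrm/20230826123019"
subset = "rte"  # hypothetical: replace with the actual subtask

ds = load_dataset("super_glue", subset, split="validation")
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id).eval()

correct = 0
for ex in ds:
    inputs = tokenizer(ex["premise"], ex["hypothesis"],
                       return_tensors="pt", truncation=True)
    with torch.no_grad():
        pred = model(**inputs).logits.argmax(dim=-1).item()
    correct += int(pred == ex["label"])
print(f"accuracy = {correct / len(ds):.2f}")
```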

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
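
A small check that a local environment matches these pins; the version strings come from the list above, and the check itself is just one way to verify them:

```python
# Compare installed package versions against the ones this card reports.
from importlib.metadata import version

expected = {
    "transformers": "4.26.1",
    "torch": "2.0.1+cu118",
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}
for pkg, want in expected.items():
    have = version(pkg)
    status = "OK" if have == want else f"MISMATCH (installed {have})"
    print(f"{pkg}=={want}: {status}")
```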