
20230826105641

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6024
  • Accuracy: 0.64
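
The card does not state the pipeline type or the specific SuperGLUE task. Below is a minimal loading sketch, assuming a sequence-classification head and a sentence-pair input format (both assumptions; only the repo id, dkqjrm/20230826105641, comes from this card):

```python
# Hedged loading sketch: the task, input format, and label semantics are
# assumptions -- the card only states "bert-large-cased on super_glue".
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "dkqjrm/20230826105641"  # repo id from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Sentence-pair input is a placeholder; the actual task format is undocumented.
inputs = tokenizer("Example premise.", "Example hypothesis.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```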

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged Trainer sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
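
A sketch of these settings expressed with the Transformers TrainingArguments API; the output_dir name and the evaluation/logging cadence are assumptions inferred from the results table below, not stated in the card:

```python
from transformers import TrainingArguments

# Sketch only: reproduces the listed hyperparameters; everything else
# (output_dir, eval/logging cadence, warmup) is inferred or assumed.
training_args = TrainingArguments(
    output_dir="20230826105641",
    learning_rate=1e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    num_train_epochs=80.0,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",  # matches the one-eval-per-epoch rows below
    logging_steps=500,            # matches "No log" until step 500 in the table
)
```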

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 25 | 0.6078 | 0.65 |
| No log | 2.0 | 50 | 0.5963 | 0.66 |
| No log | 3.0 | 75 | 0.6125 | 0.65 |
| No log | 4.0 | 100 | 0.6042 | 0.66 |
| No log | 5.0 | 125 | 0.6065 | 0.66 |
| No log | 6.0 | 150 | 0.6020 | 0.65 |
| No log | 7.0 | 175 | 0.5987 | 0.65 |
| No log | 8.0 | 200 | 0.6016 | 0.66 |
| No log | 9.0 | 225 | 0.6066 | 0.66 |
| No log | 10.0 | 250 | 0.6112 | 0.66 |
| No log | 11.0 | 275 | 0.6085 | 0.66 |
| No log | 12.0 | 300 | 0.5976 | 0.66 |
| No log | 13.0 | 325 | 0.6074 | 0.66 |
| No log | 14.0 | 350 | 0.6060 | 0.65 |
| No log | 15.0 | 375 | 0.6254 | 0.65 |
| No log | 16.0 | 400 | 0.6031 | 0.66 |
| No log | 17.0 | 425 | 0.6011 | 0.67 |
| No log | 18.0 | 450 | 0.6063 | 0.66 |
| No log | 19.0 | 475 | 0.6031 | 0.65 |
| 0.6484 | 20.0 | 500 | 0.6013 | 0.65 |
| 0.6484 | 21.0 | 525 | 0.6041 | 0.65 |
| 0.6484 | 22.0 | 550 | 0.6037 | 0.65 |
| 0.6484 | 23.0 | 575 | 0.6046 | 0.65 |
| 0.6484 | 24.0 | 600 | 0.6072 | 0.66 |
| 0.6484 | 25.0 | 625 | 0.5980 | 0.66 |
| 0.6484 | 26.0 | 650 | 0.6039 | 0.64 |
| 0.6484 | 27.0 | 675 | 0.6025 | 0.65 |
| 0.6484 | 28.0 | 700 | 0.6062 | 0.65 |
| 0.6484 | 29.0 | 725 | 0.6056 | 0.64 |
| 0.6484 | 30.0 | 750 | 0.6091 | 0.61 |
| 0.6484 | 31.0 | 775 | 0.6037 | 0.65 |
| 0.6484 | 32.0 | 800 | 0.6037 | 0.63 |
| 0.6484 | 33.0 | 825 | 0.6175 | 0.64 |
| 0.6484 | 34.0 | 850 | 0.6089 | 0.62 |
| 0.6484 | 35.0 | 875 | 0.6076 | 0.64 |
| 0.6484 | 36.0 | 900 | 0.6073 | 0.64 |
| 0.6484 | 37.0 | 925 | 0.6059 | 0.64 |
| 0.6484 | 38.0 | 950 | 0.6109 | 0.63 |
| 0.6484 | 39.0 | 975 | 0.6090 | 0.64 |
| 0.6362 | 40.0 | 1000 | 0.6080 | 0.64 |
| 0.6362 | 41.0 | 1025 | 0.5994 | 0.64 |
| 0.6362 | 42.0 | 1050 | 0.6034 | 0.64 |
| 0.6362 | 43.0 | 1075 | 0.6113 | 0.60 |
| 0.6362 | 44.0 | 1100 | 0.6131 | 0.64 |
| 0.6362 | 45.0 | 1125 | 0.6150 | 0.61 |
| 0.6362 | 46.0 | 1150 | 0.6115 | 0.63 |
| 0.6362 | 47.0 | 1175 | 0.6055 | 0.64 |
| 0.6362 | 48.0 | 1200 | 0.6033 | 0.64 |
| 0.6362 | 49.0 | 1225 | 0.6047 | 0.64 |
| 0.6362 | 50.0 | 1250 | 0.6037 | 0.64 |
| 0.6362 | 51.0 | 1275 | 0.6010 | 0.63 |
| 0.6362 | 52.0 | 1300 | 0.5988 | 0.64 |
| 0.6362 | 53.0 | 1325 | 0.5991 | 0.64 |
| 0.6362 | 54.0 | 1350 | 0.6019 | 0.64 |
| 0.6362 | 55.0 | 1375 | 0.6002 | 0.64 |
| 0.6362 | 56.0 | 1400 | 0.6006 | 0.64 |
| 0.6362 | 57.0 | 1425 | 0.5992 | 0.63 |
| 0.6362 | 58.0 | 1450 | 0.5992 | 0.63 |
| 0.6362 | 59.0 | 1475 | 0.5992 | 0.64 |
| 0.6341 | 60.0 | 1500 | 0.6026 | 0.64 |
| 0.6341 | 61.0 | 1525 | 0.6022 | 0.64 |
| 0.6341 | 62.0 | 1550 | 0.6026 | 0.64 |
| 0.6341 | 63.0 | 1575 | 0.6036 | 0.64 |
| 0.6341 | 64.0 | 1600 | 0.6039 | 0.64 |
| 0.6341 | 65.0 | 1625 | 0.6041 | 0.64 |
| 0.6341 | 66.0 | 1650 | 0.6034 | 0.64 |
| 0.6341 | 67.0 | 1675 | 0.6049 | 0.64 |
| 0.6341 | 68.0 | 1700 | 0.6027 | 0.64 |
| 0.6341 | 69.0 | 1725 | 0.6057 | 0.64 |
| 0.6341 | 70.0 | 1750 | 0.6056 | 0.64 |
| 0.6341 | 71.0 | 1775 | 0.6048 | 0.64 |
| 0.6341 | 72.0 | 1800 | 0.6019 | 0.64 |
| 0.6341 | 73.0 | 1825 | 0.6021 | 0.64 |
| 0.6341 | 74.0 | 1850 | 0.6018 | 0.64 |
| 0.6341 | 75.0 | 1875 | 0.6027 | 0.64 |
| 0.6341 | 76.0 | 1900 | 0.6025 | 0.64 |
| 0.6341 | 77.0 | 1925 | 0.6021 | 0.64 |
| 0.6341 | 78.0 | 1950 | 0.6023 | 0.64 |
| 0.6341 | 79.0 | 1975 | 0.6024 | 0.64 |
| 0.626 | 80.0 | 2000 | 0.6024 | 0.64 |
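
The Accuracy column implies a per-epoch accuracy metric was attached to the Trainer. A minimal sketch of a compute_metrics callback that would produce such a column, assuming the evaluate library (the actual metric code is not included in the card):

```python
import numpy as np
import evaluate

# Assumed metric setup; the card does not document how accuracy was computed.
accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)  # class with highest logit
    return accuracy.compute(predictions=predictions, references=labels)
```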

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3