
20230826081833

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a usage sketch follows the results):

  • Loss: 0.6393
  • Accuracy: 0.69
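
The card does not state which SuperGLUE task this checkpoint targets, so the snippet below is only a minimal, hedged usage sketch: it assumes a sequence-classification head and a sentence-pair input, and the example texts and label index are purely illustrative.

```python
# Minimal usage sketch (assumptions: sequence-classification head, pair input).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230826081833"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Illustrative sentence pair; replace with the actual SuperGLUE task's inputs.
inputs = tokenizer(
    "The cat sat on the mat.",
    "A cat is sitting on a mat.",
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label index
```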

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
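
As a reproduction aid, here is a minimal sketch mapping the list above onto `TrainingArguments`. Only the settings stated in this card are filled in; `output_dir` and the per-epoch evaluation strategy are assumptions, and the data pipeline is omitted because the card does not specify the SuperGLUE subset or preprocessing.

```python
# Hedged TrainingArguments sketch; values mirror the hyperparameter list above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230826081833",  # assumption: any local path works
    learning_rate=0.05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,               # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",  # assumption: the table reports one eval per epoch
)
```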

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 25   | 0.7348          | 0.6      |
| No log        | 2.0   | 50   | 0.6045          | 0.61     |
| No log        | 3.0   | 75   | 0.9239          | 0.62     |
| No log        | 4.0   | 100  | 0.6379          | 0.69     |
| No log        | 5.0   | 125  | 0.5724          | 0.72     |
| No log        | 6.0   | 150  | 1.2083          | 0.69     |
| No log        | 7.0   | 175  | 1.3074          | 0.67     |
| No log        | 8.0   | 200  | 1.1626          | 0.7      |
| No log        | 9.0   | 225  | 1.0019          | 0.64     |
| No log        | 10.0  | 250  | 0.6240          | 0.73     |
| No log        | 11.0  | 275  | 1.0829          | 0.66     |
| No log        | 12.0  | 300  | 0.8053          | 0.66     |
| No log        | 13.0  | 325  | 1.1526          | 0.63     |
| No log        | 14.0  | 350  | 1.2006          | 0.69     |
| No log        | 15.0  | 375  | 1.1382          | 0.67     |
| No log        | 16.0  | 400  | 1.1345          | 0.71     |
| No log        | 17.0  | 425  | 1.5029          | 0.67     |
| No log        | 18.0  | 450  | 1.3780          | 0.67     |
| No log        | 19.0  | 475  | 1.1811          | 0.66     |
| 1.3151        | 20.0  | 500  | 1.2461          | 0.7      |
| 1.3151        | 21.0  | 525  | 1.2269          | 0.68     |
| 1.3151        | 22.0  | 550  | 1.1515          | 0.68     |
| 1.3151        | 23.0  | 575  | 0.9944          | 0.66     |
| 1.3151        | 24.0  | 600  | 1.2708          | 0.67     |
| 1.3151        | 25.0  | 625  | 1.5817          | 0.65     |
| 1.3151        | 26.0  | 650  | 1.0934          | 0.71     |
| 1.3151        | 27.0  | 675  | 1.4179          | 0.67     |
| 1.3151        | 28.0  | 700  | 1.4260          | 0.65     |
| 1.3151        | 29.0  | 725  | 1.3818          | 0.65     |
| 1.3151        | 30.0  | 750  | 1.7166          | 0.66     |
| 1.3151        | 31.0  | 775  | 1.1710          | 0.64     |
| 1.3151        | 32.0  | 800  | 1.0660          | 0.64     |
| 1.3151        | 33.0  | 825  | 1.0127          | 0.69     |
| 1.3151        | 34.0  | 850  | 0.9810          | 0.68     |
| 1.3151        | 35.0  | 875  | 1.1077          | 0.7      |
| 1.3151        | 36.0  | 900  | 1.0629          | 0.66     |
| 1.3151        | 37.0  | 925  | 1.5933          | 0.69     |
| 1.3151        | 38.0  | 950  | 1.1322          | 0.71     |
| 1.3151        | 39.0  | 975  | 1.0735          | 0.73     |
| 0.6791        | 40.0  | 1000 | 0.8940          | 0.72     |
| 0.6791        | 41.0  | 1025 | 0.9349          | 0.67     |
| 0.6791        | 42.0  | 1050 | 0.8962          | 0.67     |
| 0.6791        | 43.0  | 1075 | 1.0663          | 0.69     |
| 0.6791        | 44.0  | 1100 | 0.9681          | 0.69     |
| 0.6791        | 45.0  | 1125 | 0.7694          | 0.68     |
| 0.6791        | 46.0  | 1150 | 1.0311          | 0.71     |
| 0.6791        | 47.0  | 1175 | 0.7407          | 0.7      |
| 0.6791        | 48.0  | 1200 | 0.6861          | 0.69     |
| 0.6791        | 49.0  | 1225 | 0.9920          | 0.69     |
| 0.6791        | 50.0  | 1250 | 0.7187          | 0.69     |
| 0.6791        | 51.0  | 1275 | 0.7602          | 0.72     |
| 0.6791        | 52.0  | 1300 | 0.7285          | 0.69     |
| 0.6791        | 53.0  | 1325 | 0.8233          | 0.68     |
| 0.6791        | 54.0  | 1350 | 0.7932          | 0.7      |
| 0.6791        | 55.0  | 1375 | 0.8861          | 0.71     |
| 0.6791        | 56.0  | 1400 | 0.7877          | 0.71     |
| 0.6791        | 57.0  | 1425 | 0.7689          | 0.7      |
| 0.6791        | 58.0  | 1450 | 0.7919          | 0.7      |
| 0.6791        | 59.0  | 1475 | 0.7441          | 0.7      |
| 0.3594        | 60.0  | 1500 | 0.8327          | 0.69     |
| 0.3594        | 61.0  | 1525 | 0.6414          | 0.71     |
| 0.3594        | 62.0  | 1550 | 0.6702          | 0.71     |
| 0.3594        | 63.0  | 1575 | 0.6862          | 0.71     |
| 0.3594        | 64.0  | 1600 | 0.6349          | 0.68     |
| 0.3594        | 65.0  | 1625 | 0.6800          | 0.69     |
| 0.3594        | 66.0  | 1650 | 0.7005          | 0.69     |
| 0.3594        | 67.0  | 1675 | 0.7058          | 0.71     |
| 0.3594        | 68.0  | 1700 | 0.6880          | 0.73     |
| 0.3594        | 69.0  | 1725 | 0.6774          | 0.72     |
| 0.3594        | 70.0  | 1750 | 0.6816          | 0.73     |
| 0.3594        | 71.0  | 1775 | 0.7138          | 0.72     |
| 0.3594        | 72.0  | 1800 | 0.6311          | 0.69     |
| 0.3594        | 73.0  | 1825 | 0.6579          | 0.69     |
| 0.3594        | 74.0  | 1850 | 0.6956          | 0.69     |
| 0.3594        | 75.0  | 1875 | 0.6341          | 0.69     |
| 0.3594        | 76.0  | 1900 | 0.6722          | 0.7      |
| 0.3594        | 77.0  | 1925 | 0.6459          | 0.7      |
| 0.3594        | 78.0  | 1950 | 0.6351          | 0.68     |
| 0.3594        | 79.0  | 1975 | 0.6436          | 0.68     |
| 0.2323        | 80.0  | 2000 | 0.6393          | 0.69     |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
