
20230831011453

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4974
  • Accuracy: 0.5
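
The card does not state which SuperGLUE subset the model was fine-tuned on, so the paired-sentence input below is illustrative only. A minimal loading sketch with `transformers`, assuming the checkpoint (`dkqjrm/20230831011453`, per the repository id) carries a standard sequence-classification head:

```python
# Minimal inference sketch; the two-sentence input format is an assumption,
# since the card does not name the SuperGLUE task this checkpoint targets.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230831011453"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("A premise sentence.", "A hypothesis sentence.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # class probabilities
```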

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

  • learning_rate: 0.0007
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
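
A sketch of the corresponding `TrainingArguments` (Transformers 4.26.1 API). The dataset and task wiring are omitted because the card does not specify the SuperGLUE subset, and the evaluation strategy is an assumption based on the per-epoch results table below:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230831011453",
    learning_rate=7e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08, as listed above;
    # these values match the Trainer's default AdamW settings.
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",  # assumption: the table below reports one eval per epoch
)
```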

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 340 | 0.5155 | 0.5 |
| 0.5119 | 2.0 | 680 | 0.5111 | 0.5 |
| 0.51 | 3.0 | 1020 | 0.5526 | 0.5078 |
| 0.51 | 4.0 | 1360 | 0.5257 | 0.5 |
| 0.5162 | 5.0 | 1700 | 0.5071 | 0.5 |
| 0.5071 | 6.0 | 2040 | 0.4929 | 0.5 |
| 0.5071 | 7.0 | 2380 | 0.4955 | 0.5 |
| 0.509 | 8.0 | 2720 | 0.5280 | 0.5 |
| 0.5022 | 9.0 | 3060 | 0.4958 | 0.5 |
| 0.5022 | 10.0 | 3400 | 0.4944 | 0.5 |
| 0.5017 | 11.0 | 3740 | 0.4931 | 0.5 |
| 0.5011 | 12.0 | 4080 | 0.4944 | 0.5 |
| 0.5011 | 13.0 | 4420 | 0.5177 | 0.5 |
| 0.4978 | 14.0 | 4760 | 0.4933 | 0.5 |
| 0.5039 | 15.0 | 5100 | 0.5001 | 0.5 |
| 0.5039 | 16.0 | 5440 | 0.4929 | 0.5 |
| 0.5008 | 17.0 | 5780 | 0.4961 | 0.5 |
| 0.4986 | 18.0 | 6120 | 0.4948 | 0.5 |
| 0.4986 | 19.0 | 6460 | 0.4993 | 0.5 |
| 0.499 | 20.0 | 6800 | 0.4943 | 0.5 |
| 0.4981 | 21.0 | 7140 | 0.4930 | 0.5 |
| 0.4981 | 22.0 | 7480 | 0.5119 | 0.5 |
| 0.5013 | 23.0 | 7820 | 0.4972 | 0.5 |
| 0.498 | 24.0 | 8160 | 0.4938 | 0.5 |
| 0.5 | 25.0 | 8500 | 0.4946 | 0.5 |
| 0.5 | 26.0 | 8840 | 0.5212 | 0.5 |
| 0.4994 | 27.0 | 9180 | 0.5028 | 0.5 |
| 0.4978 | 28.0 | 9520 | 0.4929 | 0.5 |
| 0.4978 | 29.0 | 9860 | 0.4993 | 0.5 |
| 0.4991 | 30.0 | 10200 | 0.4925 | 0.5 |
| 0.4987 | 31.0 | 10540 | 0.4929 | 0.5 |
| 0.4987 | 32.0 | 10880 | 0.5076 | 0.5 |
| 0.4989 | 33.0 | 11220 | 0.4931 | 0.5 |
| 0.4982 | 34.0 | 11560 | 0.5071 | 0.5 |
| 0.4982 | 35.0 | 11900 | 0.4959 | 0.5 |
| 0.4978 | 36.0 | 12240 | 0.5013 | 0.5 |
| 0.4982 | 37.0 | 12580 | 0.4927 | 0.5 |
| 0.4982 | 38.0 | 12920 | 0.4938 | 0.5 |
| 0.4968 | 39.0 | 13260 | 0.5018 | 0.5 |
| 0.4961 | 40.0 | 13600 | 0.4958 | 0.5 |
| 0.4961 | 41.0 | 13940 | 0.4928 | 0.5 |
| 0.4969 | 42.0 | 14280 | 0.4950 | 0.5 |
| 0.4951 | 43.0 | 14620 | 0.4929 | 0.5 |
| 0.4951 | 44.0 | 14960 | 0.4928 | 0.5 |
| 0.4964 | 45.0 | 15300 | 0.4965 | 0.5 |
| 0.4943 | 46.0 | 15640 | 0.4943 | 0.5 |
| 0.4943 | 47.0 | 15980 | 0.4982 | 0.5 |
| 0.4965 | 48.0 | 16320 | 0.4926 | 0.5 |
| 0.497 | 49.0 | 16660 | 0.4969 | 0.5 |
| 0.4959 | 50.0 | 17000 | 0.4930 | 0.5 |
| 0.4959 | 51.0 | 17340 | 0.4928 | 0.5 |
| 0.4932 | 52.0 | 17680 | 0.4926 | 0.5 |
| 0.4969 | 53.0 | 18020 | 0.4961 | 0.5 |
| 0.4969 | 54.0 | 18360 | 0.4935 | 0.5 |
| 0.4937 | 55.0 | 18700 | 0.4926 | 0.5 |
| 0.4937 | 56.0 | 19040 | 0.4926 | 0.5 |
| 0.4937 | 57.0 | 19380 | 0.5036 | 0.5 |
| 0.4951 | 58.0 | 19720 | 0.4930 | 0.5 |
| 0.4939 | 59.0 | 20060 | 0.5071 | 0.5 |
| 0.4939 | 60.0 | 20400 | 0.4927 | 0.5 |
| 0.4929 | 61.0 | 20740 | 0.4928 | 0.5 |
| 0.4926 | 62.0 | 21080 | 0.4928 | 0.5 |
| 0.4926 | 63.0 | 21420 | 0.4936 | 0.5 |
| 0.4917 | 64.0 | 21760 | 0.4967 | 0.5 |
| 0.4951 | 65.0 | 22100 | 0.4941 | 0.5 |
| 0.4951 | 66.0 | 22440 | 0.5071 | 0.5 |
| 0.4895 | 67.0 | 22780 | 0.4932 | 0.5 |
| 0.4939 | 68.0 | 23120 | 0.4930 | 0.5 |
| 0.4939 | 69.0 | 23460 | 0.4938 | 0.5 |
| 0.4919 | 70.0 | 23800 | 0.4935 | 0.5 |
| 0.4915 | 71.0 | 24140 | 0.4934 | 0.5 |
| 0.4915 | 72.0 | 24480 | 0.4962 | 0.5 |
| 0.4898 | 73.0 | 24820 | 0.4958 | 0.5 |
| 0.4919 | 74.0 | 25160 | 0.4967 | 0.5 |
| 0.4905 | 75.0 | 25500 | 0.4961 | 0.5 |
| 0.4905 | 76.0 | 25840 | 0.4986 | 0.5 |
| 0.4908 | 77.0 | 26180 | 0.4958 | 0.5 |
| 0.4897 | 78.0 | 26520 | 0.4974 | 0.5 |
| 0.4897 | 79.0 | 26860 | 0.4992 | 0.5 |
| 0.4897 | 80.0 | 27200 | 0.4974 | 0.5 |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3
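
To reproduce the setup, it may help to match these pins; a small sketch that checks a local environment against the versions listed above:

```python
# Compare installed library versions against the versions pinned in this card.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "Transformers": (transformers, "4.26.1"),
    "PyTorch": (torch, "2.0.1+cu118"),
    "Datasets": (datasets, "2.12.0"),
    "Tokenizers": (tokenizers, "0.13.3"),
}
for name, (module, version) in expected.items():
    print(f"{name}: installed {module.__version__}, card {version}")
```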