20230826064921

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2753
  • Accuracy: 0.71
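
The repository does not declare a pipeline type. As a starting point, here is a minimal loading sketch, assuming the checkpoint carries a sequence-classification head (an assumption suggested by the accuracy metric; adjust the head class if the task differs):

```python
# Hedged loading sketch. Assumption: the checkpoint has a
# sequence-classification head; the example sentence pair is illustrative.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("dkqjrm/20230826064921")
model = AutoModelForSequenceClassification.from_pretrained("dkqjrm/20230826064921")

# Encode a premise/hypothesis pair and read off the predicted class.
inputs = tokenizer("Example premise.", "Example hypothesis.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.argmax(dim=-1))
```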

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
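
The card names super_glue but not a specific configuration. If the subset becomes known, the data can be inspected with the datasets library; the "cb" configuration below is purely an illustrative placeholder:

```python
# Hedged sketch: "cb" is a placeholder configuration, not a documented
# fact about this model. Swap in the actual super_glue subset once known.
from datasets import load_dataset

dataset = load_dataset("super_glue", "cb")
print(dataset["train"][0])          # inspect one training example
print(dataset["validation"].num_rows)  # size of the evaluation split
```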

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged reconstruction sketch follows the list):

  • learning_rate: 0.02
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80.0
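
These values map onto Transformers TrainingArguments. The sketch below is a hedged reconstruction only: the original training script is not published, so output_dir and everything not listed above are assumptions or library defaults.

```python
# Hedged reconstruction of the logged configuration with the Trainer API.
# Only the values listed above are sourced; the rest are defaults/assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="out",                 # assumption, not logged in the card
    learning_rate=0.02,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
)
```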

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 25   | 0.3969          | 0.6      |
| No log        | 2.0   | 50   | 0.4709          | 0.5      |
| No log        | 3.0   | 75   | 0.3341          | 0.42     |
| No log        | 4.0   | 100  | 0.3011          | 0.54     |
| No log        | 5.0   | 125  | 0.3119          | 0.36     |
| No log        | 6.0   | 150  | 0.3297          | 0.37     |
| No log        | 7.0   | 175  | 0.2928          | 0.53     |
| No log        | 8.0   | 200  | 0.3079          | 0.63     |
| No log        | 9.0   | 225  | 0.2875          | 0.61     |
| No log        | 10.0  | 250  | 0.2906          | 0.54     |
| No log        | 11.0  | 275  | 0.2904          | 0.62     |
| No log        | 12.0  | 300  | 0.2946          | 0.52     |
| No log        | 13.0  | 325  | 0.2942          | 0.51     |
| No log        | 14.0  | 350  | 0.2935          | 0.56     |
| No log        | 15.0  | 375  | 0.2913          | 0.58     |
| No log        | 16.0  | 400  | 0.2886          | 0.6      |
| No log        | 17.0  | 425  | 0.2900          | 0.6      |
| No log        | 18.0  | 450  | 0.2874          | 0.59     |
| No log        | 19.0  | 475  | 0.2910          | 0.6      |
| 0.6674        | 20.0  | 500  | 0.2931          | 0.47     |
| 0.6674        | 21.0  | 525  | 0.2909          | 0.51     |
| 0.6674        | 22.0  | 550  | 0.2855          | 0.62     |
| 0.6674        | 23.0  | 575  | 0.2881          | 0.61     |
| 0.6674        | 24.0  | 600  | 0.2878          | 0.6      |
| 0.6674        | 25.0  | 625  | 0.2874          | 0.57     |
| 0.6674        | 26.0  | 650  | 0.2857          | 0.54     |
| 0.6674        | 27.0  | 675  | 0.2871          | 0.6      |
| 0.6674        | 28.0  | 700  | 0.2864          | 0.59     |
| 0.6674        | 29.0  | 725  | 0.2862          | 0.62     |
| 0.6674        | 30.0  | 750  | 0.2866          | 0.58     |
| 0.6674        | 31.0  | 775  | 0.2837          | 0.63     |
| 0.6674        | 32.0  | 800  | 0.2859          | 0.58     |
| 0.6674        | 33.0  | 825  | 0.2841          | 0.59     |
| 0.6674        | 34.0  | 850  | 0.2878          | 0.62     |
| 0.6674        | 35.0  | 875  | 0.2889          | 0.61     |
| 0.6674        | 36.0  | 900  | 0.2830          | 0.59     |
| 0.6674        | 37.0  | 925  | 0.2824          | 0.59     |
| 0.6674        | 38.0  | 950  | 0.2801          | 0.63     |
| 0.6674        | 39.0  | 975  | 0.2931          | 0.65     |
| 0.5477        | 40.0  | 1000 | 0.2788          | 0.64     |
| 0.5477        | 41.0  | 1025 | 0.2892          | 0.63     |
| 0.5477        | 42.0  | 1050 | 0.2937          | 0.58     |
| 0.5477        | 43.0  | 1075 | 0.2886          | 0.66     |
| 0.5477        | 44.0  | 1100 | 0.2842          | 0.62     |
| 0.5477        | 45.0  | 1125 | 0.2857          | 0.6      |
| 0.5477        | 46.0  | 1150 | 0.2834          | 0.62     |
| 0.5477        | 47.0  | 1175 | 0.2824          | 0.56     |
| 0.5477        | 48.0  | 1200 | 0.2866          | 0.65     |
| 0.5477        | 49.0  | 1225 | 0.2801          | 0.63     |
| 0.5477        | 50.0  | 1250 | 0.2851          | 0.62     |
| 0.5477        | 51.0  | 1275 | 0.2829          | 0.6      |
| 0.5477        | 52.0  | 1300 | 0.2900          | 0.59     |
| 0.5477        | 53.0  | 1325 | 0.2782          | 0.59     |
| 0.5477        | 54.0  | 1350 | 0.2793          | 0.59     |
| 0.5477        | 55.0  | 1375 | 0.2809          | 0.6      |
| 0.5477        | 56.0  | 1400 | 0.2815          | 0.64     |
| 0.5477        | 57.0  | 1425 | 0.2798          | 0.68     |
| 0.5477        | 58.0  | 1450 | 0.2831          | 0.67     |
| 0.5477        | 59.0  | 1475 | 0.2795          | 0.66     |
| 0.4601        | 60.0  | 1500 | 0.2747          | 0.68     |
| 0.4601        | 61.0  | 1525 | 0.2725          | 0.73     |
| 0.4601        | 62.0  | 1550 | 0.2840          | 0.66     |
| 0.4601        | 63.0  | 1575 | 0.2739          | 0.67     |
| 0.4601        | 64.0  | 1600 | 0.2796          | 0.69     |
| 0.4601        | 65.0  | 1625 | 0.2782          | 0.65     |
| 0.4601        | 66.0  | 1650 | 0.2757          | 0.7      |
| 0.4601        | 67.0  | 1675 | 0.2759          | 0.69     |
| 0.4601        | 68.0  | 1700 | 0.2779          | 0.67     |
| 0.4601        | 69.0  | 1725 | 0.2822          | 0.67     |
| 0.4601        | 70.0  | 1750 | 0.2813          | 0.65     |
| 0.4601        | 71.0  | 1775 | 0.2818          | 0.68     |
| 0.4601        | 72.0  | 1800 | 0.2865          | 0.69     |
| 0.4601        | 73.0  | 1825 | 0.2770          | 0.71     |
| 0.4601        | 74.0  | 1850 | 0.2822          | 0.69     |
| 0.4601        | 75.0  | 1875 | 0.2783          | 0.71     |
| 0.4601        | 76.0  | 1900 | 0.2764          | 0.71     |
| 0.4601        | 77.0  | 1925 | 0.2772          | 0.69     |
| 0.4601        | 78.0  | 1950 | 0.2759          | 0.7      |
| 0.4601        | 79.0  | 1975 | 0.2751          | 0.72     |
| 0.4329        | 80.0  | 2000 | 0.2753          | 0.71     |

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3