roberta-large-sst-2-16-13

This model is a fine-tuned version of roberta-large on an unspecified dataset (the model name suggests a few-shot SST-2 setup). It achieves the following results on the evaluation set; a short usage sketch follows the results:

  • Loss: 0.4022
  • Accuracy: 0.7812
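
A minimal usage sketch, assuming the checkpoint is published under a Hugging Face repo id such as `simonycl/roberta-large-sst-2-16-13` (this id is an assumption based on the model name; substitute a local checkpoint path if the weights are stored locally):

```python
# Minimal sketch: load the fine-tuned checkpoint for binary sentiment classification.
# The repo id below is an assumption; replace it with the actual model path if different.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="simonycl/roberta-large-sst-2-16-13",  # assumed repo id; a local path also works
)

print(classifier("A thoroughly enjoyable film with a heartfelt performance."))
# Output has the form [{'label': 'LABEL_1', 'score': ...}]; how LABEL_0/LABEL_1 map to
# negative/positive depends on the label configuration saved with the model.
```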

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 150
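
A minimal sketch of how these settings map onto Hugging Face `TrainingArguments`, as an illustration rather than the original training script. The output directory, evaluation strategy, and logging interval are assumptions (the results table below evaluates every epoch and logs training loss every 10 steps):

```python
# Sketch only: maps the listed hyperparameters onto TrainingArguments.
# Dataset loading, tokenization, and the Trainer setup are omitted.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="roberta-large-sst-2-16-13",  # assumed output directory
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=150,
    evaluation_strategy="epoch",  # assumed: validation loss is reported once per epoch
    logging_steps=10,             # assumed: training loss appears every 10 steps in the table
)
```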

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 1 | 0.6926 | 0.5 |
| No log | 2.0 | 2 | 0.6926 | 0.5 |
| No log | 3.0 | 3 | 0.6926 | 0.5 |
| No log | 4.0 | 4 | 0.6926 | 0.5 |
| No log | 5.0 | 5 | 0.6926 | 0.5 |
| No log | 6.0 | 6 | 0.6926 | 0.5 |
| No log | 7.0 | 7 | 0.6925 | 0.5 |
| No log | 8.0 | 8 | 0.6925 | 0.5 |
| No log | 9.0 | 9 | 0.6925 | 0.5 |
| 0.6898 | 10.0 | 10 | 0.6925 | 0.5 |
| 0.6898 | 11.0 | 11 | 0.6924 | 0.5 |
| 0.6898 | 12.0 | 12 | 0.6924 | 0.5 |
| 0.6898 | 13.0 | 13 | 0.6924 | 0.5 |
| 0.6898 | 14.0 | 14 | 0.6924 | 0.5 |
| 0.6898 | 15.0 | 15 | 0.6923 | 0.5 |
| 0.6898 | 16.0 | 16 | 0.6923 | 0.5 |
| 0.6898 | 17.0 | 17 | 0.6922 | 0.5 |
| 0.6898 | 18.0 | 18 | 0.6922 | 0.5 |
| 0.6898 | 19.0 | 19 | 0.6922 | 0.5 |
| 0.694 | 20.0 | 20 | 0.6921 | 0.5 |
| 0.694 | 21.0 | 21 | 0.6921 | 0.5 |
| 0.694 | 22.0 | 22 | 0.6920 | 0.5 |
| 0.694 | 23.0 | 23 | 0.6920 | 0.5 |
| 0.694 | 24.0 | 24 | 0.6920 | 0.5 |
| 0.694 | 25.0 | 25 | 0.6919 | 0.5 |
| 0.694 | 26.0 | 26 | 0.6919 | 0.5 |
| 0.694 | 27.0 | 27 | 0.6918 | 0.5 |
| 0.694 | 28.0 | 28 | 0.6918 | 0.5 |
| 0.694 | 29.0 | 29 | 0.6918 | 0.5 |
| 0.7021 | 30.0 | 30 | 0.6917 | 0.5 |
| 0.7021 | 31.0 | 31 | 0.6916 | 0.5 |
| 0.7021 | 32.0 | 32 | 0.6916 | 0.5 |
| 0.7021 | 33.0 | 33 | 0.6916 | 0.5 |
| 0.7021 | 34.0 | 34 | 0.6915 | 0.5 |
| 0.7021 | 35.0 | 35 | 0.6915 | 0.5 |
| 0.7021 | 36.0 | 36 | 0.6914 | 0.5 |
| 0.7021 | 37.0 | 37 | 0.6914 | 0.5 |
| 0.7021 | 38.0 | 38 | 0.6913 | 0.5 |
| 0.7021 | 39.0 | 39 | 0.6913 | 0.5 |
| 0.6798 | 40.0 | 40 | 0.6913 | 0.5 |
| 0.6798 | 41.0 | 41 | 0.6912 | 0.5 |
| 0.6798 | 42.0 | 42 | 0.6911 | 0.5 |
| 0.6798 | 43.0 | 43 | 0.6910 | 0.5 |
| 0.6798 | 44.0 | 44 | 0.6909 | 0.5 |
| 0.6798 | 45.0 | 45 | 0.6908 | 0.5 |
| 0.6798 | 46.0 | 46 | 0.6907 | 0.5 |
| 0.6798 | 47.0 | 47 | 0.6906 | 0.5 |
| 0.6798 | 48.0 | 48 | 0.6905 | 0.5 |
| 0.6798 | 49.0 | 49 | 0.6903 | 0.5 |
| 0.6874 | 50.0 | 50 | 0.6902 | 0.5 |
| 0.6874 | 51.0 | 51 | 0.6901 | 0.5 |
| 0.6874 | 52.0 | 52 | 0.6899 | 0.5 |
| 0.6874 | 53.0 | 53 | 0.6898 | 0.5 |
| 0.6874 | 54.0 | 54 | 0.6896 | 0.5 |
| 0.6874 | 55.0 | 55 | 0.6895 | 0.5 |
| 0.6874 | 56.0 | 56 | 0.6894 | 0.5 |
| 0.6874 | 57.0 | 57 | 0.6893 | 0.5 |
| 0.6874 | 58.0 | 58 | 0.6892 | 0.5 |
| 0.6874 | 59.0 | 59 | 0.6890 | 0.5 |
| 0.6878 | 60.0 | 60 | 0.6889 | 0.5 |
| 0.6878 | 61.0 | 61 | 0.6888 | 0.5 |
| 0.6878 | 62.0 | 62 | 0.6886 | 0.5 |
| 0.6878 | 63.0 | 63 | 0.6885 | 0.5 |
| 0.6878 | 64.0 | 64 | 0.6884 | 0.5 |
| 0.6878 | 65.0 | 65 | 0.6884 | 0.5 |
| 0.6878 | 66.0 | 66 | 0.6883 | 0.5 |
| 0.6878 | 67.0 | 67 | 0.6882 | 0.5 |
| 0.6878 | 68.0 | 68 | 0.6882 | 0.5 |
| 0.6878 | 69.0 | 69 | 0.6881 | 0.5 |
| 0.6805 | 70.0 | 70 | 0.6880 | 0.5312 |
| 0.6805 | 71.0 | 71 | 0.6878 | 0.5312 |
| 0.6805 | 72.0 | 72 | 0.6877 | 0.5312 |
| 0.6805 | 73.0 | 73 | 0.6874 | 0.5312 |
| 0.6805 | 74.0 | 74 | 0.6872 | 0.5312 |
| 0.6805 | 75.0 | 75 | 0.6870 | 0.5312 |
| 0.6805 | 76.0 | 76 | 0.6868 | 0.5312 |
| 0.6805 | 77.0 | 77 | 0.6865 | 0.5312 |
| 0.6805 | 78.0 | 78 | 0.6862 | 0.5 |
| 0.6805 | 79.0 | 79 | 0.6860 | 0.5 |
| 0.6675 | 80.0 | 80 | 0.6857 | 0.5 |
| 0.6675 | 81.0 | 81 | 0.6853 | 0.5312 |
| 0.6675 | 82.0 | 82 | 0.6849 | 0.5312 |
| 0.6675 | 83.0 | 83 | 0.6845 | 0.5312 |
| 0.6675 | 84.0 | 84 | 0.6840 | 0.5312 |
| 0.6675 | 85.0 | 85 | 0.6834 | 0.5625 |
| 0.6675 | 86.0 | 86 | 0.6827 | 0.5625 |
| 0.6675 | 87.0 | 87 | 0.6818 | 0.5625 |
| 0.6675 | 88.0 | 88 | 0.6809 | 0.5625 |
| 0.6675 | 89.0 | 89 | 0.6798 | 0.5625 |
| 0.65 | 90.0 | 90 | 0.6786 | 0.5625 |
| 0.65 | 91.0 | 91 | 0.6772 | 0.5625 |
| 0.65 | 92.0 | 92 | 0.6758 | 0.5625 |
| 0.65 | 93.0 | 93 | 0.6741 | 0.5625 |
| 0.65 | 94.0 | 94 | 0.6718 | 0.5625 |
| 0.65 | 95.0 | 95 | 0.6687 | 0.5625 |
| 0.65 | 96.0 | 96 | 0.6649 | 0.5625 |
| 0.65 | 97.0 | 97 | 0.6615 | 0.5625 |
| 0.65 | 98.0 | 98 | 0.6596 | 0.5625 |
| 0.65 | 99.0 | 99 | 0.6605 | 0.5625 |
| 0.611 | 100.0 | 100 | 0.6642 | 0.5625 |
| 0.611 | 101.0 | 101 | 0.6683 | 0.5625 |
| 0.611 | 102.0 | 102 | 0.6689 | 0.5625 |
| 0.611 | 103.0 | 103 | 0.6670 | 0.5625 |
| 0.611 | 104.0 | 104 | 0.6627 | 0.5312 |
| 0.611 | 105.0 | 105 | 0.6595 | 0.5312 |
| 0.611 | 106.0 | 106 | 0.6577 | 0.5625 |
| 0.611 | 107.0 | 107 | 0.6575 | 0.5938 |
| 0.611 | 108.0 | 108 | 0.6552 | 0.5938 |
| 0.611 | 109.0 | 109 | 0.6555 | 0.625 |
| 0.5787 | 110.0 | 110 | 0.6560 | 0.625 |
| 0.5787 | 111.0 | 111 | 0.6566 | 0.625 |
| 0.5787 | 112.0 | 112 | 0.6560 | 0.625 |
| 0.5787 | 113.0 | 113 | 0.6543 | 0.6562 |
| 0.5787 | 114.0 | 114 | 0.6530 | 0.6562 |
| 0.5787 | 115.0 | 115 | 0.6518 | 0.6562 |
| 0.5787 | 116.0 | 116 | 0.6512 | 0.6562 |
| 0.5787 | 117.0 | 117 | 0.6506 | 0.6562 |
| 0.5787 | 118.0 | 118 | 0.6500 | 0.6562 |
| 0.5787 | 119.0 | 119 | 0.6499 | 0.6875 |
| 0.5279 | 120.0 | 120 | 0.6497 | 0.6875 |
| 0.5279 | 121.0 | 121 | 0.6496 | 0.6875 |
| 0.5279 | 122.0 | 122 | 0.6494 | 0.6875 |
| 0.5279 | 123.0 | 123 | 0.6486 | 0.6875 |
| 0.5279 | 124.0 | 124 | 0.6472 | 0.6875 |
| 0.5279 | 125.0 | 125 | 0.6443 | 0.6875 |
| 0.5279 | 126.0 | 126 | 0.6397 | 0.6562 |
| 0.5279 | 127.0 | 127 | 0.6328 | 0.6562 |
| 0.5279 | 128.0 | 128 | 0.6238 | 0.6875 |
| 0.5279 | 129.0 | 129 | 0.6173 | 0.6875 |
| 0.4721 | 130.0 | 130 | 0.6138 | 0.6875 |
| 0.4721 | 131.0 | 131 | 0.6175 | 0.625 |
| 0.4721 | 132.0 | 132 | 0.6137 | 0.6562 |
| 0.4721 | 133.0 | 133 | 0.6101 | 0.6562 |
| 0.4721 | 134.0 | 134 | 0.6062 | 0.6562 |
| 0.4721 | 135.0 | 135 | 0.6027 | 0.6562 |
| 0.4721 | 136.0 | 136 | 0.6015 | 0.625 |
| 0.4721 | 137.0 | 137 | 0.5982 | 0.625 |
| 0.4721 | 138.0 | 138 | 0.6102 | 0.625 |
| 0.4721 | 139.0 | 139 | 0.5983 | 0.625 |
| 0.378 | 140.0 | 140 | 0.6020 | 0.625 |
| 0.378 | 141.0 | 141 | 0.5921 | 0.625 |
| 0.378 | 142.0 | 142 | 0.5790 | 0.625 |
| 0.378 | 143.0 | 143 | 0.5654 | 0.6562 |
| 0.378 | 144.0 | 144 | 0.5493 | 0.6562 |
| 0.378 | 145.0 | 145 | 0.5279 | 0.6562 |
| 0.378 | 146.0 | 146 | 0.5064 | 0.6562 |
| 0.378 | 147.0 | 147 | 0.4834 | 0.6875 |
| 0.378 | 148.0 | 148 | 0.4557 | 0.7188 |
| 0.378 | 149.0 | 149 | 0.4318 | 0.75 |
| 0.2537 | 150.0 | 150 | 0.4022 | 0.7812 |
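
The accuracy column is the kind of metric typically produced by a `compute_metrics` callback passed to `Trainer`. The sketch below shows one common way to compute it; this is an assumption about the evaluation code, not taken from the original training script:

```python
# Sketch only: a typical compute_metrics callback used with Trainer to report accuracy.
# This is an assumed reconstruction, not the original evaluation code.
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)  # pick the highest-scoring class per example
    return accuracy.compute(predictions=predictions, references=labels)
```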

Framework versions

  • Transformers 4.32.0.dev0
  • Pytorch 2.0.1+cu118
  • Datasets 2.4.0
  • Tokenizers 0.13.3