
roberta-base-sst-2-64-13-smoothed

This model is a fine-tuned version of roberta-base on an unspecified dataset (the model name suggests a small subset of SST-2, trained with label smoothing). It achieves the following results on the evaluation set:

  • Loss: 0.6071
  • Accuracy: 0.8672

Model description

More information needed

Intended uses & limitations

More information needed
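
Pending further documentation, the model should load through the standard Transformers sequence-classification API. The sketch below assumes this; the repository id is inferred from the model name and may need an owner/namespace prefix, and the 0/1 label mapping is not documented in this card.

```python
# Minimal usage sketch. The repo id is assumed from the model name and the
# label mapping (which index is positive/negative) is undocumented here.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "roberta-base-sst-2-64-13-smoothed"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("A thoroughly enjoyable film.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index (0 or 1)
```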

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • num_epochs: 75
  • label_smoothing_factor: 0.45
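
For reference, these settings map onto `transformers.TrainingArguments` roughly as sketched below. This is not the original training script: the toy two-sentence datasets stand in for the undocumented training and evaluation splits, and `evaluation_strategy="epoch"` is an assumption based on the per-epoch log in the results table.

```python
# Sketch of how the listed hyperparameters map onto TrainingArguments.
# The Adam betas/epsilon above are the Trainer defaults, so they need no flag.
import torch
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)


class ToyDataset(torch.utils.data.Dataset):
    """Stand-in for the undocumented training/evaluation splits."""

    def __init__(self, texts, labels):
        self.enc = tokenizer(texts, truncation=True, padding=True)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item


train_ds = ToyDataset(["a gripping drama", "a tedious mess"], [1, 0])
eval_ds = ToyDataset(["quietly moving", "entirely forgettable"], [1, 0])

args = TrainingArguments(
    output_dir="roberta-base-sst-2-64-13-smoothed",
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=50,
    num_train_epochs=75,
    label_smoothing_factor=0.45,   # unusually strong smoothing toward uniform targets
    evaluation_strategy="epoch",   # assumed; matches the per-epoch validation log
)

Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=eval_ds).train()
```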

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 4 | 0.6941 | 0.5 |
| No log | 2.0 | 8 | 0.6939 | 0.5 |
| 0.694 | 3.0 | 12 | 0.6936 | 0.5 |
| 0.694 | 4.0 | 16 | 0.6932 | 0.5 |
| 0.6948 | 5.0 | 20 | 0.6929 | 0.5 |
| 0.6948 | 6.0 | 24 | 0.6925 | 0.5 |
| 0.6948 | 7.0 | 28 | 0.6922 | 0.5469 |
| 0.6948 | 8.0 | 32 | 0.6919 | 0.6719 |
| 0.6948 | 9.0 | 36 | 0.6914 | 0.7266 |
| 0.6908 | 10.0 | 40 | 0.6907 | 0.75 |
| 0.6908 | 11.0 | 44 | 0.6894 | 0.6719 |
| 0.6908 | 12.0 | 48 | 0.6866 | 0.6328 |
| 0.6835 | 13.0 | 52 | 0.6789 | 0.7891 |
| 0.6835 | 14.0 | 56 | 0.6514 | 0.8828 |
| 0.637 | 15.0 | 60 | 0.6004 | 0.875 |
| 0.637 | 16.0 | 64 | 0.6097 | 0.8984 |
| 0.637 | 17.0 | 68 | 0.6147 | 0.8516 |
| 0.5653 | 18.0 | 72 | 0.5973 | 0.8672 |
| 0.5653 | 19.0 | 76 | 0.6056 | 0.875 |
| 0.544 | 20.0 | 80 | 0.6077 | 0.875 |
| 0.544 | 21.0 | 84 | 0.5947 | 0.8672 |
| 0.544 | 22.0 | 88 | 0.6029 | 0.8828 |
| 0.5384 | 23.0 | 92 | 0.6067 | 0.8828 |
| 0.5384 | 24.0 | 96 | 0.5998 | 0.8828 |
| 0.5361 | 25.0 | 100 | 0.5978 | 0.8906 |
| 0.5361 | 26.0 | 104 | 0.6004 | 0.875 |
| 0.5361 | 27.0 | 108 | 0.6055 | 0.8672 |
| 0.5364 | 28.0 | 112 | 0.6064 | 0.8672 |
| 0.5364 | 29.0 | 116 | 0.5991 | 0.8906 |
| 0.5364 | 30.0 | 120 | 0.5973 | 0.8906 |
| 0.5364 | 31.0 | 124 | 0.6019 | 0.8828 |
| 0.5364 | 32.0 | 128 | 0.6085 | 0.8594 |
| 0.5358 | 33.0 | 132 | 0.6069 | 0.8672 |
| 0.5358 | 34.0 | 136 | 0.6075 | 0.8594 |
| 0.5357 | 35.0 | 140 | 0.6022 | 0.8828 |
| 0.5357 | 36.0 | 144 | 0.5980 | 0.8906 |
| 0.5357 | 37.0 | 148 | 0.5983 | 0.8984 |
| 0.5359 | 38.0 | 152 | 0.5962 | 0.8984 |
| 0.5359 | 39.0 | 156 | 0.5965 | 0.8984 |
| 0.5358 | 40.0 | 160 | 0.6007 | 0.8984 |
| 0.5358 | 41.0 | 164 | 0.6010 | 0.8984 |
| 0.5358 | 42.0 | 168 | 0.5975 | 0.8984 |
| 0.5355 | 43.0 | 172 | 0.5975 | 0.8906 |
| 0.5355 | 44.0 | 176 | 0.6012 | 0.8906 |
| 0.5354 | 45.0 | 180 | 0.6027 | 0.8828 |
| 0.5354 | 46.0 | 184 | 0.6027 | 0.8828 |
| 0.5354 | 47.0 | 188 | 0.6018 | 0.8828 |
| 0.5355 | 48.0 | 192 | 0.6070 | 0.875 |
| 0.5355 | 49.0 | 196 | 0.6090 | 0.8672 |
| 0.5352 | 50.0 | 200 | 0.6090 | 0.8672 |
| 0.5352 | 51.0 | 204 | 0.6079 | 0.8672 |
| 0.5352 | 52.0 | 208 | 0.6072 | 0.8906 |
| 0.5354 | 53.0 | 212 | 0.6063 | 0.8906 |
| 0.5354 | 54.0 | 216 | 0.6045 | 0.8672 |
| 0.5353 | 55.0 | 220 | 0.6094 | 0.8672 |
| 0.5353 | 56.0 | 224 | 0.6167 | 0.8438 |
| 0.5353 | 57.0 | 228 | 0.6176 | 0.8516 |
| 0.5353 | 58.0 | 232 | 0.6188 | 0.8516 |
| 0.5353 | 59.0 | 236 | 0.6204 | 0.8516 |
| 0.5353 | 60.0 | 240 | 0.6218 | 0.8438 |
| 0.5353 | 61.0 | 244 | 0.6222 | 0.8516 |
| 0.5353 | 62.0 | 248 | 0.6208 | 0.8516 |
| 0.5352 | 63.0 | 252 | 0.6194 | 0.8516 |
| 0.5352 | 64.0 | 256 | 0.6167 | 0.8438 |
| 0.5351 | 65.0 | 260 | 0.6144 | 0.8438 |
| 0.5351 | 66.0 | 264 | 0.6128 | 0.8516 |
| 0.5351 | 67.0 | 268 | 0.6117 | 0.8594 |
| 0.5349 | 68.0 | 272 | 0.6112 | 0.8594 |
| 0.5349 | 69.0 | 276 | 0.6114 | 0.8672 |
| 0.5351 | 70.0 | 280 | 0.6089 | 0.8672 |
| 0.5351 | 71.0 | 284 | 0.6077 | 0.875 |
| 0.5351 | 72.0 | 288 | 0.6073 | 0.875 |
| 0.5352 | 73.0 | 292 | 0.6072 | 0.8672 |
| 0.5352 | 74.0 | 296 | 0.6071 | 0.8672 |
| 0.5355 | 75.0 | 300 | 0.6071 | 0.8672 |

Framework versions

  • Transformers 4.32.0.dev0
  • Pytorch 2.0.1+cu118
  • Datasets 2.4.0
  • Tokenizers 0.13.3