metadata
license: mit
base_model: roberta-large
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: roberta-large-sst-2-16-13-smoothed
    results: []

roberta-large-sst-2-16-13-smoothed

This model is a fine-tuned version of roberta-large; the training dataset is not recorded in this card. At the end of training (epoch 75), it achieves the following results on the evaluation set:

  • Loss: 0.6487
  • Accuracy: 0.75
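Every accuracy value reported in this card is consistent with a fraction of 32, which matches eval_batch_size. The sketch below checks this, assuming (the card does not state it) an evaluation set of 32 examples; `ASSUMED_EVAL_SIZE` and `implied_correct` are illustrative names, and the list holds the distinct accuracy values from the training results table.

```python
# Distinct accuracy values that appear in the training results table.
reported = [0.5, 0.5312, 0.5625, 0.5938, 0.625, 0.6562, 0.6875, 0.7188, 0.75, 0.7812]

# Assumption: the evaluation set has 32 examples (not stated in the card).
ASSUMED_EVAL_SIZE = 32


def implied_correct(acc, n=ASSUMED_EVAL_SIZE, tol=1e-4):
    """Return the number of correct predictions k such that k / n rounds to acc,
    or None if no whole number of correct predictions fits within tol."""
    k = round(acc * n)
    return k if abs(k / n - acc) <= tol else None


counts = {acc: implied_correct(acc) for acc in reported}
for acc, k in counts.items():
    print(f"accuracy {acc:<6} -> {k} / {ASSUMED_EVAL_SIZE} correct")
```

Under that assumption, the final accuracy of 0.75 corresponds to 24 of 32 examples, so a single prediction changes the metric by about 0.03. Any multiple of 32 would fit the figures equally well.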

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • num_epochs: 75
  • label_smoothing_factor: 0.45
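To make the interaction between learning_rate, lr_scheduler_type, and lr_scheduler_warmup_steps concrete, here is a minimal sketch of a linear warmup followed by linear decay. It assumes 75 optimizer steps in total, which is inferred from the training results table (one step per epoch for 75 epochs); the function name `linear_warmup_lr` is illustrative and not part of any library.

```python
def linear_warmup_lr(step, base_lr=1e-5, warmup_steps=50, total_steps=75):
    """Learning rate after `step` optimizer updates.

    Rises linearly from 0 to base_lr over warmup_steps, then decays
    linearly to 0 at total_steps. total_steps=75 is an assumption
    taken from the training log, not a listed hyperparameter.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 10, 25, 50, 60, 75):
    print(f"step {step:2d}: lr = {linear_warmup_lr(step):.2e}")
```

Under these assumptions the warmup covers the first 50 of 75 steps, so the learning rate only reaches its listed value of 1e-05 at step 50 and decays for the remaining 25 steps.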

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---------------|-------|------|-----------------|----------|
| No log | 1.0 | 1 | 0.7106 | 0.5 |
| No log | 2.0 | 2 | 0.7104 | 0.5 |
| No log | 3.0 | 3 | 0.7100 | 0.5 |
| No log | 4.0 | 4 | 0.7094 | 0.5 |
| No log | 5.0 | 5 | 0.7087 | 0.5 |
| No log | 6.0 | 6 | 0.7077 | 0.5 |
| No log | 7.0 | 7 | 0.7066 | 0.5 |
| No log | 8.0 | 8 | 0.7054 | 0.5 |
| No log | 9.0 | 9 | 0.7040 | 0.5 |
| 0.7172 | 10.0 | 10 | 0.7026 | 0.5 |
| 0.7172 | 11.0 | 11 | 0.7011 | 0.5 |
| 0.7172 | 12.0 | 12 | 0.6995 | 0.5 |
| 0.7172 | 13.0 | 13 | 0.6980 | 0.5 |
| 0.7172 | 14.0 | 14 | 0.6965 | 0.5312 |
| 0.7172 | 15.0 | 15 | 0.6951 | 0.5312 |
| 0.7172 | 16.0 | 16 | 0.6936 | 0.5312 |
| 0.7172 | 17.0 | 17 | 0.6921 | 0.5312 |
| 0.7172 | 18.0 | 18 | 0.6906 | 0.5312 |
| 0.7172 | 19.0 | 19 | 0.6895 | 0.5312 |
| 0.6997 | 20.0 | 20 | 0.6884 | 0.5312 |
| 0.6997 | 21.0 | 21 | 0.6874 | 0.5312 |
| 0.6997 | 22.0 | 22 | 0.6867 | 0.5625 |
| 0.6997 | 23.0 | 23 | 0.6860 | 0.5312 |
| 0.6997 | 24.0 | 24 | 0.6854 | 0.5938 |
| 0.6997 | 25.0 | 25 | 0.6846 | 0.6562 |
| 0.6997 | 26.0 | 26 | 0.6840 | 0.625 |
| 0.6997 | 27.0 | 27 | 0.6832 | 0.6562 |
| 0.6997 | 28.0 | 28 | 0.6826 | 0.6875 |
| 0.6997 | 29.0 | 29 | 0.6815 | 0.6875 |
| 0.6874 | 30.0 | 30 | 0.6804 | 0.6875 |
| 0.6874 | 31.0 | 31 | 0.6790 | 0.6875 |
| 0.6874 | 32.0 | 32 | 0.6772 | 0.6875 |
| 0.6874 | 33.0 | 33 | 0.6762 | 0.6562 |
| 0.6874 | 34.0 | 34 | 0.6753 | 0.6562 |
| 0.6874 | 35.0 | 35 | 0.6738 | 0.6875 |
| 0.6874 | 36.0 | 36 | 0.6725 | 0.6875 |
| 0.6874 | 37.0 | 37 | 0.6696 | 0.6875 |
| 0.6874 | 38.0 | 38 | 0.6687 | 0.6875 |
| 0.6874 | 39.0 | 39 | 0.6665 | 0.6875 |
| 0.6594 | 40.0 | 40 | 0.6643 | 0.6875 |
| 0.6594 | 41.0 | 41 | 0.6674 | 0.6875 |
| 0.6594 | 42.0 | 42 | 0.6733 | 0.6875 |
| 0.6594 | 43.0 | 43 | 0.6804 | 0.6875 |
| 0.6594 | 44.0 | 44 | 0.6731 | 0.6875 |
| 0.6594 | 45.0 | 45 | 0.6701 | 0.6875 |
| 0.6594 | 46.0 | 46 | 0.6687 | 0.6875 |
| 0.6594 | 47.0 | 47 | 0.6687 | 0.6562 |
| 0.6594 | 48.0 | 48 | 0.6757 | 0.625 |
| 0.6594 | 49.0 | 49 | 0.6739 | 0.6875 |
| 0.6089 | 50.0 | 50 | 0.6766 | 0.6875 |
| 0.6089 | 51.0 | 51 | 0.6724 | 0.6875 |
| 0.6089 | 52.0 | 52 | 0.6662 | 0.6875 |
| 0.6089 | 53.0 | 53 | 0.6664 | 0.6875 |
| 0.6089 | 54.0 | 54 | 0.6602 | 0.6875 |
| 0.6089 | 55.0 | 55 | 0.6505 | 0.6875 |
| 0.6089 | 56.0 | 56 | 0.6468 | 0.75 |
| 0.6089 | 57.0 | 57 | 0.6370 | 0.75 |
| 0.6089 | 58.0 | 58 | 0.6285 | 0.7812 |
| 0.6089 | 59.0 | 59 | 0.6267 | 0.7812 |
| 0.5694 | 60.0 | 60 | 0.6279 | 0.7812 |
| 0.5694 | 61.0 | 61 | 0.6364 | 0.7812 |
| 0.5694 | 62.0 | 62 | 0.6443 | 0.75 |
| 0.5694 | 63.0 | 63 | 0.6518 | 0.7812 |
| 0.5694 | 64.0 | 64 | 0.6634 | 0.7188 |
| 0.5694 | 65.0 | 65 | 0.6647 | 0.7188 |
| 0.5694 | 66.0 | 66 | 0.6679 | 0.7188 |
| 0.5694 | 67.0 | 67 | 0.6669 | 0.7188 |
| 0.5694 | 68.0 | 68 | 0.6626 | 0.7188 |
| 0.5694 | 69.0 | 69 | 0.6624 | 0.75 |
| 0.5618 | 70.0 | 70 | 0.6614 | 0.7188 |
| 0.5618 | 71.0 | 71 | 0.6592 | 0.75 |
| 0.5618 | 72.0 | 72 | 0.6571 | 0.75 |
| 0.5618 | 73.0 | 73 | 0.6541 | 0.75 |
| 0.5618 | 74.0 | 74 | 0.6499 | 0.75 |
| 0.5618 | 75.0 | 75 | 0.6487 | 0.75 |
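The loss values above are easier to interpret alongside the label_smoothing_factor of 0.45 listed under the training hyperparameters. The sketch below computes label-smoothed cross-entropy for a single example, assuming the common formulation (1 - ε) · NLL + ε · (mean over classes of -log p) and assuming a two-class task; neither assumption is stated in this card, and `smoothed_cross_entropy` is an illustrative helper, not a library function.

```python
import math


def smoothed_cross_entropy(probs, true_idx, epsilon=0.45):
    """Label-smoothed cross-entropy for one example.

    probs: predicted class probabilities (must sum to 1, all > 0).
    Assumed formulation: (1 - epsilon) * NLL of the true class
    plus epsilon * the mean of -log p over all classes.
    """
    log_probs = [math.log(p) for p in probs]
    nll = -log_probs[true_idx]
    uniform_term = -sum(log_probs) / len(probs)
    return (1 - epsilon) * nll + epsilon * uniform_term


# Assumption: binary classification (two classes).
chance = smoothed_cross_entropy([0.5, 0.5], true_idx=0)        # uninformative prediction
floor = smoothed_cross_entropy([0.775, 0.225], true_idx=0)     # prediction equal to the smoothed target
confident = smoothed_cross_entropy([0.99, 0.01], true_idx=0)   # near-certain correct prediction

print(f"chance-level loss:     {chance:.4f}")
print(f"lowest reachable loss: {floor:.4f}")
print(f"over-confident loss:   {confident:.4f}")
```

Under these assumptions, a chance-level prediction gives a loss of ln 2 (about 0.693) and the lowest reachable loss is about 0.533, attained when the model predicts 0.775 for the correct class. That places the training losses in the table (0.7172 down to 0.5618) between chance and the floor, and it means an over-confident correct prediction is penalized rather than rewarded. Whether the validation loss in this run was computed with smoothing applied is not stated in the card.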

Framework versions

  • Transformers 4.32.0.dev0
  • PyTorch 2.0.1+cu118
  • Datasets 2.4.0
  • Tokenizers 0.13.3