---
license: mit
tags:
  - generated_from_trainer
base_model: ibm/ColD-Fusion
model-index:
  - name: roberta_emo
    results: []
---

# roberta_emo

This model is a fine-tuned version of [ibm/ColD-Fusion](https://huggingface.co/ibm/ColD-Fusion) on an unknown dataset.
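
A minimal inference sketch, assuming the checkpoint is published as `gustavecortal/roberta_emo` (the model id referenced in the Model Recycling section below); since the fine-tuning dataset is unknown, the label names the model emits are not documented here:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint; the label set depends on the
# (undocumented) fine-tuning dataset, so inspect the returned labels.
classifier = pipeline("text-classification", model="gustavecortal/roberta_emo")

print(classifier("I can't believe how wonderful this day turned out!"))
# -> [{'label': ..., 'score': ...}]
```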

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):

- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1.0
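
As a reference point only, here is a minimal sketch of how the listed settings map onto `transformers.TrainingArguments`; `output_dir` is a placeholder and every setting not listed above is assumed to stay at the library default:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="roberta_emo",        # placeholder path, not from the card
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,                  # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=1.0,
)
```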

### Framework versions

- Transformers 4.25.1
- Pytorch 1.13.1
- Datasets 2.8.0
- Tokenizers 0.13.2
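
To reproduce this environment, the versions above can be pinned and checked at runtime; a small sketch:

```python
import datasets
import tokenizers
import torch
import transformers

# Versions this model was trained with (see the list above).
print(transformers.__version__)  # expected: 4.25.1
print(torch.__version__)         # expected: 1.13.1
print(datasets.__version__)      # expected: 2.8.0
print(tokenizers.__version__)    # expected: 0.13.2
```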

## Model Recycling

Evaluation on 36 datasets using gustavecortal/roberta_emo as a base model yields an average score of 78.47, compared to 76.22 for roberta-base.

As of 18 January 2023, the model ranks 2nd among all tested models for the roberta-base architecture.

Results:

| Dataset | Score |
|:--|--:|
| 20_newsgroup | 85.8205 |
| ag_news | 90.2333 |
| amazon_reviews_multi | 66.08 |
| anli | 52.1563 |
| boolq | 81.6208 |
| cb | 89.2857 |
| cola | 83.4132 |
| copa | 71 |
| dbpedia | 77.5 |
| esnli | 90.6963 |
| financial_phrasebank | 86.1 |
| imdb | 93.776 |
| isear | 73.0117 |
| mnli | 86.8186 |
| mrpc | 88.2353 |
| multirc | 64.0677 |
| poem_sentiment | 88.4615 |
| qnli | 92.8794 |
| qqp | 90.9523 |
| rotten_tomatoes | 91.3696 |
| rte | 83.3935 |
| sst2 | 95.7569 |
| sst_5bins | 57.4661 |
| stsb | 91.5106 |
| trec_coarse | 97.2 |
| trec_fine | 91.2 |
| tweet_ev_emoji | 45.994 |
| tweet_ev_emotion | 82.4771 |
| tweet_ev_hate | 52.4916 |
| tweet_ev_irony | 75.6378 |
| tweet_ev_offensive | 86.6279 |
| tweet_ev_sentiment | 70.8727 |
| wic | 68.4953 |
| wnli | 46.4789 |
| wsc | 63.4615 |
| yahoo_answers | 72.2667 |
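
As a sanity check, averaging the 36 per-dataset scores from the table reproduces the reported 78.47:

```python
# Per-dataset scores copied from the results table above.
scores = [
    85.8205, 90.2333, 66.08, 52.1563, 81.6208, 89.2857, 83.4132, 71.0,
    77.5, 90.6963, 86.1, 93.776, 73.0117, 86.8186, 88.2353, 64.0677,
    88.4615, 92.8794, 90.9523, 91.3696, 83.3935, 95.7569, 57.4661, 91.5106,
    97.2, 91.2, 45.994, 82.4771, 52.4916, 75.6378, 86.6279, 70.8727,
    68.4953, 46.4789, 63.4615, 72.2667,
]

print(f"{sum(scores) / len(scores):.2f}")  # 78.47
```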

For more information, see the Model Recycling page.