README.md · adityapatkar/TweeBERTa at main

metadata

license: apache-2.0
datasets:
  - sentiment140
language:
  - en
library_name: transformers
pipeline_tag: text-classification
widget:
  - text: I liked this movie
    output:
      - label: PROBABILITY POSITIVE
        score: 0.8

Model Description

TweeBERTa is a fine-tuned version of the RoBERTa base model for binary sentiment analysis. It was trained on the Sentiment140 dataset of tweets, so it is best suited to classifying sentiment in short, informal text such as social media posts.

Training and Evaluation

Training Data

The model was trained on Sentiment140, a widely used sentiment analysis dataset built from tweets.

Training Procedure

Loss Function: Binary Cross Entropy Loss
Optimizer: Adam
Learning Rate Schedule: Linear decay, starting at 1e-5 and ending at 1e-7
Epochs: 10 in total, run as two cycles of 5 epochs each, with the same learning rate schedule repeated in each cycle.
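
The schedule described above can be sketched in plain Python. This is only an illustration of a linear decay that restarts for the second cycle; the function name `linear_lr` and the choice to update the rate once per epoch (rather than once per batch) are assumptions, not details taken from the training notebook.

```python
def linear_lr(step: int, steps_per_cycle: int,
              lr_start: float = 1e-5, lr_end: float = 1e-7) -> float:
    """Return the learning rate for a global step.

    The rate falls linearly from lr_start to lr_end within a cycle and
    restarts at lr_start when the next cycle begins.
    """
    if steps_per_cycle < 2:
        raise ValueError("steps_per_cycle must be at least 2")
    position = step % steps_per_cycle            # index inside the current cycle
    fraction = position / (steps_per_cycle - 1)  # 0.0 at cycle start, 1.0 at cycle end
    return lr_start + fraction * (lr_end - lr_start)


# Assumption: one rate update per epoch, 2 cycles x 5 epochs = 10 epochs.
EPOCHS_PER_CYCLE = 5
schedule = [linear_lr(epoch, EPOCHS_PER_CYCLE) for epoch in range(10)]

for epoch, lr in enumerate(schedule, start=1):
    print(f"epoch {epoch:2d}: lr = {lr:.3e}")
```

Under these assumptions, epochs 1 and 6 both start at 1e-5, and epochs 5 and 10 both end at 1e-7.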

Performance

The model achieved the following metrics on the evaluation set:

Precision: 0.8328
Recall: 0.8687
F1 Score: 0.8504
Accuracy: 0.8471
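
As a quick consistency check, the F1 score can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The helper name `f1_score` below is just for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in this card.
reported_precision = 0.8328
reported_recall = 0.8687

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.4f}")
```

The result rounds to 0.8504, which agrees with the F1 score listed above.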

How to Use

This model is intended for sentiment analysis of short texts, particularly social media posts. It can be used directly through the transformers library. An example input and output are given in the widget section of this card's metadata.
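
A minimal sketch of loading the model with the transformers `pipeline` API follows. It assumes the Hub repository id is `adityapatkar/TweeBERTa` and that the checkpoint loads with the standard text-classification pipeline; neither is confirmed by this card. The `is_positive` helper is hypothetical and assumes the returned `score` is the probability of positive sentiment, as the widget example suggests. The actual label names returned by the pipeline may differ.

```python
from typing import Dict, List


def classify(texts: List[str], model_id: str = "adityapatkar/TweeBERTa") -> List[Dict]:
    """Run the Hugging Face text-classification pipeline on a list of strings.

    Assumption: model_id is the Hub repository for this card and the
    checkpoint is loadable by the standard pipeline.
    """
    # Imported lazily so the helper below works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    return classifier(texts)


def is_positive(prediction: Dict, threshold: float = 0.5) -> bool:
    """Hypothetical helper: treat `score` as the probability of positive
    sentiment, as the widget example in this card suggests."""
    return prediction["score"] >= threshold


# Example call (downloads the model on first use):
# predictions = classify(["I liked this movie"])
# print(predictions[0], is_positive(predictions[0]))
```

Inspect the raw pipeline output before relying on a threshold, because the meaning of `score` depends on how the classification head and labels were configured.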

Limitations and Bias

The model reaches about 85% accuracy on the Sentiment140 evaluation set, but it may not generalize as well to text from other domains or to complex or subtle expressions of sentiment, such as sarcasm. Users should also be aware that biases in the training data may be reflected in the model's predictions.

Ethical Considerations

This model should be used responsibly, considering potential biases and the impact of automated sentiment analysis in various applications, particularly those affecting human decision-making.

Acknowledgements

This model was fine-tuned and evaluated by Aditya Patkar. It builds on the base RoBERTa model and the Sentiment140 dataset. The training notebook, along with a comprehensive comparative analysis of different models on the Sentiment140 dataset, can be found at https://github.com/adityapatkar/SentimentSifter.