license: apache-2.0
datasets:
- sentiment140
language:
- en
library_name: transformers
pipeline_tag: text-classification
widget:
- text: I liked this movie
  output:
  - label: PROBABILITY POSITIVE
    score: 0.8
Model Description
TweeBERTa is a RoBERTa base model fine-tuned for sentiment analysis. It was trained on the Sentiment140 dataset, so it is best suited to classifying the sentiment of short social media text such as tweets.
Training and Evaluation
Training Data
The model was trained on the Sentiment140 dataset, a widely used sentiment analysis dataset built from tweets.
Training Procedure
- Loss Function: Binary Cross Entropy Loss
- Optimizer: Adam
- Learning Rate Schedule: Linear decay from 1e-5 to 1e-7
- Epochs: 10 in total, run as two cycles of 5 epochs each. Both cycles use the same learning rate schedule, so the rate restarts at 1e-5 at the beginning of the second cycle.
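The two-cycle learning rate schedule described above can be sketched in plain Python. This is an illustration only, not the training code: the function name `linear_cyclic_lr` is hypothetical, and it assumes the rate is updated once per epoch, which this card does not state (the original run may have updated it per batch).

```python
def linear_cyclic_lr(step, steps_per_cycle, lr_start=1e-5, lr_end=1e-7):
    """Learning rate at a global step for a linear decay that restarts every cycle."""
    if steps_per_cycle < 2:
        raise ValueError("steps_per_cycle must be at least 2")
    pos = step % steps_per_cycle         # position inside the current cycle
    frac = pos / (steps_per_cycle - 1)   # 0.0 at the cycle start, 1.0 at the cycle end
    return lr_start + frac * (lr_end - lr_start)


# Assumption: one scheduler step per epoch, 5 epochs per cycle, 2 cycles.
schedule = [linear_cyclic_lr(epoch, steps_per_cycle=5) for epoch in range(10)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch}: lr = {lr:.3e}")
```

Under these assumptions the rate falls from 1e-5 to 1e-7 over epochs 0 to 4, then jumps back to 1e-5 at epoch 5 and decays again.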
Performance
The model achieved the following metrics on the evaluation set:
- Precision: 0.8328
- Recall: 0.8687
- F1 Score: 0.8504
- Accuracy: 0.8471
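As a quick consistency check, the reported F1 score can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The snippet uses only the numbers listed above.

```python
precision = 0.8328
recall = 0.8687

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.4f}")  # prints "F1 = 0.8504"
```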
How to Use
The model is intended for sentiment analysis of social media posts and other short text snippets. It can be loaded directly with the transformers library. The widget section of this card shows an example input and output.
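A minimal sketch of loading the model through the transformers `pipeline` API follows. Several things here are assumptions rather than facts from this card: `MODEL_ID` is a placeholder that must be replaced with the model's actual Hub repository ID, the helper names `to_sentiment` and `classify` are hypothetical, the returned score is taken to be the probability of positive sentiment (as the widget label suggests), and the 0.5 cutoff is an arbitrary choice.

```python
MODEL_ID = "your-namespace/TweeBERTa"  # placeholder: replace with the real Hub repository ID


def to_sentiment(result, threshold=0.5):
    """Map one pipeline result, e.g. {"label": "PROBABILITY POSITIVE", "score": 0.8},
    to a sentiment string. Assumes the score is the positive-class probability;
    the 0.5 threshold is an assumption, not a value from the model card."""
    return "positive" if result["score"] >= threshold else "negative"


def classify(texts, model_id=MODEL_ID):
    """Run the text-classification pipeline and return (text, score, sentiment) tuples."""
    # Imported here so to_sentiment can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    results = classifier(texts)
    return [(text, res["score"], to_sentiment(res)) for text, res in zip(texts, results)]
```

With a valid repository ID, `classify(["I liked this movie"])` downloads the model on first use and returns one tuple per input text.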
Limitations and Bias
While the model shows high performance on the Sentiment140 dataset, it may not generalize as well to texts from different domains or those that contain complex or subtle expressions of sentiment. Users should also be aware of potential biases inherent in the training data, which may be reflected in the model's predictions.
Ethical Considerations
This model should be used responsibly, considering potential biases and the impact of automated sentiment analysis in various applications, particularly those affecting human decision-making.
Acknowledgements
This model was fine-tuned and evaluated by Aditya Patkar. The base RoBERTa model and the Sentiment140 dataset were important in developing this model. The training notebook, along with a comprehensive comparative analysis of different models on the Sentiment140 dataset, can be found at https://github.com/adityapatkar/SentimentSifter.