Model Card for Sentiment Analysis on Primate Dataset
This model card provides details about a sentiment analysis model trained on a dataset containing posts related to primates. The model predicts sentiment labels for textual data using transformer-based architectures.
Model Details
Model Description
The model classifies text into sentiment categories (positive, negative, or neutral) using a transformer-based architecture for sequence classification.
- Developed by: Jaskaran Singh
- Model type: Transformer-based sentiment analysis model
- Language(s) (NLP): English
- License: MIT
- Finetuned from model: a transformer-based pre-trained checkpoint (base model not specified)
Model Sources
- Repository: https://github.com/JaskaranSingh-01/Sentiment_Analyzer
- Demo: https://sentimentanalyzer-f76oxwautwypxpea4lj3wg.streamlit.app/
Uses
Direct Use
The model can be directly used for sentiment analysis tasks, particularly on textual data related to primates.
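As a minimal direct-use sketch, the checkpoint referenced in the quick-start code below can be wrapped in the Transformers pipeline API. The raw label names returned by the pipeline depend on the model's configuration, and the example sentence is illustrative:

from transformers import pipeline

# Minimal direct-use sketch; the label names in the output depend on the
# model's configuration (see the label mapping in the quick-start code below).
classifier = pipeline("text-classification", model="sbcBI/sentiment_analysis_model")
print(classifier("The gorillas at the sanctuary seem healthy and active."))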
Downstream Use
The model can be fine-tuned for specific downstream tasks or integrated into larger applications requiring sentiment analysis functionality.
Bias, Risks, and Limitations
Bias
The model's predictions may reflect biases present in the training data, including any biases related to primates or sentiment labeling.
Risks
- Misclassification: The model may misclassify sentiment due to ambiguity or complexity in the text.
- Generalization: The model's performance may vary across different domains or datasets.
Limitations
- Limited Domain: The model's effectiveness may be limited to text related to primates.
- Cultural Bias: The model's performance may be influenced by cultural nuances present in the training data.
Recommendations
Users should be cautious when interpreting the model's predictions, considering potential biases and limitations. Fine-tuning on domain-specific data or applying post-processing techniques may help mitigate biases and improve performance.
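One simple post-processing technique is confidence thresholding: rather than forcing a label on ambiguous text, predictions below a probability threshold are flagged for review. The threshold value and the "Uncertain" fallback below are illustrative assumptions, not part of the released model:

import torch

def predict_with_threshold(model, tokenizer, text, threshold=0.6):
    # Illustrative sketch: the 0.6 threshold is an assumption, not a tuned value.
    encoded = tokenizer(text, return_tensors='pt')
    with torch.no_grad():
        probs = torch.softmax(model(**encoded).logits, dim=-1).squeeze()
    confidence, label_id = probs.max(dim=-1)
    if confidence.item() < threshold:
        return "Uncertain", confidence.item()  # flag ambiguous text for review
    labels = ['Negative', 'Neutral', 'Positive']
    return labels[label_id.item()], confidence.item()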
How to Get Started with the Model
# Example code for using the sentiment analysis model
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# 1. Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("sbcBI/sentiment_analysis_model")
model = AutoModelForSequenceClassification.from_pretrained("sbcBI/sentiment_analysis_model")
model.eval()  # switch to inference mode

# 2. Tokenize input text
text = "Sample text for sentiment analysis"
encoded_input = tokenizer(text, return_tensors='pt')

# 3. Perform inference (no gradients needed at prediction time)
with torch.no_grad():
    output = model(**encoded_input)
predicted_label = output.logits.argmax(dim=-1).item()

# 4. Interpret prediction
sentiment_labels = ['Negative', 'Neutral', 'Positive']
print("Predicted Sentiment:", sentiment_labels[predicted_label])
Training Details
Training Data
The training data consists of posts related to primates, annotated with sentiment labels.
Training Procedure
Preprocessing
Text data underwent preprocessing steps including lowercase conversion, punctuation removal, tokenization, stopword removal, and stemming.
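A minimal sketch of this pipeline using NLTK (listed under Dependencies); the exact preprocessing code used for training is not published, so the details below are assumptions:

import string
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

nltk.download('punkt')      # tokenizer models
nltk.download('stopwords')  # stopword lists

def preprocess(text):
    # Reconstruction of the steps described above, in the order listed.
    text = text.lower()                                               # lowercase conversion
    text = text.translate(str.maketrans('', '', string.punctuation))  # punctuation removal
    tokens = word_tokenize(text)                                      # tokenization
    stop_words = set(stopwords.words('english'))
    tokens = [t for t in tokens if t not in stop_words]               # stopword removal
    stemmer = PorterStemmer()
    return [stemmer.stem(t) for t in tokens]                          # stemming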
Training Hyperparameters
- Training regime: Fine-tuning of a transformer-based pre-trained model
- Optimizer: Adam
- Learning rate: 5e-5
- Batch size: 8
- Epochs: 10
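A hypothetical reconstruction of this configuration with the Transformers Trainer API (the actual training script is not published, and Trainer's default AdamW optimizer only approximates the Adam optimizer listed above):

from transformers import Trainer, TrainingArguments

def build_trainer(model, train_dataset, eval_dataset):
    # Hyperparameter values mirror the list above; output_dir is an assumption.
    training_args = TrainingArguments(
        output_dir="./results",
        learning_rate=5e-5,
        per_device_train_batch_size=8,
        num_train_epochs=10,
    )
    return Trainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,   # annotated primate posts (training split)
        eval_dataset=eval_dataset,     # holdout split
    )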
Evaluation
Testing Data, Factors & Metrics
- Testing Data: Holdout test set
- Metrics: Accuracy, Precision, Recall, F1-score
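Scores of this kind can be computed from holdout-set predictions along the following lines; the averaging scheme for precision, recall, and F1 is not stated in this card, so 'macro' below is an assumption:

from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def evaluate(y_true, y_pred):
    # y_true / y_pred are gold and predicted label ids for the holdout set.
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average='macro'  # averaging scheme is an assumption
    )
    return {
        'accuracy': accuracy_score(y_true, y_pred),
        'precision': precision,
        'recall': recall,
        'f1': f1,
    }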
Results
- Accuracy: 0.79
- Precision: 0.74
- Recall: 0.77
- F1-score: 0.75
Environmental Impact
Carbon emissions were not directly measured for model training. However, users should consider the environmental impact of training and deploying machine learning models, especially on large-scale infrastructure.
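For future runs, emissions can be estimated with a tool such as codecarbon (not a dependency of this project; shown purely as an illustration):

from codecarbon import EmissionsTracker

tracker = EmissionsTracker()
tracker.start()
# ... training or inference workload goes here ...
emissions_kg = tracker.stop()  # estimated kg of CO2-equivalent
print(f"Estimated emissions: {emissions_kg:.4f} kg CO2eq")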
Technical Specifications
Model Architecture and Objective
The model uses a transformer architecture with a sequence-classification head, suited to tasks such as sentiment analysis.
Compute Infrastructure
Software
- Framework: PyTorch
- Dependencies: Transformers, NLTK