Model Card for Fine-tuned DistilBERT-SST2 with Yelp Polarity

Model Description

This model is a fine-tuned version of distilbert-base-uncased, a distilled version of BERT optimized for efficiency. It was first fine-tuned on the Stanford Sentiment Treebank (SST-2) dataset and then further fine-tuned on the Yelp Polarity dataset to improve sentiment classification performance. The model classifies English text into two categories: positive and negative sentiment.

DistilBERT-SST2-Yelp is lightweight, fast, and ideal for sentiment analysis tasks on short texts such as customer reviews, product feedback, and social media posts.


Intended Uses & Limitations

Intended Uses:

  • Sentiment analysis on short English texts, including:
    • Reviews (e.g., product, restaurant, or movie reviews)
    • Comments
    • Tweets or other social media posts
  • Applications requiring efficient, low-latency inference for real-time analysis.
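
For quick experiments along these lines, the Transformers pipeline API wraps tokenization and inference in one call. A minimal sketch, assuming the checkpoint is available under the repository name used elsewhere in this card; the exact label strings returned depend on the checkpoint's id2label config:

from transformers import pipeline

# Build a sentiment-analysis pipeline from the fine-tuned checkpoint
classifier = pipeline("sentiment-analysis", model="AirrStorm/DistilBERT-SST2-Yelp")

# Batching several short texts in one call amortizes per-text overhead
reviews = [
    "The food was amazing and the staff were friendly.",
    "Terrible service, I will not be coming back.",
]
print(classifier(reviews))  # list of {'label': ..., 'score': ...} dicts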

Limitations:

  • Domain Specificity: Fine-tuned on SST-2 and Yelp Polarity, so it may not generalize well to highly specific or niche domains.
  • Context Length: Optimized for short texts; inputs beyond DistilBERT's 512-token limit are truncated, so long-form inputs may be classified poorly (see the truncation sketch after this list).
  • Language Support: Works only for English text.
  • Biases: May inherit biases present in the datasets, including biases related to language usage in sentiment analysis tasks.
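
To guard against the context-length limitation above, long inputs can be truncated explicitly at tokenization time. A minimal sketch (the over-long review string is illustrative):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("AirrStorm/DistilBERT-SST2-Yelp")

# Anything beyond DistilBERT's 512-token maximum is cut off at tokenization time
long_review = "This place was great. " * 200  # deliberately over-long input
inputs = tokenizer(long_review, truncation=True, max_length=512, return_tensors="pt")
print(inputs["input_ids"].shape)  # torch.Size([1, 512])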

How to Use

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned model and tokenizer from the Hub
tokenizer = AutoTokenizer.from_pretrained("AirrStorm/DistilBERT-SST2-Yelp")
model = AutoModelForSequenceClassification.from_pretrained("AirrStorm/DistilBERT-SST2-Yelp")
model.eval()

# Example input
text = "This movie was fantastic!"

# Tokenize and predict (no gradients needed at inference time)
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(-1).item()

# Class mapping: 0 -> Negative, 1 -> Positive
print("Predicted Sentiment:", "Positive" if predicted_class == 1 else "Negative")

Limitations and Bias

  • The SST-2 and Yelp Polarity datasets may reflect cultural, contextual, or domain-specific biases in sentiment interpretation.
  • Over-reliance on specific patterns or keywords from the training data may lead to incorrect classifications, especially in nuanced or ambiguous cases.
  • The model is not suitable for multilingual sentiment analysis or for detecting sentiment in specialized fields (e.g., legal, medical).

Training Data

  • SST-2: The Stanford Sentiment Treebank (SST-2) dataset, containing movie reviews labeled as positive or negative.
  • Yelp Polarity: A dataset of customer reviews from Yelp, labeled as positive or negative.

The model is fine-tuned on both datasets in sequence to improve its performance across a variety of sentiment classification tasks.
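
Both corpora are available through the Hugging Face datasets library. A minimal loading sketch, using the dataset names as published on the Hub:

from datasets import load_dataset

# SST-2 ships as part of the GLUE benchmark; Yelp Polarity is a standalone dataset
sst2 = load_dataset("glue", "sst2")   # fields: sentence, label, idx
yelp = load_dataset("yelp_polarity")  # fields: text, label

print(sst2["train"][0])
print(yelp["train"][0])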

Training Procedure

  • Base Model: distilbert-base-uncased
  • Framework: Hugging Face Transformers
  • Optimizer: AdamW with weight decay
  • Learning Rate: 2e-5
  • Batch Size: 32 (effective, using gradient accumulation)
  • Epochs: 3
  • Evaluation Strategy: Per epoch
  • Hardware: NVIDIA RTX 4060 with CUDA support

Optimizations:

  • Mixed precision (fp16) for faster training and reduced memory usage.
  • Gradient accumulation for simulating larger batch sizes.
  • Learning rate warmup and weight decay for stable convergence.
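
A hedged sketch of how these settings map onto the Transformers Trainer API; the per-device batch size / accumulation split, weight decay value, and warmup step count are illustrative assumptions, since this card only states the effective batch size of 32:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./DistilBERT-SST2-Yelp",
    learning_rate=2e-5,                 # as listed above
    per_device_train_batch_size=8,      # 8 x 4 accumulation steps = effective 32 (assumed split)
    gradient_accumulation_steps=4,
    num_train_epochs=3,
    weight_decay=0.01,                  # AdamW weight decay (illustrative value)
    warmup_steps=500,                   # learning rate warmup (illustrative value)
    fp16=True,                          # mixed precision
    evaluation_strategy="epoch",        # "eval_strategy" in newer transformers releases
)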

Evaluation Results

Dataset   Split   Accuracy
SST-2     Train   98.5%
SST-2     Test    94.7%
Yelp      Train   93.5%
Yelp      Test    92.0%

Evaluation Metric: Accuracy, computed using the Hugging Face evaluate library.
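
The accuracy metric can be computed with the evaluate library as sketched below; the predictions and labels here are placeholders, which in practice come from model inference on the test split:

import evaluate

accuracy = evaluate.load("accuracy")

# Placeholder predictions and labels for illustration
predictions = [1, 0, 1, 1]
references = [1, 0, 0, 1]

print(accuracy.compute(predictions=predictions, references=references))
# {'accuracy': 0.75}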

Future Work

  • Fine-tune on more diverse datasets, including domain-specific datasets for enhanced performance in other areas.
  • Extend support to multilingual sentiment analysis.
  • Improve efficiency for deployment through techniques such as pruning, quantization, or distillation.
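
As one concrete direction for the efficiency item above, PyTorch's post-training dynamic quantization can shrink the model's linear layers to int8 in a few lines. A hedged sketch, not part of this model's current release; accuracy after quantization should be re-validated:

import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("AirrStorm/DistilBERT-SST2-Yelp")

# Quantize the nn.Linear weights to int8; activations remain in floating point
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)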

License

The model is shared under the Apache 2.0 License.

Model Details

  • Format: Safetensors
  • Model size: 67M parameters
  • Tensor type: F32