# Attention-based Sentiment Classifier
This repository contains an attention-based sentiment classification model that demonstrates how attention mechanisms can enhance interpretability in NLP tasks.

## Model Overview
This model uses a bidirectional GRU with an attention mechanism to classify text sentiment (positive/negative). The attention mechanism allows the model to focus on the most relevant parts of the input text, providing insight into which words influence the classification the most.
### Key Features
- Bidirectional GRU architecture
- Additive attention mechanism for interpretability
- Binary sentiment classification (positive/negative)
- Visualization tools for attention weights
## Quick Start
```python
from transformers import pipeline
import matplotlib.pyplot as plt
import seaborn as sns

# Load the model directly from Hugging Face
classifier = pipeline(
    "text-classification",
    model="ericwei/attention-sentiment-classifier"
)

# Standard prediction
result = classifier("I absolutely loved this movie! The acting was superb.")
print(f"Sentiment: {result[0]['label']}, Score: {result[0]['score']:.4f}")

# For attention visualization, use the model directly
from transformers import AutoTokenizer, AutoModel
import torch

tokenizer = AutoTokenizer.from_pretrained("ericwei/attention-sentiment-classifier")
# Note: if the custom GRU+attention architecture ships as code in the repo,
# you may also need to pass trust_remote_code=True here.
model = AutoModel.from_pretrained("ericwei/attention-sentiment-classifier")

text = "I absolutely loved this movie! The acting was superb."
inputs = tokenizer(text, return_tensors="pt")

# Get prediction with attention weights
model.eval()
with torch.no_grad():
    outputs = model(inputs["input_ids"], return_attention=True, return_dict=True)

# Get prediction results
logits = outputs["logits"]
attention_weights = outputs["attention_weights"]

# Visualize attention as a one-row heatmap over the input tokens
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
plt.figure(figsize=(10, 2))
sns.heatmap(
    attention_weights.squeeze(0).cpu().numpy().reshape(1, -1),
    cmap="YlOrRd",
    annot=True,
    fmt=".2f",
    cbar=False,
    xticklabels=tokens,
    yticklabels=["Attention"]
)
plt.xticks(rotation=45, ha="right", rotation_mode="anchor")
plt.title("Attention Weights Visualization")
plt.tight_layout()
plt.show()
```
## Demo App
This model includes a Streamlit demo app that can be launched directly on Hugging Face Spaces.
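For reference, a minimal Streamlit app along these lines might look like the sketch below. This is an illustration only, not the app shipped with the Space; the repo id matches the Quick Start above, and the UI layout is an assumption.
```python
# Hypothetical demo_app.py sketch; the actual Space app may differ.
import streamlit as st
from transformers import pipeline

st.title("Attention-based Sentiment Classifier")

# Cache the pipeline so the model loads only once per session
@st.cache_resource
def load_classifier():
    return pipeline(
        "text-classification",
        model="ericwei/attention-sentiment-classifier"
    )

classifier = load_classifier()

text = st.text_area("Enter a movie review:", "I absolutely loved this movie!")
if st.button("Classify"):
    result = classifier(text)[0]
    st.write(f"**Sentiment:** {result['label']}, **Score:** {result['score']:.4f}")
```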
## Model Architecture
The model consists of:
1. **Embedding Layer**: Converts token IDs to dense vectors
2. **Bidirectional GRU**: Processes the text in both directions
3. **Attention Mechanism**: Focuses on the most relevant parts of the text
4. **Classifier Head**: Makes the final sentiment prediction
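For concreteness, here is a minimal PyTorch sketch of such an architecture, assuming additive (Bahdanau-style) attention over the bidirectional GRU outputs. The class and layer names are illustrative, not the published implementation; the default dimensions follow the hyperparameters listed under Training.
```python
import torch
import torch.nn as nn

class AttentionSentimentClassifier(nn.Module):
    # Illustrative sketch, not the published implementation.
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=256, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)  # 1. token IDs -> dense vectors
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True,
                          bidirectional=True)                 # 2. bidirectional GRU
        # 3. additive attention: score_t = v^T tanh(W h_t)
        self.attn_proj = nn.Linear(2 * hidden_dim, hidden_dim)
        self.attn_v = nn.Linear(hidden_dim, 1, bias=False)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)  # 4. classifier head

    def forward(self, input_ids, return_attention=False):
        embedded = self.embedding(input_ids)                 # (batch, seq, embed_dim)
        outputs, _ = self.gru(embedded)                      # (batch, seq, 2*hidden_dim)
        scores = self.attn_v(torch.tanh(self.attn_proj(outputs)))  # (batch, seq, 1)
        weights = torch.softmax(scores, dim=1)               # normalize over time steps
        context = (weights * outputs).sum(dim=1)             # weighted sum of GRU states
        logits = self.classifier(context)
        if return_attention:
            return {"logits": logits, "attention_weights": weights.squeeze(-1)}
        return {"logits": logits}
```
Because the softmax normalizes the weights to sum to 1 over the sequence, the one-row heatmap in the Quick Start is directly interpretable as a distribution over input tokens.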
## Training
The model was trained on the SST-2 (Stanford Sentiment Treebank) dataset using the following hyperparameters:
- Learning rate: 1e-3
- Epochs: 12
- Optimizer: Adam
- Loss function: Cross Entropy Loss
- Embedding dimension: 100
- Hidden dimension: 256
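A training loop consistent with these hyperparameters might look like the sketch below, reusing the illustrative `AttentionSentimentClassifier` class from the Architecture section. The toy stand-in data is only there so the loop runs end to end; in practice the batches come from tokenized SST-2 sentences (available as the `glue`/`sst2` config in the `datasets` library) and their binary labels.
```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-in data so the sketch is self-contained
input_ids = torch.randint(0, 5000, (256, 32))  # 256 "sentences" of 32 token IDs
labels = torch.randint(0, 2, (256,))           # binary sentiment labels
train_loader = DataLoader(TensorDataset(input_ids, labels), batch_size=32, shuffle=True)

model = AttentionSentimentClassifier(vocab_size=5000)      # sketch from above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # Adam, learning rate 1e-3
criterion = nn.CrossEntropyLoss()                          # cross-entropy loss

model.train()
for epoch in range(12):                                    # 12 epochs
    total_loss = 0.0
    for batch_ids, batch_labels in train_loader:
        optimizer.zero_grad()
        logits = model(batch_ids)["logits"]
        loss = criterion(logits, batch_labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    print(f"Epoch {epoch + 1}: mean loss = {total_loss / len(train_loader):.4f}")
```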
## Limitations
- Trained only on movie reviews (SST-2); may not generalize to other domains
- English only; not suitable for multilingual content
- Binary classification only (positive/negative), with no neutral class
- Performance may degrade on texts that differ significantly from movie reviews
## Citation
If you use this model, please cite:
```
@misc{attention-sentiment-classifier,
  author       = {Lantian Wei},
  title        = {Attention-based Sentiment Classifier},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/ericwei/attention-sentiment-classifier}}
}
```
## License
This model is licensed under the GNU General Public License v3.0.