Attention-based Sentiment Classifier

This repository contains an attention-based sentiment classification model that demonstrates how attention mechanisms can enhance interpretability in NLP tasks.

Attention Visualization Example

Model Overview

This model uses a bidirectional GRU with an attention mechanism to classify text sentiment (positive/negative). The attention mechanism lets the model focus on the most relevant parts of the input text, providing insight into which words most influence the classification.

Key Features

  • Bidirectional GRU architecture
  • Additive attention mechanism for interpretability
  • Binary sentiment classification (positive/negative)
  • Visualization tools for attention weights

Quick Start

from transformers import pipeline
import matplotlib.pyplot as plt
import seaborn as sns

# Load model directly from Hugging Face
classifier = pipeline(
    "text-classification",
    model="ericwei/attention-sentiment-classifier"
)

# Standard prediction
result = classifier("I absolutely loved this movie! The acting was superb.")
print(f"Sentiment: {result[0]['label']}, Score: {result[0]['score']:.4f}")

# For attention visualization, use the model directly
from transformers import AutoTokenizer, AutoModel
import torch

tokenizer = AutoTokenizer.from_pretrained("ericwei/attention-sentiment-classifier")
model = AutoModel.from_pretrained("weicwei/attention-sentiment-classifier")

text = "I absolutely loved this movie! The acting was superb."
inputs = tokenizer(text, return_tensors="pt")

# Get prediction with attention weights
model.eval()
with torch.no_grad():
    outputs = model(inputs["input_ids"], return_attention=True, return_dict=True)

# Get prediction results
logits = outputs["logits"]
attention_weights = outputs["attention_weights"]
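
# Turn the logits into a human-readable prediction (the label order here,
# 0 = negative / 1 = positive, is an assumption)
probs = torch.softmax(logits, dim=-1)
pred = probs.argmax(dim=-1).item()
print(f"Predicted: {'positive' if pred == 1 else 'negative'} (p = {probs[0, pred].item():.4f})")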

# Visualize attention
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())

plt.figure(figsize=(10, 2))
sns.heatmap(
    attention_weights.squeeze(0).cpu().numpy().reshape(1, -1),
    cmap="YlOrRd",
    annot=True,
    fmt=".2f",
    cbar=False,
    xticklabels=tokens,
    yticklabels=["Attention"]
)
plt.xticks(rotation=45, ha="right", rotation_mode="anchor")
plt.title("Attention Weights Visualization")
plt.tight_layout()
plt.show()

Demo App

This model includes a Streamlit demo app that can be launched directly on Hugging Face Spaces.
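
For reference, a minimal sketch of such a Streamlit app is shown below. This is a hypothetical example, not the exact app shipped with this repository; it reuses the pipeline call from the Quick Start.

import streamlit as st
from transformers import pipeline

@st.cache_resource
def load_classifier():
    # Cache the pipeline so it loads only once per session
    return pipeline("text-classification", model="ericwei/attention-sentiment-classifier")

st.title("Attention-based Sentiment Classifier")
text = st.text_area("Enter a movie review:", "I absolutely loved this movie!")

if st.button("Classify"):
    result = load_classifier()(text)[0]
    st.write(f"Sentiment: {result['label']} (score: {result['score']:.4f})")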

Model Architecture

The model consists of the following components (a minimal PyTorch sketch follows the list):

  1. Embedding Layer: Converts token IDs to dense vectors
  2. Bidirectional GRU: Processes the text in both directions
  3. Attention Mechanism: Focuses on the most relevant parts of the text
  4. Classifier Head: Makes the final sentiment prediction
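
As a rough illustration, here is how these four components might fit together in PyTorch. This is a hypothetical reimplementation, not the code behind the released checkpoint; the dimensions match the training hyperparameters below, and the return_attention flag mirrors its use in the Quick Start.

import torch
import torch.nn as nn

class AttentionSentimentClassifier(nn.Module):
    """Hypothetical reimplementation for illustration only."""

    def __init__(self, vocab_size, embed_dim=100, hidden_dim=256, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)      # 1. Embedding layer
        self.gru = nn.GRU(embed_dim, hidden_dim,
                          batch_first=True, bidirectional=True)   # 2. Bidirectional GRU
        self.attn_proj = nn.Linear(2 * hidden_dim, hidden_dim)    # 3a. W in score = v^T tanh(W h)
        self.attn_score = nn.Linear(hidden_dim, 1, bias=False)    # 3b. v
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)  # 4. Classifier head

    def forward(self, input_ids, return_attention=False):
        embedded = self.embedding(input_ids)    # (batch, seq, embed_dim)
        hidden, _ = self.gru(embedded)          # (batch, seq, 2 * hidden_dim)
        scores = self.attn_score(torch.tanh(self.attn_proj(hidden)))  # (batch, seq, 1)
        weights = torch.softmax(scores, dim=1)  # normalize over sequence positions
        context = (weights * hidden).sum(dim=1) # attention-weighted summary vector
        logits = self.classifier(context)
        if return_attention:
            return logits, weights.squeeze(-1)
        return logits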

Training

The model was trained on the SST-2 (Stanford Sentiment Treebank) dataset with the following hyperparameters (a sketch of the corresponding training loop follows the list):

  • Learning rate: 1e-3
  • Epochs: 12
  • Optimizer: Adam
  • Loss function: Cross Entropy Loss
  • Embedding dimension: 100
  • Hidden dimension: 256
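
A training loop matching these hyperparameters could look like the sketch below; model and train_loader (yielding batches with input_ids and label tensors) are assumed to exist.

import torch
import torch.nn as nn

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # Adam, lr = 1e-3
criterion = nn.CrossEntropyLoss()                          # cross-entropy loss

model.train()
for epoch in range(12):                                    # 12 epochs
    total_loss = 0.0
    for batch in train_loader:
        optimizer.zero_grad()
        logits = model(batch["input_ids"])
        loss = criterion(logits, batch["label"])
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    print(f"epoch {epoch + 1}: mean loss = {total_loss / len(train_loader):.4f}")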

Limitations

  • Trained only on movie reviews; it may not generalize well to other domains
  • Limited to English text; not suitable for multilingual content
  • Binary classification only (positive/negative)
  • Performance may degrade on texts that differ significantly from movie reviews

Citation

If you use this model, please cite:

@misc{attention-sentiment-classifier,
  author = {Lantian Wei},
  title = {Attention-based Sentiment Classifier},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/ericwei/attention-sentiment-classifier}}
}

License

This model is licensed under the GNU General Public License v3.0.
