# Attention-based Sentiment Classifier
This repository contains an attention-based sentiment classification model that demonstrates how attention mechanisms can enhance interpretability in NLP tasks.
## Model Overview
This model uses a bidirectional GRU with an attention mechanism to classify text sentiment (positive/negative). The attention mechanism lets the model focus on the most relevant parts of the input text, providing insight into which words most influence the classification.
## Key Features
- Bidirectional GRU architecture
- Additive attention mechanism for interpretability (see the formula after this list)
- Binary sentiment classification (positive/negative)
- Visualization tools for attention weights
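
For reference, additive (Bahdanau-style) attention over the BiGRU hidden states $h_1, \dots, h_T$ is usually written as follows; this is the standard formulation, and the repository's exact parameterization may differ slightly:

$$
e_t = v^\top \tanh(W h_t + b), \qquad
\alpha_t = \frac{\exp(e_t)}{\sum_{k=1}^{T} \exp(e_k)}, \qquad
c = \sum_{t=1}^{T} \alpha_t h_t
$$

The weights $\alpha_t$ are what the visualization tools display, and the context vector $c$ is fed to the classifier head.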
## Quick Start
```python
from transformers import pipeline
import matplotlib.pyplot as plt
import seaborn as sns

# Load the model directly from Hugging Face
classifier = pipeline(
    "text-classification",
    model="ericwei/attention-sentiment-classifier"
)

# Standard prediction
result = classifier("I absolutely loved this movie! The acting was superb.")
print(f"Sentiment: {result[0]['label']}, Score: {result[0]['score']:.4f}")

# For attention visualization, use the model directly
from transformers import AutoTokenizer, AutoModel
import torch

# If the repository ships custom modeling code, from_pretrained may also
# need trust_remote_code=True
tokenizer = AutoTokenizer.from_pretrained("ericwei/attention-sentiment-classifier")
model = AutoModel.from_pretrained("ericwei/attention-sentiment-classifier")

text = "I absolutely loved this movie! The acting was superb."
inputs = tokenizer(text, return_tensors="pt")

# Get prediction with attention weights
model.eval()
with torch.no_grad():
    outputs = model(inputs["input_ids"], return_attention=True, return_dict=True)

# Read off the prediction and the attention weights
logits = outputs["logits"]
attention_weights = outputs["attention_weights"]

# Visualize attention over the input tokens
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
plt.figure(figsize=(10, 2))
sns.heatmap(
    attention_weights.squeeze(0).cpu().numpy().reshape(1, -1),
    cmap="YlOrRd",
    annot=True,
    fmt=".2f",
    cbar=False,
    xticklabels=tokens,
    yticklabels=["Attention"]
)
plt.xticks(rotation=45, ha="right", rotation_mode="anchor")
plt.title("Attention Weights Visualization")
plt.tight_layout()
plt.show()
```
## Demo App
This model includes a Streamlit demo app that can be launched directly on Hugging Face Spaces.
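
The demo's source is not reproduced here; as an illustration, a minimal Streamlit app wrapping the pipeline could look like the sketch below (the file name `app.py` and the layout are assumptions, not the actual Space code):

```python
# app.py -- hypothetical minimal demo, not the actual Space code
import streamlit as st
from transformers import pipeline

@st.cache_resource  # load the model once and reuse it across reruns
def load_classifier():
    return pipeline(
        "text-classification",
        model="ericwei/attention-sentiment-classifier"
    )

st.title("Attention-based Sentiment Classifier")
text = st.text_area("Enter a movie review:", "I absolutely loved this movie!")

if st.button("Classify"):
    result = load_classifier()(text)[0]
    st.write(f"**Sentiment:** {result['label']} (score {result['score']:.4f})")
```

Run it locally with `streamlit run app.py`.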
## Model Architecture
The model consists of four components (sketched in code after this list):
- Embedding Layer: Converts token IDs to dense vectors
- Bidirectional GRU: Processes the text in both directions
- Attention Mechanism: Focuses on the most relevant parts of the text
- Classifier Head: Makes the final sentiment prediction
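
A minimal PyTorch sketch of this stack follows. The class and attribute names are illustrative rather than the repository's actual module, and the dimensions follow the hyperparameters listed under Training:

```python
import torch
import torch.nn as nn

class AttentionSentimentClassifier(nn.Module):
    """Illustrative re-implementation of the architecture described above."""

    def __init__(self, vocab_size, embed_dim=100, hidden_dim=256, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)    # token IDs -> dense vectors
        self.gru = nn.GRU(embed_dim, hidden_dim,
                          batch_first=True, bidirectional=True) # reads the text in both directions
        self.attn_proj = nn.Linear(2 * hidden_dim, hidden_dim)  # additive attention: W h_t + b
        self.attn_score = nn.Linear(hidden_dim, 1, bias=False)  # ...then e_t = v^T tanh(.)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)  # final sentiment prediction

    def forward(self, input_ids, return_attention=False):
        h, _ = self.gru(self.embedding(input_ids))                # (batch, seq, 2*hidden)
        scores = self.attn_score(torch.tanh(self.attn_proj(h)))  # (batch, seq, 1)
        weights = torch.softmax(scores, dim=1)                    # alpha_t over the sequence
        context = (weights * h).sum(dim=1)                        # weighted sum of hidden states
        logits = self.classifier(context)
        if return_attention:
            return {"logits": logits, "attention_weights": weights.squeeze(-1)}
        return {"logits": logits}
```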
## Training
The model was trained on SST-2 (the binary Stanford Sentiment Treebank) with the following hyperparameters (a schematic training loop follows the list):
- Learning rate: 1e-3
- Epochs: 12
- Optimizer: Adam
- Loss function: Cross Entropy Loss
- Embedding dimension: 100
- Hidden dimension: 256
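
The training script itself is not included in this card; a schematic loop that wires these hyperparameters together (using the illustrative model sketch above and an assumed `train_loader` yielding tokenized SST-2 batches) would look roughly like:

```python
import torch
import torch.nn as nn

# `model` is the sketch from Model Architecture; `train_loader` is assumed to
# yield (input_ids, labels) batches from tokenized SST-2 -- both hypothetical.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # Adam, learning rate 1e-3
criterion = nn.CrossEntropyLoss()                          # cross-entropy loss

for epoch in range(12):                                    # 12 epochs
    model.train()
    for input_ids, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(input_ids)["logits"], labels)
        loss.backward()
        optimizer.step()
```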
## Limitations
- Trained only on movie reviews, so it may not generalize to other domains
- Limited to English text; not suitable for multilingual content
- Binary classification only (positive/negative)
- Performance may degrade on texts that differ significantly from movie reviews
## Citation
If you use this model, please cite:
```bibtex
@misc{attention-sentiment-classifier,
  author       = {Lantian Wei},
  title        = {Attention-based Sentiment Classifier},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/ericwei/attention-sentiment-classifier}}
}
```
## License
This model is licensed under the GNU General Public License v3.0.