🧠 BERT Emotion Classifier

This is bert-base-uncased, fine-tuned on the dair-ai/emotion dataset for multi-class emotion detection in English text.

Model Description

This model classifies English text into one of 6 emotional categories:

Label Emotion Description
0 😒 sadness Grief, melancholy, sorrow
1 😄 joy Happiness, excitement, delight
2 ❤️ love Affection, warmth, attachment
3 😠 anger Frustration, rage, indignation
4 😨 fear Anxiety, dread, apprehension
5 😲 surprise Astonishment, unexpectedness

Performance

Metric Score
Test Accuracy ~93%
Macro F1 ~92%
Weighted F1 ~93%
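
The gap between macro and weighted F1 comes from how each one averages the per-class scores: macro F1 counts every class equally, while weighted F1 weights each class by its number of samples. The following is a minimal sketch in plain Python. The helper name `f1_report` is illustrative, and the labels are toy data, not this model's predictions.

```python
from collections import Counter

def f1_report(y_true, y_pred, num_classes):
    """Return (accuracy, macro_f1, weighted_f1) for integer class labels."""
    support = Counter(y_true)  # number of true samples per class
    per_class_f1 = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        per_class_f1.append(2 * tp / denom if denom else 0.0)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    macro = sum(per_class_f1) / num_classes
    weighted = sum(per_class_f1[c] * support[c] for c in range(num_classes)) / len(y_true)
    return accuracy, macro, weighted

# Toy, imbalanced example (NOT real model output): class 0 is common, class 1 is rare.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
acc, macro_f1, weighted_f1 = f1_report(y_true, y_pred, num_classes=2)
print(f"accuracy={acc:.3f} macro_f1={macro_f1:.3f} weighted_f1={weighted_f1:.3f}")
```

In this toy case the rare class has the lower F1, so macro F1 falls below weighted F1, the same direction as the scores reported above.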

Usage

from transformers import BertTokenizerFast, BertForSequenceClassification
import torch
import torch.nn.functional as F

# Load model
tokenizer = BertTokenizerFast.from_pretrained("punithreddy-ai/bert-emotion-classifier")
model     = BertForSequenceClassification.from_pretrained("punithreddy-ai/bert-emotion-classifier")
model.eval()

LABELS = ["sadness", "joy", "love", "anger", "fear", "surprise"]
EMOJIS = ["😒", "😄", "❤️", "😠", "😨", "😲"]

def predict(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = F.softmax(logits, dim=-1)[0]
    idx   = probs.argmax().item()
    return {
        "label":      LABELS[idx],
        "emoji":      EMOJIS[idx],
        "confidence": f"{probs[idx].item():.1%}",
        "all_scores": {LABELS[i]: f"{probs[i].item():.1%}" for i in range(6)}
    }

# Examples
print(predict("I just got promoted - I'm over the moon!"))
# {'label': 'joy', 'emoji': '😄', 'confidence': '96.3%', ...}

print(predict("I can't stop crying. I miss him so much."))
# {'label': 'sadness', 'emoji': '😒', 'confidence': '94.1%', ...}

print(predict("How DARE they treat people like that!"))
# {'label': 'anger', 'emoji': '😠', 'confidence': '91.7%', ...}

Using the pipeline API

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="punithreddy-ai/bert-emotion-classifier",
    top_k=None,  # return scores for every label (replaces the deprecated return_all_scores=True)
)

results = classifier("I can't believe we actually won!")
# [{'label': 'joy', 'score': 0.89}, {'label': 'surprise', 'score': 0.07}, ...]
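
The pipeline result is a list of label/score dictionaries, which is often easier to work with as a single mapping. The sketch below shows one way to do that. The helper `summarize` is hypothetical, and `sample` is hand-written in the shape shown in the comment above, with the remaining labels left out, so it is not real model output. Depending on the transformers version and arguments, a single input may come back wrapped in an outer list, so the helper unwraps that case.

```python
def summarize(results):
    """Turn pipeline-style output into (top_label, {label: score})."""
    # Some configurations return one inner list per input text; unwrap it.
    if results and isinstance(results[0], list):
        results = results[0]
    scores = {item["label"]: item["score"] for item in results}
    top_label = max(scores, key=scores.get)
    return top_label, scores

# Hand-written sample mirroring the comment above (remaining labels omitted).
sample = [
    {"label": "joy", "score": 0.89},
    {"label": "surprise", "score": 0.07},
]
top, scores = summarize(sample)
print(top, scores)
```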

Training Details

Hyperparameter Value
Base model bert-base-uncased
Max sequence length 128 tokens
Batch size 32
Learning rate 2e-5
LR schedule Linear warmup (10%) + linear decay
Epochs 4
Weight decay 0.01
Dropout 0.1
Optimizer AdamW (ε=1e-8)
Gradient clipping 1.0
Seed 42
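
To make the schedule row concrete: with the 16,000 training samples listed under Training Data, a batch size of 32, and 4 epochs, training takes 2,000 optimizer steps, so a 10% warmup covers the first 200. The sketch below assumes a single device and no gradient accumulation, neither of which is stated here. `lr_at_step` is an illustrative helper, not the training code used for this model.

```python
import math

TRAIN_SAMPLES = 16_000   # train split size from the Training Data table
BATCH_SIZE = 32
EPOCHS = 4
PEAK_LR = 2e-5
WARMUP_FRACTION = 0.10

steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)   # 500
total_steps = steps_per_epoch * EPOCHS                     # 2000
warmup_steps = round(WARMUP_FRACTION * total_steps)        # 200

def lr_at_step(step):
    """Linear warmup from 0 to PEAK_LR, then linear decay back to 0."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / (total_steps - warmup_steps)

for step in (0, 100, 200, 1100, 2000):
    print(step, f"{lr_at_step(step):.2e}")
```

The learning rate peaks at step 200 and reaches zero at the final step. Halfway through the decay phase (step 1,100) it is back to half the peak value.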

Training Data

dair-ai/emotion: English Twitter messages labelled with one of six emotions.

Split Samples
Train 16,000
Validation 2,000
Test 2,000

Architecture

Input Text
    ↓
[BertTokenizerFast] → input_ids + attention_mask
    ↓
[bert-base-uncased]
  • 12 transformer encoder layers
  • 12 attention heads
  • 768 hidden dimensions
  • 110M total parameters
    ↓
[CLS] token → [Dropout 0.1] → [Linear 768→6] → [Softmax]
    ↓
Probability distribution over 6 emotions
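
The last stage of the diagram, a linear layer followed by softmax, is small enough to sketch in plain Python. The weights below are toy values on a 3-dimensional input, chosen for readability; the real head maps 768 dimensions to 6 classes. Two details the diagram simplifies: in Hugging Face's BertForSequenceClassification, the [CLS] vector first passes through BERT's pooler (a dense layer with tanh) before dropout. Dropout itself is inactive at inference time, so the sketch omits it.

```python
import math

def linear(x, weight, bias):
    """y = W x + b, with weight given as one row per output class."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b
            for row, b in zip(weight, bias)]

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy stand-ins: a 3-dim "pooled" vector and a 3->2 classifier (real model: 768->6).
pooled = [1.0, -2.0, 0.5]
weight = [[0.2, 0.0, 1.0],
          [-0.5, 0.3, 0.0]]
bias = [0.1, 0.0]

logits = linear(pooled, weight, bias)
probs = softmax(logits)
predicted_class = probs.index(max(probs))

# Parameter count of the real 768->6 head: weights plus biases.
head_params = 768 * 6 + 6   # 4614
print(logits, probs, predicted_class, head_params)
```

The classification head adds only 4,614 parameters on top of the encoder, so nearly all of the model's capacity sits in bert-base-uncased itself.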

Limitations

  • Trained on English Twitter data, so it may underperform on formal text, other dialects, or non-English input
  • Short texts (< 5 words) may produce less confident predictions
  • Ambiguous emotions (e.g. bittersweet) may be misclassified, because the model assigns a single label
  • Class imbalance in the dataset (joy and sadness dominate) may affect minority class performance
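
One way to soften the single-label limitation is to flag predictions where the top two classes are close, rather than trusting the argmax alone. The sketch below is an illustration and not part of this model. The helper name `flag_ambiguous` and the 0.20 margin are arbitrary assumptions that would need tuning on validation data, and the score dictionaries are invented. The helper expects float probabilities, whereas `predict` above formats its scores as percentage strings.

```python
def flag_ambiguous(scores, margin=0.20):
    """Return (top_label, runner_up, is_ambiguous) from a {label: probability} dict."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (top_label, top_p), (second_label, second_p) = ranked[0], ranked[1]
    # Ambiguous when the winner beats the runner-up by less than the margin.
    return top_label, second_label, (top_p - second_p) < margin

# Invented probability distributions for illustration (not real model output).
clear = {"sadness": 0.02, "joy": 0.90, "love": 0.04,
         "anger": 0.01, "fear": 0.01, "surprise": 0.02}
mixed = {"sadness": 0.41, "joy": 0.38, "love": 0.15,
         "anger": 0.02, "fear": 0.02, "surprise": 0.02}

print(flag_ambiguous(clear))  # ('joy', 'love', False)
print(flag_ambiguous(mixed))  # ('sadness', 'joy', True)
```

A flagged prediction could be shown with both candidate labels or routed for manual review, depending on the application.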

Citation

If you use this model, please cite the original dataset:

@inproceedings{saravia-etal-2018-carer,
    title = "{CARER}: Contextualized Affect Representations for Emotion Recognition",
    author = "Saravia, Elvis and Liu, Hsien-Chi Toby and Huang, Yi-Hsin and Wu, Jing and Chen, Yi-Shin",
    booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
    year = "2018",
    publisher = "Association for Computational Linguistics",
}

Author

Built by Punithreddy as a portfolio project demonstrating BERT fine-tuning for NLP classification.
