SMS Spam Classifier — DistilBERT (Group 36, IIT Jodhpur)

A fine-tuned distilbert-base-uncased model for binary SMS spam classification. It achieves 99.35% accuracy and an F1 Macro of 0.9851 on the held-out test set. This is v2, the version with the lowest validation loss (0.0292) among those with reported results.

Developed as part of the MLOps course, PGD AI Program, IIT Jodhpur.


Model Details

Model Description

  • Base model: distilbert-base-uncased (66M parameters)
  • Task: Binary text classification — Ham (0) vs Spam (1)
  • Dataset: UCI SMS Spam Collection (5,159 samples after deduplication)
  • Architecture: DistilBERT encoder + linear classification head
  • Framework: PyTorch + Hugging Face Transformers
  • Training platform: Kaggle (NVIDIA T4 x2 GPU)
  • Developed by: MLOps Group 36, IIT Jodhpur
  • Model card authors: G25AIT2032 Duggirala Vnaga Ananth
  • Contact: g25ait2032@iitj.ac.in

Related Resources

Resource | Link
GitHub Repository | MLOps Group 36 Repository
Kaggle Notebook (Final) | mlops-group36-final-v3
W&B Dashboard | MLOPS_Group
HF Model v1 | nagaananth/MLOPS_group-v1
HF Model v2 (★ Best) | nagaananth/MLOPS_group-v2
HF Model v3 | nagaananth/MLOPS_group-v3
HF Model v4 | nagaananth/MLOPS_group-v4
Docker Image (GHCR) | ghcr.io/g25ait2032-prog/mlops_group-inference:latest
Docker Image (Hub) | dvnananth/mlops-group36:v1

How to Get Started

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="nagaananth/MLOPS_group-v2"
)

# Spam example
print(classifier("URGENT! You have won a free iPhone. Click here now."))
# [{'label': 'spam', 'score': 0.9804}]

# Ham example
print(classifier("Hey, are we still meeting for lunch at 12?"))
# [{'label': 'ham', 'score': 0.9982}]

Or with full control:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "nagaananth/MLOPS_group-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

def predict(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)[0]
    pred_idx = probs.argmax().item()
    return {
        "label": model.config.id2label[pred_idx],
        "score": round(probs[pred_idx].item(), 4)
    }

print(predict("Free prize! Click now to claim your reward."))
# {'label': 'spam', 'score': 0.9897}

Training Details

Dataset

UCI SMS Spam Collection, loaded via the Hugging Face datasets library (dataset ID: sms_spam).

Split | Samples | Ham % | Spam %
Train (70%) | 3,611 | ~87.5 | ~12.5
Validation (15%) | 774 | ~87.5 | ~12.5
Test (15%) | 774 | ~87.5 | ~12.5

Preprocessing steps:

  • Lowercased and whitespace normalised
  • 415 duplicate messages removed (total: 5,159 unique samples)
  • Stratified 70/15/15 split with zero-leakage verification
  • Tokenized with AutoTokenizer for DistilBERT (truncation=True, max_length=128)
  • Labels mapped: {"ham": 0, "spam": 1}
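The preprocessing steps above can be sketched in plain Python. This is a minimal illustration, not the authors' notebook code: the helper names (`normalise`, `deduplicate`, `stratified_split`, `has_leakage`), the seed, and the choice to deduplicate on the normalised text are all assumptions, and the toy rows stand in for the real dataset.

```python
import random
import re
from collections import defaultdict

LABEL2ID = {"ham": 0, "spam": 1}  # label mapping stated in the card


def normalise(text):
    """Lowercase and collapse runs of whitespace into single spaces."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def deduplicate(rows):
    """Keep the first occurrence of each normalised message (assumed rule)."""
    seen, unique = set(), []
    for text, label in rows:
        key = normalise(text)
        if key not in seen:
            seen.add(key)
            unique.append((key, LABEL2ID[label]))
    return unique


def stratified_split(rows, fractions=(0.70, 0.15, 0.15), seed=42):
    """Split each class separately so every split keeps the class ratio."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[1]].append(row)
    rng = random.Random(seed)  # hypothetical seed
    train, val, test = [], [], []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_train = round(len(items) * fractions[0])
        n_val = round(len(items) * fractions[1])
        train += items[:n_train]
        val += items[n_train:n_train + n_val]
        test += items[n_train + n_val:]
    return train, val, test


def has_leakage(*splits):
    """Return True if any message text appears in more than one split."""
    seen = set()
    for split in splits:
        texts = {text for text, _ in split}
        if seen & texts:
            return True
        seen |= texts
    return False


# Toy data: 20 spam, 80 ham, plus one near-duplicate that should be dropped.
rows = [("Free prize %d" % i, "spam") for i in range(20)]
rows += [("see you at %d" % i, "ham") for i in range(80)]
rows += [("FREE   prize 0", "spam")]

unique = deduplicate(rows)
train, val, test = stratified_split(unique)
print(len(unique), len(train), len(val), len(test), has_leakage(train, val, test))
```

Splitting per class before concatenating is what keeps the ham/spam ratio the same in every split, and the leakage check is only meaningful because duplicates are removed before the split.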

Hyperparameter Comparison (All Versions)

Version | LR | Epochs | Batch Size | Warmup | Weight Decay | Early Stopping | Val Loss | F1 Macro
v1 | 3e-5 | 3 | 16 | 100 | 0.01 | No | 0.0539 | 0.9849
v2 ★ | 2e-5 | 5 | 32 | 200 | 0.01 | Yes (patience=2) | 0.0292 | 0.9851
v3 | 2e-5 | 5 | 32 | 200 | 0.01 | Yes (patience=2) | 0.0376 | 0.9851
v4 | 1e-5 | 4 | 16 | 200 | 0.02 | Yes (patience=2) | n/a | n/a

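v2 was selected as the final deployment model because it has the lowest validation loss (0.0292) of the versions with reported results. Its F1 Macro (0.9851) matches v3 and is marginally above v1 (0.9849), so validation loss is the deciding criterion.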

Training Configuration (v2)

  • Optimizer: AdamW
  • Learning rate: 2e-5
  • Epochs: 5 (with early stopping, patience=2)
  • Batch size: 32 (train), 64 (eval)
  • Mixed precision: fp16
  • Metric for best model: F1 Weighted
  • Infrastructure: Kaggle NVIDIA T4 x2 GPU
  • Average training time: ~2 minutes per run
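To put the v2 warmup setting in context, the arithmetic below estimates how much of a full 5-epoch run is spent in warmup. It is a sketch under stated assumptions: the "Warmup" value of 200 from the hyperparameter table is taken to mean optimizer steps, no gradient accumulation is used, and early stopping does not cut the run short. The card does not say whether the batch size of 32 is per device or total, so both readings are shown.

```python
import math

# Values taken from the v2 configuration in this card.
TRAIN_SAMPLES = 3611
EPOCHS = 5
WARMUP_STEPS = 200  # assumed to be optimizer steps


def schedule_summary(effective_batch_size):
    """Return (steps_per_epoch, total_steps, warmup_fraction) for a full run."""
    steps_per_epoch = math.ceil(TRAIN_SAMPLES / effective_batch_size)
    total_steps = steps_per_epoch * EPOCHS
    return steps_per_epoch, total_steps, WARMUP_STEPS / total_steps


# Assumption A: 32 is the total batch size per optimizer step.
print(schedule_summary(32))

# Assumption B: 32 is per device and both T4 GPUs are used (32 * 2 = 64).
print(schedule_summary(64))
```

Under assumption A the run has 113 steps per epoch and 565 steps in total, so warmup covers roughly 35% of training; under assumption B it is 57 steps per epoch and 285 in total, so warmup covers roughly 70%.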

Evaluation Results

Test Set Performance (v2 — Best Model)

Metric | Score
Accuracy | 0.9935
F1 Weighted | 0.9935
F1 Macro | 0.9851
Precision | 0.9935
Recall | 0.9935
Validation Loss (validation split) | 0.0292
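The gap between F1 Weighted and F1 Macro follows from the class imbalance: macro averaging gives the small spam class the same weight as ham, so a few spam errors move it more. The sketch below illustrates this with hypothetical error counts on a 774-message test set with about 12.5% spam. The counts are chosen for illustration only and are not the model's actual confusion matrix.

```python
def f1(tp, fp, fn):
    """F1 from raw counts; algebraically equal to 2PR / (P + R)."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical counts, NOT the model's actual confusion matrix.
ham_support, spam_support = 677, 97  # 774 messages, about 12.5% spam
spam_missed = 4   # spam predicted as ham (illustrative)
ham_flagged = 1   # ham predicted as spam (illustrative)

f1_spam = f1(tp=spam_support - spam_missed, fp=ham_flagged, fn=spam_missed)
f1_ham = f1(tp=ham_support - ham_flagged, fp=spam_missed, fn=ham_flagged)

total = ham_support + spam_support
accuracy = (total - spam_missed - ham_flagged) / total
f1_macro = (f1_ham + f1_spam) / 2
f1_weighted = (ham_support * f1_ham + spam_support * f1_spam) / total

print(f"accuracy={accuracy:.4f} f1_macro={f1_macro:.4f} f1_weighted={f1_weighted:.4f}")
```

With only five errors in total, the per-class F1 for spam drops noticeably while ham stays near 1.0, which is why the macro average sits below the weighted one.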

Adversarial Test Cases

The model was evaluated on 15 adversarial/edge-case SMS messages covering spam, ham, and ambiguous phrasing (e.g., messages mixing casual language with spam triggers). Representative examples:

Text | True | Predicted | Confidence
"URGENT! You have won a 1-week cruise! Call now." | spam | spam | 0.9987
"You won! Click here to claim your prize." | spam | spam | 0.9945
"Hey, are we still meeting for lunch at 12?" | ham | ham | 0.9991
"Can you send me the report by EOD?" | ham | ham | 0.9988
"Meeting for lunch? I won a contest, let's talk." | ham | ham | 0.9756

Inference Latency (CPU)

  • Mean latency: ~30–60 ms per sample
  • Suitable for CPU-only deployment
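A latency figure like the one above can be reproduced with a small timing harness. The sketch below is one possible way to measure it, not the procedure the authors used: `measure_latency`, the warm-up count, and the repeat count are assumptions, and `dummy_predict` is a stand-in so the example runs without downloading the model. In practice the `predict` function from the getting-started section would be passed instead.

```python
import statistics
import time


def measure_latency(predict_fn, texts, warmup=3, repeats=5):
    """Return mean and 95th-percentile per-sample latency in milliseconds."""
    for text in texts[:warmup]:
        predict_fn(text)  # warm-up calls are not timed
    timings_ms = []
    for _ in range(repeats):
        for text in texts:
            start = time.perf_counter()
            predict_fn(text)
            timings_ms.append((time.perf_counter() - start) * 1000.0)
    timings_ms.sort()
    p95_index = min(len(timings_ms) - 1, int(0.95 * len(timings_ms)))
    return {
        "mean_ms": statistics.mean(timings_ms),
        "p95_ms": timings_ms[p95_index],
        "n": len(timings_ms),
    }


def dummy_predict(text):
    """Hypothetical stand-in for the real model call."""
    return {"label": "ham", "score": 1.0}


stats = measure_latency(dummy_predict, ["hello there", "free prize now"])
print(stats)
```

Reporting a tail percentile alongside the mean helps when deciding whether CPU-only serving meets a latency budget, since occasional slow calls are hidden by the mean.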

Uses

Direct Use

Binary classification of SMS or short-text messages into ham (legitimate) or spam (unsolicited/phishing). Can be directly integrated into messaging applications or notification pipelines.

Downstream Use

Can serve as a component in broader security pipelines for filtering suspicious incoming messages, or as a baseline for transfer learning to other spam-detection domains.

Out-of-Scope Use

  • Long-form document classification
  • Sentiment analysis or intent detection
  • Legal or financial decision-making without human oversight
  • Languages other than English

Bias, Risks, and Limitations

Data Bias: Trained on a specific SMS corpus from the early 2010s. May struggle with modern slang, emojis, or evolved phishing techniques not present in the training data.

False Positives: Messages containing spam-adjacent keywords (e.g., "Urgent", "Click", "Won") in legitimate contexts may be misclassified.

Contextual Blindness: Processes each message independently; cannot use conversational context from prior messages.

Phishing Sophistication: Less reliable against highly sophisticated spear-phishing that mimics professional language.

Recommendations

  • Notify users when a message is flagged automatically.
  • Provide a manual override/report mechanism for misclassifications.
  • Monitor for distribution drift and retrain periodically on newer data.
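One way to act on these recommendations is to route messages by confidence and to watch the predicted spam rate over time. The sketch below is illustrative only: the function names, the action strings, the 0.90 threshold, and the 0.05 tolerance are hypothetical, and the 0.125 baseline is the approximate spam share of the training split reported in this card.

```python
def route_message(label, score, flag_threshold=0.90):
    """Map a prediction to an action; the threshold is a hypothetical value."""
    if label == "spam" and score >= flag_threshold:
        return "flag_and_notify"        # user is told the message was flagged
    if label == "spam":
        return "deliver_with_warning"   # low confidence: leave the decision to the user
    return "deliver"


def spam_rate_drifted(recent_labels, baseline=0.125, tolerance=0.05):
    """Crude drift signal: predicted spam rate far from the training-time rate."""
    if not recent_labels:
        return False
    rate = sum(1 for label in recent_labels if label == "spam") / len(recent_labels)
    return abs(rate - baseline) > tolerance


print(route_message("spam", 0.98))
print(route_message("spam", 0.60))
print(spam_rate_drifted(["spam"] * 5 + ["ham"] * 5))
```

A shift in the predicted spam rate does not prove the model has degraded, but it is a cheap trigger for collecting fresh labels and deciding whether to retrain.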

Technical Specifications

Model Architecture

  • Base: distilbert-base-uncased (6 transformer layers, 768 hidden dim, 12 attention heads)
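  • Classification head: linear layer over the [CLS] (first-token) hidden state → 2 classes. As implemented in Transformers' DistilBertForSequenceClassification, a 768→768 pre-classifier layer with ReLU and dropout precedes the final 768→2 layer.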
  • Total parameters: ~66M
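The parameter total can be checked by hand from the architecture figures above. The arithmetic below is a sketch that assumes the standard distilbert-base-uncased configuration (vocabulary of 30,522 tokens, 512 positions, feed-forward width of 3,072), which this card does not list, and the head layout used by Transformers' DistilBertForSequenceClassification (a 768→768 pre-classifier followed by a 768→2 classifier).

```python
# Assumed standard distilbert-base-uncased configuration values.
VOCAB, MAX_POS, HIDDEN, FFN, LAYERS, NUM_LABELS = 30522, 512, 768, 3072, 6, 2


def linear(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


layer_norm = 2 * HIDDEN  # scale and shift vectors

# Word embeddings + position embeddings + embedding LayerNorm.
embeddings = VOCAB * HIDDEN + MAX_POS * HIDDEN + layer_norm

# Query, key, value and output projections.
attention = 4 * linear(HIDDEN, HIDDEN)

# Two-layer feed-forward block.
feed_forward = linear(HIDDEN, FFN) + linear(FFN, HIDDEN)

# Each transformer layer has two LayerNorms.
per_layer = attention + feed_forward + 2 * layer_norm

encoder = embeddings + LAYERS * per_layer
head = linear(HIDDEN, HIDDEN) + linear(HIDDEN, NUM_LABELS)

print(f"encoder={encoder:,} head={head:,} total={encoder + head:,}")
```

Under these assumptions the encoder alone has 66,362,880 parameters and the head adds 592,130, for 66,955,010 in total, which rounds to 66M or 67M depending on whether the head is counted.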

Compute Infrastructure

  • Training: Kaggle Notebooks — NVIDIA T4 x2 GPU
  • Libraries: transformers, datasets, evaluate, accelerate, torch, wandb
  • Inference: CPU-compatible (no GPU required)

Environmental Impact

  • Hardware: NVIDIA T4 GPU (Kaggle)
  • Training duration: ~2 minutes per run
  • Carbon emitted: < 0.01 kg CO₂eq (estimated via ML Impact Calculator)

Citation

@misc{group36-sms-spam-2026,
  author    = {Duggirala Vnaga Ananth and Anukumar K and Shrikrishna Tripathi and Sudeb Ghosh},
  title     = {SMS Spam Classifier: Fine-tuned DistilBERT (Group 36, IIT Jodhpur)},
  year      = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/nagaananth/MLOPS_group-v2}}
}

Glossary

  • Ham: Legitimate, non-spam SMS message
  • Spam: Unsolicited commercial or phishing message
  • DistilBERT: Distilled version of BERT — 40% smaller, retains 97% of BERT's NLU performance
  • F1 Macro: Unweighted mean of per-class F1 scores; useful for evaluating imbalanced datasets
  • Fine-tuning: Adapting a pre-trained language model to a task-specific dataset with supervised training