SMS Spam Classifier — DistilBERT (Group 36, IIT Jodhpur)

Fine-tuned distilbert-base-uncased for binary SMS spam classification. It achieves 99.35% accuracy and a macro F1 of 0.9851 on the held-out test set. This is v2, the version with the lowest validation loss (0.0292) among those with reported metrics.

Developed as part of the MLOps course, PGD AI Program, IIT Jodhpur.

Model Details

Model Description

Base model: distilbert-base-uncased (66M parameters)
Task: Binary text classification — Ham (0) vs Spam (1)
Dataset: UCI SMS Spam Collection (5,159 samples after deduplication)
Architecture: DistilBERT encoder + linear classification head
Framework: PyTorch + Hugging Face Transformers
Training platform: Kaggle (NVIDIA T4 x2 GPU)
Developed by: MLOps Group 36, IIT Jodhpur
Model card authors: G25AIT2032 Duggirala Vnaga Ananth
Contact: g25ait2032@iitj.ac.in

Related Resources

Resource	Link
GitHub Repository	MLOps Group 36 Repository
Kaggle Notebook (Final)	mlops-group36-final-v3
W&B Dashboard	MLOPS_Group
HF Model — v1	nagaananth/MLOPS_group-v1
HF Model — v2 ★ Best	nagaananth/MLOPS_group-v2
HF Model — v3	nagaananth/MLOPS_group-v3
HF Model — v4	nagaananth/MLOPS_group-v4
Docker Image (GHCR)	`ghcr.io/g25ait2032-prog/mlops_group-inference:latest`
Docker Image (Hub)	`dvnananth/mlops-group36:v1`

How to Get Started

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="nagaananth/MLOPS_group-v2"
)

# Spam example
print(classifier("URGENT! You have won a free iPhone. Click here now."))
# [{'label': 'spam', 'score': 0.9804}]

# Ham example
print(classifier("Hey, are we still meeting for lunch at 12?"))
# [{'label': 'ham', 'score': 0.9982}]

Or with full control:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "nagaananth/MLOPS_group-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

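def predict(text):
    """Classify one message; return its predicted label and softmax confidence."""
    # Truncate to the same 128-token limit used during training
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():  # inference only, so no gradients are tracked
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)[0]  # class probabilities for this message
    pred_idx = probs.argmax().item()
    return {
        "label": model.config.id2label[pred_idx],
        "score": round(probs[pred_idx].item(), 4)
    }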

print(predict("Free prize! Click now to claim your reward."))
# {'label': 'spam', 'score': 0.9897}

Training Details

Dataset

UCI SMS Spam Collection, loaded via the Hugging Face datasets library (dataset name: sms_spam).

Split	Samples	Ham %	Spam %
Train (70%)	3,611	~87.5	~12.5
Validation (15%)	774	~87.5	~12.5
Test (15%)	774	~87.5	~12.5

Preprocessing steps:

Lowercased and whitespace normalised
415 duplicate messages removed (total: 5,159 unique samples)
Stratified 70/15/15 split with zero-leakage verification
Tokenized with AutoTokenizer for DistilBERT (truncation=True, max_length=128)
Labels mapped: {"ham": 0, "spam": 1}
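
The preprocessing steps above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual pipeline: the helper names (normalise, deduplicate, stratified_split), the seed, the toy rows, and the choice to deduplicate on the normalised text are all assumptions.

```python
import random
import re
from collections import defaultdict

LABEL2ID = {"ham": 0, "spam": 1}


def normalise(text: str) -> str:
    """Lowercase and collapse runs of whitespace into single spaces."""
    return re.sub(r"\s+", " ", text).strip().lower()


def deduplicate(rows):
    """Keep the first occurrence of each normalised message (assumed policy)."""
    seen, unique = set(), []
    for text, label in rows:
        key = normalise(text)
        if key not in seen:
            seen.add(key)
            unique.append((key, label))
    return unique


def stratified_split(rows, fractions=(0.70, 0.15, 0.15), seed=42):
    """Split each class separately so every split keeps a similar ham/spam ratio."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[1]].append(row)
    rng = random.Random(seed)
    train, val, test = [], [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_train = round(len(group) * fractions[0])
        n_val = round(len(group) * fractions[1])
        train += group[:n_train]
        val += group[n_train:n_train + n_val]
        test += group[n_train + n_val:]
    return train, val, test


# Toy rows (hypothetical, not taken from the dataset); the first two are duplicates
raw = [("Free  PRIZE now", "spam"), ("free prize now", "spam")]
raw += [(f"see you at {i}", "ham") for i in range(14)]
raw += [(f"win cash {i}", "spam") for i in range(5)]

rows = [(text, LABEL2ID[label]) for text, label in deduplicate(raw)]
train, val, test = stratified_split(rows)

# Leakage check: no message text may appear in more than one split
texts = [{t for t, _ in split} for split in (train, val, test)]
leak_free = not (texts[0] & texts[1] or texts[0] & texts[2] or texts[1] & texts[2])
print(len(rows), len(train), len(val), len(test), leak_free)
```

Deduplicating before splitting is what makes the leakage check meaningful: once every message is unique, the three splits cannot share text.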

Hyperparameter Comparison (All Versions)

Version	LR	Epochs	Batch Size	Warmup Steps	Weight Decay	Early Stopping (p = patience)	Val Loss	F1 Macro
v1	3e-5	3	16	100	0.01	No	0.0539	0.9849
v2 ★	2e-5	5	32	200	0.01	Yes (p=2)	0.0292	0.9851
v3	2e-5	5	32	200	0.01	Yes (p=2)	0.0376	0.9851
v4	1e-5	4	16	200	0.02	Yes (p=2)	—	—

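v2 was selected as the final deployment model because it has the lowest validation loss (0.0292) of the versions with reported metrics, which suggests the best generalisation. Its macro F1 ties with v3 (0.9851), and no metrics are reported for v4.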
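
The selection rule can be written down as a short sketch. The numbers are copied from the comparison table; the select_best helper and the decision to skip versions without a reported loss are assumptions made for illustration.

```python
# Validation loss and macro F1 as reported in the comparison table.
# v4 has no reported metrics, so it is stored as None and skipped.
runs = {
    "v1": {"val_loss": 0.0539, "f1_macro": 0.9849},
    "v2": {"val_loss": 0.0292, "f1_macro": 0.9851},
    "v3": {"val_loss": 0.0376, "f1_macro": 0.9851},
    "v4": {"val_loss": None, "f1_macro": None},
}


def select_best(runs):
    """Return the version with the lowest reported validation loss."""
    scored = {name: m for name, m in runs.items() if m["val_loss"] is not None}
    return min(scored, key=lambda name: scored[name]["val_loss"])


best = select_best(runs)
print(best)
```

Because v2 and v3 tie on macro F1, validation loss is the quantity that separates them here.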

Training Configuration (v2)

Optimizer: AdamW
Learning rate: 2e-5
Epochs: 5 (with early stopping, patience=2)
Batch size: 32 (train), 64 (eval)
Mixed precision: fp16
Metric for best model: F1 Weighted
Infrastructure: Kaggle NVIDIA T4 x2 GPU
Average training time: ~2 minutes per run
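
To show how early stopping with patience=2 interacts with the 5-epoch budget, here is a minimal pure-Python sketch of the stopping rule. It is not the Trainer callback itself, and the per-epoch metric values are hypothetical.

```python
def epochs_run(metric_per_epoch, patience=2, max_epochs=5):
    """Return (epochs actually run, best epoch) for a higher-is-better metric."""
    best_value, best_epoch, stale = float("-inf"), 0, 0
    for epoch, value in enumerate(metric_per_epoch[:max_epochs], start=1):
        if value > best_value:
            # New best: remember it and reset the patience counter
            best_value, best_epoch, stale = value, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch, best_epoch  # stop early
    return min(len(metric_per_epoch), max_epochs), best_epoch


# Hypothetical per-epoch weighted-F1 values, for illustration only
history = [0.985, 0.991, 0.990, 0.989, 0.992]
stopped_at, best_epoch = epochs_run(history)
print(stopped_at, best_epoch)
```

With these made-up values, training stops after epoch 4 and the epoch-2 checkpoint is kept, so the later improvement at epoch 5 is never seen. That is the usual trade-off of a small patience value.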

Evaluation Results

Test Set Performance (v2 — Best Model)

Metric	Score
Accuracy	0.9935
F1 Weighted	0.9935
F1 Macro	0.9851
Precision	0.9935
Recall	0.9935
Validation Loss (validation split)	0.0292
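
Macro F1 (0.9851) is lower than weighted F1 (0.9935) because the macro average gives the minority spam class the same weight as ham. The sketch below computes both averages from scratch on hypothetical toy labels. It illustrates the definitions and is not the evaluation code used for the table.

```python
def per_class_f1(y_true, y_pred, label):
    """F1 for one class, computed from its true/false positives and negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_scores(y_true, y_pred, labels=(0, 1)):
    """Return (macro F1, weighted F1)."""
    f1 = {c: per_class_f1(y_true, y_pred, c) for c in labels}
    support = {c: sum(t == c for t in y_true) for c in labels}
    macro = sum(f1.values()) / len(labels)
    weighted = sum(f1[c] * support[c] for c in labels) / len(y_true)
    return macro, weighted


# Hypothetical toy data: 14 ham (0) and 2 spam (1), with one spam message missed
y_true = [0] * 14 + [1] * 2
y_pred = [0] * 14 + [1, 0]
macro, weighted = f1_scores(y_true, y_pred)
print(f"macro={macro:.4f} weighted={weighted:.4f}")
```

A single missed spam message barely moves the weighted score but pulls the macro score down sharply, which is why macro F1 is the more demanding number on an imbalanced split.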

Adversarial Test Cases

The model was evaluated on 15 adversarial/edge-case SMS messages covering spam, ham, and ambiguous phrasing (e.g., messages mixing casual language with spam triggers). Representative examples:

Text	True	Predicted	Confidence
"URGENT! You have won a 1-week cruise! Call now."	spam	spam	0.9987
"You won! Click here to claim your prize."	spam	spam	0.9945
"Hey, are we still meeting for lunch at 12?"	ham	ham	0.9991
"Can you send me the report by EOD?"	ham	ham	0.9988
"Meeting for lunch? I won a contest, let's talk."	ham	ham	0.9756

Inference Latency (CPU)

Mean latency: ~30–60 ms per sample
Suitable for CPU-only deployment
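
A latency figure like the one above can be reproduced with a small timing harness. This is a sketch under assumptions: measure_latency_ms and dummy_predict are hypothetical names, and the stand-in predictor should be replaced with a real model call to obtain meaningful numbers.

```python
import statistics
import time


def measure_latency_ms(predict_fn, texts, warmup=3):
    """Return per-sample latencies in milliseconds for predict_fn."""
    for text in texts[:warmup]:
        predict_fn(text)  # warm-up calls are not timed
    latencies = []
    for text in texts:
        start = time.perf_counter()
        predict_fn(text)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def dummy_predict(text):
    """Stand-in predictor (hypothetical); swap in the real model call."""
    return {"label": "ham", "score": 1.0}


samples = ["Hey, are we still meeting for lunch at 12?"] * 20
latencies = measure_latency_ms(dummy_predict, samples)
print(f"mean={statistics.mean(latencies):.3f} ms  max={max(latencies):.3f} ms")
```

Reporting the maximum or a high percentile alongside the mean is useful, since occasional slow calls matter more than the average in an interactive messaging pipeline.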

Uses

Direct Use

Binary classification of SMS or short-text messages into ham (legitimate) or spam (unsolicited/phishing). Can be directly integrated into messaging applications or notification pipelines.

Downstream Use

Can serve as a component in broader security pipelines for filtering suspicious incoming messages, or as a baseline for transfer learning to other spam-detection domains.

Out-of-Scope Use

Long-form document classification
Sentiment analysis or intent detection
Legal or financial decision-making without human oversight
Languages other than English

Bias, Risks, and Limitations

Data Bias: Trained on a specific SMS corpus from the early 2010s. May struggle with modern slang, emojis, or evolved phishing techniques not present in the training data.

False Positives: Messages containing spam-adjacent keywords (e.g., "Urgent", "Click", "Won") in legitimate contexts may be misclassified.

Contextual Blindness: Processes each message independently; cannot use conversational context from prior messages.

Phishing Sophistication: Less reliable against highly sophisticated spear-phishing that mimics professional language.

Recommendations

Notify users when a message is flagged automatically.
Provide a manual override/report mechanism for misclassifications.
Monitor for distribution drift and retrain periodically on newer data.
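
One simple way to monitor for drift is to compare the share of messages predicted as spam against the roughly 12.5% spam rate of the training data. The sketch below assumes that approach. The tolerance value and the function names are hypothetical, and a change in spam rate is only a coarse signal, not proof of drift.

```python
def spam_rate(predictions):
    """Fraction of predictions labelled 'spam'."""
    return sum(p == "spam" for p in predictions) / len(predictions)


def drift_alert(predictions, baseline=0.125, tolerance=0.05):
    """Flag a window whose predicted spam rate departs from the training baseline."""
    return abs(spam_rate(predictions) - baseline) > tolerance


# Hypothetical window of recent predictions: 30% flagged as spam
recent = ["ham"] * 70 + ["spam"] * 30
print(drift_alert(recent))
```

An alert of this kind would be a prompt to inspect recent messages and consider retraining, not an automatic trigger.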

Technical Specifications

Model Architecture

Base: distilbert-base-uncased (6 transformer layers, 768 hidden dim, 12 attention heads)
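Classification head: applied to the final hidden state of the [CLS] token → 2 classes. In the Transformers implementation (DistilBertForSequenceClassification) this is a 768→768 linear layer with ReLU and dropout, followed by a 768→2 linear layer.
Total parameters: ~66M in the base encoder, ~67M including the classification head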
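
The parameter count can be checked by hand from the layer shapes. The sketch below assumes the published distilbert-base-uncased configuration (vocabulary of 30,522, 512 positions, feed-forward width 3,072) and the two-layer head used by the Transformers DistilBertForSequenceClassification class. It has not been verified against this specific checkpoint.

```python
# Assumed shapes from the published distilbert-base-uncased configuration
VOCAB, MAX_POS, DIM, FFN, LAYERS, NUM_LABELS = 30522, 512, 768, 3072, 6, 2


def linear(n_in, n_out):
    """Parameters in a dense layer: weight matrix plus bias."""
    return n_in * n_out + n_out


layer_norm = 2 * DIM  # scale and shift vectors

# Word and position embeddings plus the embedding LayerNorm
embeddings = VOCAB * DIM + MAX_POS * DIM + layer_norm

# Per transformer layer: Q, K, V and output projections, two FFN layers, two LayerNorms
per_layer = 4 * linear(DIM, DIM) + linear(DIM, FFN) + linear(FFN, DIM) + 2 * layer_norm

encoder = embeddings + LAYERS * per_layer
head = linear(DIM, DIM) + linear(DIM, NUM_LABELS)  # pre-classifier + classifier
print(encoder, head, encoder + head)
```

Under these assumptions the encoder accounts for about 66.4M parameters and the head for about 0.6M, which is why figures of both 66M and 67M are quoted for this kind of model.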

Compute Infrastructure

Training: Kaggle Notebooks — NVIDIA T4 x2 GPU
Libraries: transformers, datasets, evaluate, accelerate, torch, wandb
Inference: CPU-compatible (no GPU required)

Environmental Impact

Hardware: NVIDIA T4 x2 GPU (Kaggle)
Training duration: ~2 minutes per run
Carbon emitted: < 0.01 kg CO₂eq (estimated via ML Impact Calculator)
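
As a rough sanity check on the carbon figure, energy can be estimated as power × time and multiplied by a grid carbon intensity. The sketch assumes both GPUs draw their full 70 W rated power for the whole run and uses an assumed intensity of 0.5 kg CO₂eq per kWh. It ignores CPU and data-centre overhead and is not the calculator's exact method.

```python
def training_emissions_kg(gpu_watts, n_gpus, minutes, kg_co2_per_kwh):
    """Upper-bound GPU energy (kWh) times grid carbon intensity."""
    energy_kwh = gpu_watts * n_gpus * (minutes / 60.0) / 1000.0
    return energy_kwh * kg_co2_per_kwh


# 70 W is the T4's rated board power; the carbon intensity is an assumed value
estimate = training_emissions_kg(gpu_watts=70, n_gpus=2, minutes=2, kg_co2_per_kwh=0.5)
print(f"{estimate:.4f} kg CO2eq")
```

Even with these pessimistic inputs the estimate stays well under 0.01 kg CO₂eq per run, consistent with the figure reported above.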

Citation

@misc{group36-sms-spam-2026,
  author    = {Duggirala Vnaga Ananth and Anukumar K and Shrikrishna Tripathi and Sudeb Ghosh},
  title     = {SMS Spam Classifier: Fine-tuned DistilBERT (Group 36, IIT Jodhpur)},
  year      = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/nagaananth/MLOPS_group-v2}}
}

Glossary

Ham: Legitimate, non-spam SMS message
Spam: Unsolicited commercial or phishing message
DistilBERT: Distilled version of BERT — 40% smaller, retains 97% of BERT's NLU performance
F1 Macro: Unweighted mean of per-class F1 scores; useful for evaluating imbalanced datasets
Fine-tuning: Adapting a pre-trained language model to a task-specific dataset with supervised training
