Senior Project Notice

This repository was created for a senior project in ENGT 375 Applied Machine Learning at Old Dominion University. It is provided for educational and research demonstration purposes only. It is not intended for production use, security filtering, or making real-world spam/phishing decisions. Always use established security tools for operational email protection.

Spam Email Classifier with XAI Explanations

ENGT 375: Applied Machine Learning | Spring 2026 | Old Dominion University

A Gradio web app that classifies emails as spam or ham and provides explainable AI (XAI) insights using three different methods (LIME, SHAP, and ELI5).

Features

  • Paste any email and get an instant spam/ham prediction
  • LIME explanations: which words pushed the decision
  • SHAP feature importance: game-theoretic attribution
  • ELI5: model internal feature weights and permutation importance
  • Side-by-side comparison of all three XAI methods
  • Plain English summary of why the model made its decision
  • User feedback: thumbs up/down to log corrections for batch retraining
  • Adjustable classification threshold

How to Run Locally

# Install dependencies
pip install -r requirements.txt

# Train the model (first run only; produces models/voting_model.joblib)
python3 train_ensemble.py

# Launch the Gradio web app
python3 app.py

# Or open the student teaching notebook
jupyter notebook notebooks/spam_classifier_xai_student.ipynb

On macOS, you can also double-click any of these .command files in Finder:

  • launch-gradio.command: opens the Gradio web UI in your browser
  • launch-notebook.command: opens the student notebook in Jupyter
  • launch-app.command: legacy Streamlit launcher (kept for reference; app.py is now a Gradio app, so use launch-gradio.command instead)
  • retrain-fast.command: quick retrain (~2-5 min, single RF, no grid search)
  • retrain-full.command: full retrain (~15-30 min, voting ensemble + grid search)

Retraining

python3 retrain.py --mode fast        # quick retrain, single RF
python3 retrain.py --mode full        # full retrain, voting ensemble + grid search
python3 retrain.py --mode full --no-feedback   # full retrain, ignore user feedback log

The retrain script reads accumulated user corrections from data/feedback/feedback_log.csv and merges them into the training data with 5x weighting.
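The weighting step described above could look roughly like the following. This is a minimal sketch, not the code in retrain.py: the function name `merge_feedback` and the CSV column names `text` and `label` are hypothetical, since the actual layout of feedback_log.csv is not documented here.

```python
import csv
import io

FEEDBACK_WEIGHT = 5.0  # the 5x weighting stated above


def merge_feedback(train_rows, feedback_csv_text, weight=FEEDBACK_WEIGHT):
    """Return (texts, labels, sample_weights) with feedback rows up-weighted.

    train_rows: list of (text, label) pairs from the base corpus.
    feedback_csv_text: CSV text with assumed columns 'text' and 'label'.
    """
    texts = [text for text, _ in train_rows]
    labels = [label for _, label in train_rows]
    weights = [1.0] * len(train_rows)  # base corpus rows keep weight 1

    # Each user correction is appended once and given the larger weight.
    for row in csv.DictReader(io.StringIO(feedback_csv_text)):
        texts.append(row["text"])
        labels.append(int(row["label"]))
        weights.append(weight)
    return texts, labels, weights


base = [("meeting at noon", 0), ("win a free prize now", 1)]
feedback = "text,label\nyour invoice is attached,0\n"
texts, labels, weights = merge_feedback(base, feedback)
print(weights)  # [1.0, 1.0, 5.0]
```

A list like `weights` can typically be passed to a scikit-learn estimator through the `sample_weight` argument of `fit`; whether retrain.py does this or duplicates rows instead is not stated here.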

Model

Voting ensemble (Random Forest + Logistic Regression + Linear SVM with calibration) trained on the Kaggle 100K spam dataset + GitHub email-dataset, using 3,000 TF-IDF features + 24 hand-crafted metadata features.
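For readers unfamiliar with this kind of ensemble, here is a minimal sketch of the three-model structure in scikit-learn. It is not the code in train_ensemble.py: the toy emails, the hyperparameters, the use of soft voting, and the TF-IDF-only input (no metadata features) are all assumptions made to keep the example small.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy data invented for illustration (1 = spam, 0 = ham).
emails = [
    "win a free prize now",
    "claim your free cash reward",
    "cheap loans click here now",
    "urgent offer free gift card",
    "meeting moved to noon",
    "please review the attached report",
    "lunch tomorrow with the team",
    "notes from class today",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# LinearSVC has no predict_proba, so it is wrapped in a calibrator
# to make it usable in a probability-averaging (soft) vote.
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
        ("svm", CalibratedClassifierCV(LinearSVC(), cv=2)),
    ],
    voting="soft",  # assumption: average the three predicted probabilities
)

model = make_pipeline(TfidfVectorizer(max_features=3000), ensemble)
model.fit(emails, labels)

# One row of [P(ham), P(spam)] for a new email.
proba = model.predict_proba(["claim your free prize"])[0]
print(proba)
```

Soft voting is what makes a tunable probability threshold possible, which is why it is assumed here; a hard (majority) vote would only return a label.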

| Model                         | Accuracy | F1 Score |
| ----------------------------- | -------- | -------- |
| Random Forest                 | 97.78%   | 0.976    |
| Logistic Regression           | 96.6%    | 0.964    |
| SVM (LinearSVC + calibration) | 96.9%    | 0.967    |
| VotingClassifier (deployed)   | 97.4%    | 0.973    |

Optimal classification threshold: 0.3714 (targeting 99% ham precision; value read from optimal_threshold.joblib as written by the full-corpus retrain).
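To make the role of the threshold concrete, the sketch below applies it to a spam probability and measures ham precision (the fraction of emails predicted ham that really are ham) on a handful of made-up scores. The function names, the `>=` comparison, and the sample data are assumptions for illustration; they are not taken from app.py.

```python
def classify(p_spam, threshold=0.3714):
    """Label an email from its predicted spam probability."""
    return "spam" if p_spam >= threshold else "ham"


def ham_precision(p_spam_scores, true_labels, threshold):
    """Fraction of emails predicted ham that are truly ham (label 0)."""
    predicted_ham = [
        y for p, y in zip(p_spam_scores, true_labels) if p < threshold
    ]
    if not predicted_ham:
        return None  # undefined when nothing is predicted ham
    return sum(1 for y in predicted_ham if y == 0) / len(predicted_ham)


# Invented scores and labels (1 = spam, 0 = ham).
scores = [0.05, 0.20, 0.40, 0.45, 0.90]
labels = [0, 0, 1, 0, 1]

print(classify(0.40))                         # spam
print(ham_precision(scores, labels, 0.5))     # 0.75
print(ham_precision(scores, labels, 0.3714))  # 1.0
```

In this toy data, lowering the threshold from 0.5 to 0.3714 moves the spam email scored 0.40 out of the ham bucket, which raises ham precision at the cost of also flagging the legitimate email scored 0.45.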

Notebooks

| Notebook | Purpose |
| --- | --- |
| notebooks/spam_classifier_xai_student.ipynb | Main teaching notebook (turn-in artifact for the course). Full XAI walkthrough with LIME, SHAP, ELI5, and a feature reduction experiment based on Kuzlu et al. 2020 |
| notebooks/spam_classifier_gradio.ipynb | Shorter pipeline focused on the ensemble model and Gradio deployment |

Documentation

  • docs/references/how-to.html: full reference index with clickable links to all local PDFs (LIME, SHAP, TreeSHAP, Kuzlu et al., 5 spam-detection papers) and HTML guides (sklearn user guide, Gradio quickstart, HF Spaces docs, Molnar Interpretable ML book)
  • docs/07-code-sources-reference.md: markdown version of the references with citation entries
  • CHANGELOG.md: full project history from v0.1 (Streamlit) through v1.1 (merged Gradio)

Tech Stack

  • scikit-learn: Random Forest, Logistic Regression, LinearSVC, VotingClassifier, CalibratedClassifierCV, TfidfVectorizer, MinMaxScaler, GridSearchCV, metrics
  • LIME + SHAP + ELI5: explainability
  • Gradio: web interface (live deployment on HuggingFace Spaces)
  • NLTK: text preprocessing (Porter stemmer, English stopwords)
  • scipy.sparse: efficient handling of TF-IDF + metadata feature combination
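As an illustration of the last item, the sketch below shows one common way to join sparse TF-IDF columns with a few dense metadata columns using scipy.sparse.hstack. The three metadata features in the hypothetical `metadata` helper (length, exclamation count, uppercase ratio) are invented stand-ins; the project's 24 hand-crafted features are not listed in this README.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import MinMaxScaler

emails = ["WIN a FREE prize now!!!", "Lunch at noon tomorrow?"]


def metadata(text):
    """Hypothetical hand-crafted features for one email."""
    n_chars = len(text)
    n_exclaim = text.count("!")
    upper_ratio = sum(c.isupper() for c in text) / max(n_chars, 1)
    return [n_chars, n_exclaim, upper_ratio]


# Sparse text features, capped at 3,000 columns as in the Model section.
tfidf = TfidfVectorizer(max_features=3000)
X_text = tfidf.fit_transform(emails)

# Dense metadata, scaled to [0, 1] so it is comparable to TF-IDF values.
meta = np.array([metadata(e) for e in emails], dtype=float)
X_meta = MinMaxScaler().fit_transform(meta)

# Stack column-wise without densifying the TF-IDF matrix.
X = hstack([X_text, csr_matrix(X_meta)]).tocsr()
print(X.shape)
```

Keeping the combined matrix sparse matters mostly at full scale: with thousands of TF-IDF columns, nearly all entries per email are zero.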

Sibling Projects

This is the sklearn / classical ML variant. Two LLM-based variants are in sibling folders:

  • ../spam-classifier-mlx/: Apple MLX LoRA fine-tune of Qwen3.5-0.8B
  • ../spam-classifier-liquid/: HuggingFace TRL+PEFT LoRA fine-tune of Liquid AI LFM2.5-1.2B

Citation

If you reference this work academically:

Balfour, D. (2026). Spam Email Classifier with Explainable AI.
ENGT 375 Applied Machine Learning project, Old Dominion University, Spring 2026.
https://huggingface.co/spaces/VoltageVagabond/spam-xai-classifier