Spam Detection System

Lite Model

The Lite model is a streamlined pipeline with optimized parameters and additional feature extraction, designed for fast spam detection.

Text Preprocessing: Lemmatization, removal of stop words and punctuation.
Feature Extraction: Text length, word count, unique word count, uppercase count, special character count.
Model Creation: Ensemble model using SVC, MultinomialNB, and ExtraTreesClassifier.
Visualization: Generates graphs for dataset insights, word clouds, and performance metrics.
Metrics Saving: Accuracy, precision, and F1 score.
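
The five hand-crafted features listed above could be computed roughly as follows. This is a minimal sketch, not the project's actual code: the function name extract_features is hypothetical, and it assumes that "uppercase count" means uppercase characters, that "special characters" means ASCII punctuation, and that unique words are compared case-insensitively.

```python
import string


def extract_features(text):
    """Compute simple surface features for one message (illustrative only)."""
    words = text.split()
    return {
        # Total number of characters, including spaces
        "text_length": len(text),
        # Whitespace-separated tokens
        "word_count": len(words),
        # Distinct tokens, compared case-insensitively (assumption)
        "unique_word_count": len({w.lower() for w in words}),
        # Uppercase characters, not uppercase words (assumption)
        "uppercase_count": sum(1 for ch in text if ch.isupper()),
        # ASCII punctuation stands in for "special characters" (assumption)
        "special_char_count": sum(1 for ch in text if ch in string.punctuation),
    }


features = extract_features("WIN a FREE prize!!! Win now")
print(features)
```

Features like these are typically appended to the vectorized text before training, which is presumably why the Lite model lists them as a separate step.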

Use the Model:

import joblib
model = joblib.load('models/model.pkl')
vectorizer = joblib.load('models/vectorizer.pkl')

Legacy Model

The Legacy model retains the original, unoptimized model logic, but uses the updated project structure and adds visualizations.

Text Preprocessing: Porter Stemming, removal of stop words and punctuation.
Model Creation: Ensemble model using SVC, MultinomialNB, and ExtraTreesClassifier with original parameters.
Visualization: Generates graphs for dataset insights, word clouds, and performance metrics.
Metrics Saving: Accuracy and precision.
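
An ensemble of SVC, MultinomialNB, and ExtraTreesClassifier can be assembled with scikit-learn's VotingClassifier. The sketch below is illustrative only: the hard-voting strategy, the default hyperparameters, the TfidfVectorizer, the build_ensemble helper, and the toy messages are all assumptions, not the project's original parameters or data.

```python
from sklearn.ensemble import ExtraTreesClassifier, VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import SVC


def build_ensemble(random_state=0):
    """Combine the three classifiers by majority vote (assumed strategy)."""
    return VotingClassifier(
        estimators=[
            ("svc", SVC()),
            ("nb", MultinomialNB()),
            ("et", ExtraTreesClassifier(random_state=random_state)),
        ],
        voting="hard",  # each classifier casts one vote per message
    )


# Toy data for illustration; 1 = spam, 0 = ham (assumed label encoding)
messages = [
    "free prize call now",
    "win cash today free entry",
    "urgent claim your free reward",
    "congratulations you won a prize",
    "are we still meeting for lunch",
    "see you at home tonight",
    "can you send me the notes",
    "thanks for the help yesterday",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# TF-IDF yields non-negative features, which MultinomialNB requires
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(messages)

ensemble = build_ensemble().fit(X, labels)
predictions = ensemble.predict(vectorizer.transform(["free prize, call now"]))
print(predictions)
```

Hard voting keeps the sketch simple; soft voting would additionally require SVC(probability=True) so that every member exposes class probabilities.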

Use the Model:

import joblib
model = joblib.load('models/model.pkl')
vectorizer = joblib.load('models/vectorizer.pkl')
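
Once loaded, the two objects might be used together as sketched below. This assumes the saved model accepts the vectorizer's output directly and that the message has already been preprocessed the same way as the training data; neither is confirmed here, and the classify helper is a hypothetical name.

```python
def classify(message, model, vectorizer):
    """Return the predicted label for a single message.

    Assumes `vectorizer` exposes transform() and `model` exposes predict(),
    as scikit-learn objects do, and that the model consumes the vectorizer
    output without extra feature columns (an assumption).
    """
    features = vectorizer.transform([message])
    return model.predict(features)[0]


# Example usage with the objects loaded above:
#   label = classify("free prize call now", model, vectorizer)
#   print(label)
```

The meaning of the returned label (for example 1 for spam, 0 for ham) depends on how the labels were encoded during training.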

Dependencies: Python 3.6 or higher, pip, and required packages listed in requirements.txt.
Dataset: The dataset used for training is spam.csv.
Contact and Support: For questions or support, please contact the project maintainers.

For more details, see README.md and models.md.