Spaces:

akshatsanghvi
/

spam-email-detection

Akshat Sanghvi

Update README.md

d75c366 unverified 22 days ago

	---
	title: Spam Email Detection
	emoji: 💌
	colorFrom: pink
	colorTo: blue
	sdk: gradio
	sdk_version: 3.17.0
	app_file: app.py
	---

	# Email Spam and Phishing URL Detection

	This project uses a Naive Bayes classifier to detect whether an email is spam, and an XGBoost classifier to determine whether a URL within an email is phishing or legitimate.

	# Getting Started
	## Project Overview

	The project consists of two main components:

	1. Email Spam Detection: This component employs Naive Bayes classification to classify emails as either spam or not spam based on their content features.

	2. Phishing URL Detection: This component uses XGBoost classification to identify whether URLs within emails are associated with phishing attempts or legitimate websites.
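To make the two components concrete, here is a minimal, standard-library-only sketch. It is not the project's code: the class `TinyNaiveBayes`, the helper `url_features`, and the specific features chosen are all hypothetical, and the real models presumably rely on scikit-learn and XGBoost as listed in the requirements. The first part shows multinomial Naive Bayes with add-one smoothing over word counts; the second shows the kind of lexical URL features that could be passed to a gradient-boosted classifier.

```python
import math
import re
from collections import Counter, defaultdict
from urllib.parse import urlparse


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class TinyNaiveBayes:
    """Illustrative multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter(labels)      # label -> number of emails
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        scores = {}
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior of the class
            for word in tokenize(text):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


def url_features(url):
    """Hypothetical lexical features; the project's real feature set is not documented here."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "length": len(url),
        "uses_https": parsed.scheme == "https",
        "num_dots_in_host": host.count("."),
        "has_at_symbol": "@" in url,
        "host_is_ip": re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host) is not None,
    }
```

For example, after `TinyNaiveBayes().fit(texts, labels)` on a handful of labeled emails, `predict(new_email)` returns the label with the highest log-probability score, and `url_features(url)` returns a dictionary that could be turned into one row of a feature table.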

	## Prerequisites
	Make sure you have Python 3.10 installed on your system. You can download it from [python.org](https://www.python.org/).

	## Requirements
	Ensure you have the following dependencies installed. You can install them using `pip install -r requirements.txt`.

	- gunicorn==22.0.0
	- python-dateutil==2.8.2
	- gradio==4.32.1
	- gradio_client==0.17.0
	- requests==2.31.0
	- beautifulsoup4==4.12.3
	- googlesearch_python==1.2.4
	- urlextract==1.9.0
	- numpy==1.26.3
	- pandas==2.2.0
	- scikit-learn==1.5.0
	- urllib3==2.1.0
	- python-whois==0.9.4
	- xgboost==2.0.3
	- lxml==5.2.2
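Because every dependency is pinned to an exact version, it can help to confirm that the active environment matches the pins. The following is a small sketch using only the standard library; the helper names `parse_pins` and `check_pins` are illustrative and not part of the project.

```python
from importlib import metadata


def parse_pins(lines):
    """Turn 'name==version' lines (optionally bulleted) into a {name: version} dict."""
    pins = {}
    for line in lines:
        line = line.strip().lstrip("- ").strip()
        if "==" in line and not line.startswith("#"):
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def check_pins(pins):
    """Return {name: (pinned, installed_or_None)} for every package that does not match."""
    problems = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            problems[name] = (pinned, installed)
    return problems
```

Calling `check_pins(parse_pins(open("requirements.txt")))` from the repository root returns an empty dictionary when the environment matches, assuming `requirements.txt` uses the same `name==version` format as the list above.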

	## Setup and Installation

	1. Clone the repository:

	```bash
	git clone https://github.com/your-username/email-spam-phishing-detection.git
	cd email-spam-phishing-detection
	```

	2. Install dependencies:

	```bash
	pip install -r requirements.txt
	```

	## Usage
	1. Data Preparation:
	- Ensure the datasets `spam.csv` and `urldata.csv` are available in the `data/` directory.

	2. Model Training:
	- If necessary, modify and run the `notebook.ipynb` Jupyter notebook to train or fine-tune the machine learning models.
	- Trained models will be saved in the `models/` directory.

	3. Run the Application:
	- Execute `app.py` to start the application.
	- Alternatively, access the hosted application on the [Hugging Face Space](https://huggingface.co/spaces/akshatsanghvi/spam-email-detection).
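Before training or launching the app, a quick check that the datasets from step 1 are in place can save a confusing stack trace later. This is a sketch only; `missing_datasets` is a hypothetical helper, and it assumes the `data/` layout described above.

```python
from pathlib import Path

# File names taken from the Data Preparation step above.
REQUIRED_DATASETS = ("spam.csv", "urldata.csv")


def missing_datasets(data_dir="data"):
    """Return the required dataset file names that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in REQUIRED_DATASETS if not (root / name).is_file()]
```

An empty list from `missing_datasets()` means both files were found relative to the current working directory.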

	## Acknowledgements

	- The email spam classification model is trained using the `spam.csv` dataset, sourced from [Dataset: Spam/ham mail](https://www.kaggle.com/datasets/mfaisalqureshi/spam-email?resource=download).
	- The URL phishing detection model is trained using the `urldata.csv` dataset, sourced from [Phishing Websites Dataset](https://www.kaggle.com/datasets).

	## License

	This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.