# IMDB Sentiment Analysis Project

## Overview

This project implements a sentiment analysis system for IMDB movie reviews using various machine learning and deep learning techniques. It includes a React frontend for user interaction and a Flask backend for processing and analyzing the reviews.

## Features

- Sentiment analysis of IMDB movie reviews
- Multiple machine learning models:
  - Naive Bayes (Gaussian NB)
  - Random Forest
  - Logistic Regression
  - LSTM
  - Transformer
- Interactive web interface for real-time analysis
- Visualization of model accuracies and dataset distribution
- User feedback system for continuous improvement

## Technologies Used

- Frontend: React, Recharts, Lucide React
- Backend: Flask, NLTK, SpaCy, scikit-learn, TensorFlow/Keras
- Data Processing: Pandas, NumPy
- Machine Learning: scikit-learn, TensorFlow, Keras
- Natural Language Processing: NLTK, SpaCy

## Setup Instructions

### Prerequisites

- Node.js and npm
- Python 3.7+
- Git

### Frontend Setup

1. Clone the repository:
   ```
   git clone https://github.com/saquib34/zensibleInterview.git
   ```
2. Navigate to the project directory:
   ```
   cd zensibleInterview
   ```
3. Install dependencies:
   ```
   npm install
   ```
4. Start the development server:
   ```
   npm start
   ```

### Backend Setup

1. Ensure you're in the project directory
2. Install required Python packages:
   ```
   pip install -r requirements.txt
   ```
3. Start the Flask server:
   ```
   python app.py
   ```

## Usage

1. Open your web browser and navigate to `http://localhost:3000` (or the port specified by your React setup)
2. Enter an IMDB movie review in the text input
3. Click "Analyze" to see the sentiment analysis results
4. (Optional) Provide feedback on the analysis accuracy

## Project Structure

- `/src`: React frontend source code
- `/public`: Public assets for the frontend
- `/backend`: Flask backend code
- `/models`: Trained machine learning models
- `/data`: Dataset and data processing scripts
- `requirements.txt`: Python dependencies
- `package.json`: Node.js dependencies

## Dataset

This project uses the IMDB Dataset of 50K Movie Reviews, available on Kaggle: [IMDB Dataset](https://www.kaggle.com/datasets/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews)

## Models and Performance

| Model               | Accuracy |
|---------------------|----------|
| Gaussian NB         | 0.7379   |
| Random Forest       | 0.7997   |
| Logistic Regression | 0.82     |
| LSTM                | 0.7424   |
| Transformer         | 0.5      |

## Contributing

Contributions to this project are welcome. Please fork the repository and submit a pull request with your changes.

## License

[MIT License](LICENSE)

## Contact

Developer: Saquib

GitHub: [saquib34](https://github.com/saquib34)

## Acknowledgments

- IMDB for providing the dataset
- Kaggle for hosting the dataset
- All open-source libraries and tools used in this project
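
## Appendix: Reading the Accuracy Figures

The accuracies listed under Models and Performance are easier to interpret next to a trivial baseline. The Kaggle dataset is balanced (25,000 positive and 25,000 negative reviews), so, assuming the evaluation split keeps that balance, a model that always predicts one class scores 0.5, which matches the value reported for the Transformer. The sketch below is not taken from this repository: the function names `accuracy` and `majority_baseline` and the toy labels are hypothetical, chosen only to illustrate the comparison.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_baseline(y_true):
    """Return the most frequent label and the accuracy of always predicting it."""
    most_common_label, count = Counter(y_true).most_common(1)[0]
    return most_common_label, count / len(y_true)


# Toy labels standing in for a balanced test split (hypothetical data,
# not results from this project).
y_true = ["positive", "negative", "positive", "negative"]
y_pred = ["positive", "negative", "negative", "negative"]

print(accuracy(y_true, y_pred))      # 0.75
print(majority_baseline(y_true)[1])  # 0.5
```

A model is only useful to the extent that its accuracy exceeds the baseline on the same split, so reporting both numbers side by side makes the table easier to judge.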