metadata
sdk: streamlit
sdk_version: 1.50.0
🧪 Advanced ML Sentiment Lab
📌 Overview
Interactive Streamlit + Plotly app for binary sentiment analysis.
Upload any CSV with a text column and a binary label, then:
- Run quick EDA on text lengths, tokens, and class balance
- Build TF-IDF word + optional char features
- Train multiple classical models (LogReg / RF / GB / Naive Bayes)
- Tune the decision threshold with FP/FN business costs
- Inspect misclassified samples and test arbitrary texts live
Works well with the classic IMDB 50K Reviews dataset, but is generic enough for product reviews, tickets, surveys, etc.
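As a rough illustration of the quick EDA step, the sketch below summarises text lengths, whitespace token counts, and class balance using only the standard library. The `quick_eda` helper and the sample rows are hypothetical and are not taken from the app's code, which may compute these figures differently.

```python
from collections import Counter
from statistics import mean


def quick_eda(texts, labels):
    """Summarise character lengths, whitespace token counts, and class balance."""
    char_lengths = [len(t) for t in texts]
    token_counts = [len(t.split()) for t in texts]
    counts = Counter(labels)
    total = len(labels)
    return {
        "n_rows": total,
        "mean_chars": mean(char_lengths),
        "mean_tokens": mean(token_counts),
        "max_tokens": max(token_counts),
        # Share of rows per label value, useful for spotting imbalance.
        "class_share": {label: n / total for label, n in counts.items()},
    }


# Hypothetical sample rows, not drawn from any real dataset.
texts = ["great movie", "terrible plot and acting", "loved it", "boring"]
labels = ["positive", "negative", "positive", "negative"]

summary = quick_eda(texts, labels)
print(summary)
```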
📊 Dashboard Preview
EDA & KPIs
Train & Validation
Error Analysis
Deploy & Interactive Prediction
🚀 How to use (in this Space)
Load data
- Upload a CSV file
- Or place IMDB Dataset.csv / imdb.csv in the Space and reload
Map columns
- Choose the text column
- Choose the label column and map which values are positive vs negative
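The label-mapping step can be pictured roughly as follows: raw label values are normalised and turned into 1 (positive) or 0 (negative), and anything unmapped is flagged. The `map_labels` function is a hypothetical helper written for this illustration; how the app handles casing, whitespace, or unmapped values is an assumption here.

```python
def map_labels(raw_labels, positive_values, negative_values):
    """Map raw label values to 1 (positive) or 0 (negative); None if unmapped."""
    pos = {str(v).strip().lower() for v in positive_values}
    neg = {str(v).strip().lower() for v in negative_values}
    overlap = pos & neg
    if overlap:
        raise ValueError(f"Values mapped to both classes: {sorted(overlap)}")

    mapped = []
    for value in raw_labels:
        key = str(value).strip().lower()
        if key in pos:
            mapped.append(1)
        elif key in neg:
            mapped.append(0)
        else:
            mapped.append(None)  # Left for the caller to drop or review.
    return mapped


# Hypothetical raw labels with inconsistent casing and one unmapped value.
raw = ["positive", "negative", "Positive ", "neutral"]
y = map_labels(raw, positive_values=["positive"], negative_values=["negative"])
print(y)  # [1, 0, 1, None]
```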
Train models
- Go to “Train & Validation”
- Set TF-IDF options, pick models, click Train models
Analyse & deploy
- Use “Threshold & Cost” to pick a business-aware threshold
- Check “Compare Models” + “Error Analysis”
- In “Deploy”, try any text and see the predicted sentiment + confidence bar
No data is stored server-side beyond the current session.
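To make the idea of a business-aware threshold concrete, here is a minimal sketch that scans candidate thresholds and keeps the one with the lowest total cost, given a cost per false positive and per false negative. The function names, the 0.01 to 0.99 grid, and the toy scores are assumptions made for this example; the app's "Threshold & Cost" tab may use a different search or cost definition.

```python
def total_cost(y_true, proba, threshold, fp_cost, fn_cost):
    """Total cost of predicting positive whenever proba >= threshold."""
    fp = sum(1 for y, p in zip(y_true, proba) if p >= threshold and y == 0)
    fn = sum(1 for y, p in zip(y_true, proba) if p < threshold and y == 1)
    return fp * fp_cost + fn * fn_cost


def best_threshold(y_true, proba, fp_cost, fn_cost, steps=99):
    """Return the lowest-cost threshold on an evenly spaced grid (0.01 .. 0.99)."""
    candidates = [(i + 1) / (steps + 1) for i in range(steps)]
    # min() keeps the first (smallest) threshold when several tie on cost.
    return min(
        candidates,
        key=lambda t: total_cost(y_true, proba, t, fp_cost, fn_cost),
    )


# Toy labels and predicted positive-class probabilities (illustrative only).
y_true = [0, 0, 1, 1, 1, 0]
proba = [0.10, 0.40, 0.35, 0.80, 0.65, 0.55]

# A missed positive is assumed to be five times as costly as a false alarm.
threshold = best_threshold(y_true, proba, fp_cost=1, fn_cost=5)
cost = total_cost(y_true, proba, threshold, fp_cost=1, fn_cost=5)
print(threshold, cost)
```

With false negatives priced higher than false positives, the chosen threshold moves below 0.5 so that fewer positives are missed.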
🧠 Under the hood
Features
- Word TF-IDF (1–3 n-grams)
- Optional char TF-IDF (3–6 n-grams)
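The feature setup above could be approximated with scikit-learn as in the sketch below: one word-level and one character-level `TfidfVectorizer`, stacked side by side. The n-gram ranges come from the list above; everything else (the `analyzer="char"` choice, default vectorizer settings, and the sample documents) is an assumption, not the app's confirmed configuration.

```python
from scipy.sparse import hstack
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical sample documents for illustration only.
docs = [
    "great movie with a strong cast",
    "terrible plot and weak acting",
    "loved every minute of this film",
    "boring and far too long",
]

# Word-level TF-IDF over 1- to 3-grams.
word_vec = TfidfVectorizer(analyzer="word", ngram_range=(1, 3))
# Optional character-level TF-IDF over 3- to 6-grams.
char_vec = TfidfVectorizer(analyzer="char", ngram_range=(3, 6))

X_word = word_vec.fit_transform(docs)
X_char = char_vec.fit_transform(docs)

# Concatenate both sparse blocks column-wise into one feature matrix.
X = hstack([X_word, X_char]).tocsr()
print(X_word.shape, X_char.shape, X.shape)
```

At prediction time the same fitted vectorizers would be applied with `transform` rather than `fit_transform`, so that new texts land in the same feature space.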
Models
- Logistic Regression (balanced)
- Random Forest
- Gradient Boosting
- Multinomial Naive Bayes
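A minimal scikit-learn sketch of this model line-up follows. It assumes "balanced" refers to `class_weight="balanced"` on the logistic regression, and all hyperparameters shown (`max_iter`, `n_estimators`, the seed) are illustrative guesses rather than the app's settings. The toy data is hypothetical, and fitting and scoring on the same rows is only a smoke test, not a validation scheme.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB


def make_models(seed=42):
    """Return the four classical classifiers keyed by display name."""
    return {
        "Logistic Regression": LogisticRegression(
            class_weight="balanced", max_iter=1000
        ),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "Gradient Boosting": GradientBoostingClassifier(random_state=seed),
        "Multinomial Naive Bayes": MultinomialNB(),
    }


# Hypothetical toy data: 1 = positive, 0 = negative.
texts = [
    "great movie", "loved the cast", "wonderful story", "really enjoyable film",
    "terrible plot", "hated the ending", "boring and slow", "awful acting",
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

X = TfidfVectorizer().fit_transform(texts)

probas = {}
for name, model in make_models().items():
    model.fit(X, y)
    # Column 1 holds the probability of the positive class (label 1).
    probas[name] = model.predict_proba(X)[:, 1]

for name, p in probas.items():
    print(name, p.round(2))
```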
Artifacts
- Saved under models_sentiment_lab/: vectorizers.joblib, models.joblib, results.joblib, metadata.joblib
- Reused by Threshold, Compare, Error Analysis, and Deploy tabs
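Saving and reloading these artifacts could look roughly like the sketch below, which uses `joblib.dump` and `joblib.load` with the file names listed above. The `save_artifacts` and `load_artifacts` helpers and the placeholder contents are hypothetical; the actual structure of each saved object is not documented here. A temporary directory is used so the example leaves no files behind.

```python
import tempfile
from pathlib import Path

import joblib

# File stems taken from the artifact list above.
ARTIFACT_NAMES = ("vectorizers", "models", "results", "metadata")


def save_artifacts(directory, **artifacts):
    """Write each named artifact to <directory>/<name>.joblib."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    for name in ARTIFACT_NAMES:
        joblib.dump(artifacts[name], directory / f"{name}.joblib")


def load_artifacts(directory):
    """Read all artifacts back into a dict keyed by name."""
    directory = Path(directory)
    return {
        name: joblib.load(directory / f"{name}.joblib") for name in ARTIFACT_NAMES
    }


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "models_sentiment_lab"
    # Placeholder contents: the real objects would be fitted vectorizers, etc.
    save_artifacts(
        target,
        vectorizers={"word": None, "char": None},
        models={},
        results={},
        metadata={"text_col": "review", "label_col": "sentiment"},
    )
    saved_files = sorted(p.name for p in target.iterdir())
    restored = load_artifacts(target)

print(saved_files)
print(restored["metadata"])
```

Loading from disk in this way is what would let later tabs reuse a training run without refitting.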
🖥 Run locally
git clone https://github.com/tarekmasryo/advanced-ml-sentiment-lab.git
cd advanced-ml-sentiment-lab
python -m venv .venv
# Windows: .venv\Scripts\activate
source .venv/bin/activate
pip install -r requirements.txt
streamlit run app.py
📄 License & credit
Code: Apache 2.0
Space & dashboard by Tarek Masryo 🚀



