Supply Chain Delay Prediction (DataCo Dataset)
This repository contains the trained classification model, feature engineering pipelines, exploratory notebooks, and database integration schemas for predicting supply chain shipment delays.
Model Performance
Both models were trained on the DataCo Supply Chain dataset, with post-shipment features excluded to avoid data leakage:
- XGBoost Classifier (Best Model):
  - Accuracy: 73.43%
  - ROC-AUC: 83.62%
  - Late Delivery Class Precision: 86% (high precision ensures low false-alarm rates for operations)
- Random Forest Classifier:
  - Accuracy: 72.23%
  - ROC-AUC: 83.68%
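To make the late-delivery precision figure concrete, the sketch below shows how precision relates to false alarms. The counts are illustrative only and are not taken from the DataCo evaluation; the function name `precision` is a hypothetical helper, not part of this repository.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Share of shipments flagged as late that really were late."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no shipments were flagged as late")
    return true_positives / flagged


# Illustrative counts only (NOT from the DataCo evaluation):
# of 100 shipments flagged as late, 86 were late and 14 were on time.
late_precision = precision(true_positives=86, false_positives=14)
false_alarm_share = 1 - late_precision
print(f"precision={late_precision:.2f}, false alarms among flags={false_alarm_share:.2f}")
```

Under these assumed counts, roughly one in seven late-delivery alerts would be a false alarm.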
Top Feature Importances (XGBoost)
- days_for_shipping_scheduled (53.4%) - The scheduled shipping window constraint.
- shipping_mode (23.5%) - Shipping level (First Class, Second Class, Same Day, Standard).
- order_hour (6.0%) - Time of day the order was placed.
- type (3.7%) - Transaction/payment type (Debit, Transfer, etc.).
Repository Structure
├── Models/
│   ├── best_xgb_model.json (XGBoost model structure)
│   ├── best_xgb_model.pkl (XGBoost model pickle)
│   ├── label_encoders.pkl (Categorical feature encoders)
│   └── scaler.pkl (Feature scaling parameters)
├── Scripts/
│   ├── data_cleaning.py (Preprocessing logic)
│   ├── database_loader.py (MySQL database bulk uploader)
│   └── model_training.py (Machine learning pipeline)
├── Notebooks/
│   ├── 1_Data_Cleaning_EDA.ipynb (EDA & SQL queries)
│   └── 2_Delay_Prediction_ML.ipynb (Machine learning experiments)
├── SQL/
│   ├── schema.sql (MySQL 3NF schema definitions)
│   └── analysis_queries.sql (Analytical queries)
├── requirements.txt (Python dependencies)
└── PowerBI_Design_Guide.md (Power BI visualization layout design blueprint)
How to Use
1. Requirements
Ensure you have the required libraries installed:
pip install -r requirements.txt
2. Loading the Model and Predicting in Python
import pickle
import pandas as pd

# Load encoders, scaler, and model
with open("Models/label_encoders.pkl", "rb") as f:
    encoders = pickle.load(f)
with open("Models/scaler.pkl", "rb") as f:
    scaler = pickle.load(f)
with open("Models/best_xgb_model.pkl", "rb") as f:
    model = pickle.load(f)

# Example: Make a prediction (ensure feature engineering matches model_training.py)
# predictions = model.predict(X_scaled)
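The step that produces `X_scaled` is not shown above. One way it might look is sketched below, under several assumptions that should be checked against `model_training.py`: that `label_encoders.pkl` holds a dict mapping column names to fitted encoders with a `transform` method, that the scaler was fitted on all features after encoding, and that the column order passed in matches the order used in training. The helper name `encode_and_scale` and the `feature_order` argument are hypothetical and not part of this repository.

```python
def encode_and_scale(rows, encoders, scaler, feature_order):
    """Turn raw order records into the scaled matrix expected by the model.

    rows          -- list of dicts, one per order, keyed by column name
    encoders      -- ASSUMED: dict of column name -> fitted encoder (.transform)
    scaler        -- fitted scaler exposing .transform
    feature_order -- ASSUMED: column names in the same order as in training
    """
    matrix = []
    for row in rows:
        values = []
        for column in feature_order:
            value = row[column]
            if column in encoders:
                # Encoders take a sequence, so wrap and unwrap the single value.
                value = encoders[column].transform([value])[0]
            values.append(float(value))
        matrix.append(values)
    return scaler.transform(matrix)


# Hypothetical usage with the objects loaded above (feature list is illustrative):
# X_scaled = encode_and_scale(
#     [{"days_for_shipping_scheduled": 2, "shipping_mode": "Second Class"}],
#     encoders, scaler,
#     feature_order=["days_for_shipping_scheduled", "shipping_mode"],
# )
# predictions = model.predict(X_scaled)
```

A category that the encoder never saw during training will typically raise an error at the `transform` call, so unseen values need handling before this step.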