MaataRaksha: Maternal Health Risk Predictor for Rural India
AI-powered maternal health risk predictor designed for ASHA workers in rural India: no doctor, no EHR, no internet required.
Live Demo
Try MaataRaksha Live
Table of Contents
- The Problem
- What Makes This Unique
- Risk Output
- Dataset
- Features Used
- Model Performance
- SHAP Explainability
- Tech Stack
- How to Run Locally
- Ethical Considerations
- Future Work
- Author
The Problem
India accounts for 12% of global maternal deaths. Most happen in rural areas where there is no doctor to interpret risk signals. ASHA workers collect data on paper, and it goes nowhere.
Why this is unsolved:
- Existing hospital ML tools require EHR infrastructure that rural PHCs do not have
- No offline-capable, low-literacy-friendly ML tool exists for ASHA workers
- ASHA workers are trained observers but have no decision support
MaataRaksha addresses this with a simple form that ASHA workers can fill in the field; the model then gives an immediate risk level with a recommended action in Hindi and English.
What Makes This Unique
| Feature | Existing Tools | MaataRaksha |
|---|---|---|
| Requires EHR | Yes | No (simple form) |
| Needs doctor | Yes | No (ASHA can use it) |
| Internet required | Yes | No (works offline) |
| Language | English only | Hindi + English |
| Explainability | None | SHAP per patient |
| Target user | Hospital doctors | 1 million ASHA workers |
| Input type | Lab reports | Observable signs + basic vitals |
Risk Output
| Risk Level | What ASHA Sees | Action |
|---|---|---|
| Low Risk | Green card | Continue check-ins every 4 weeks |
| Mid Risk | Orange card | Visit ANM within 1 week. Monitor BP and Hb |
| High Risk | Flashing red card | Refer to PHC TODAY. Alert supervisor immediately |
Each prediction also shows:
- Critical value alerts (systolic BP ≥140 mmHg, Hb <7 g/dL, weight gain >3 kg/month)
- Distance-based urgency message
- Probability breakdown across all 3 risk levels
- Key risk signal chips (BP, Hb, Oedema, Complications)
- Hindi translation of action
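The alert rules listed above can be sketched as simple threshold checks. This is a minimal illustration, not the app's actual code: the `Visit` class, the function names, and the 20 km distance cut-off are assumptions made for the example; only the three clinical thresholds come from this README.

```python
from dataclasses import dataclass

# Thresholds from the alert list above; units follow the feature tables.
SYSTOLIC_BP_ALERT = 140.0   # mmHg
HAEMOGLOBIN_ALERT = 7.0     # g/dL
WEIGHT_GAIN_ALERT = 3.0     # kg per month
FAR_DISTANCE_KM = 20.0      # hypothetical cut-off, not stated in this README


@dataclass
class Visit:
    """Hypothetical container for the values an ASHA worker records."""
    systolic_bp: float
    haemoglobin: float
    weight_gain: float
    distance_km: float


def critical_alerts(visit):
    """Return human-readable alerts for every value past its threshold."""
    alerts = []
    if visit.systolic_bp >= SYSTOLIC_BP_ALERT:
        alerts.append("Systolic BP at or above 140 mmHg")
    if visit.haemoglobin < HAEMOGLOBIN_ALERT:
        alerts.append("Haemoglobin below 7 g/dL")
    if visit.weight_gain > WEIGHT_GAIN_ALERT:
        alerts.append("Weight gain above 3 kg/month")
    return alerts


def urgency_message(visit, risk_level):
    """Return a distance-based note for far-away high-risk patients, else ''."""
    if risk_level == "high" and visit.distance_km >= FAR_DISTANCE_KM:
        return f"Hospital is {visit.distance_km:.0f} km away: arrange transport now"
    return ""


patient = Visit(systolic_bp=145, haemoglobin=6.5, weight_gain=4.0, distance_km=35)
print(critical_alerts(patient))
print(urgency_message(patient, "high"))
```

Keeping these rules separate from the model means a critical value can be flagged even when the predicted class is not high risk.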
Dataset
- Source: UCI Maternal Health Risk Dataset
- Size: 1,014 records
- Labels: low risk / mid risk / high risk
- Collection: IoT devices in rural clinics, Bangladesh
- Missing values: Zero
Distribution
| Risk Level | Count | Percentage |
|---|---|---|
| Low Risk | 406 | 40% |
| Mid Risk | 336 | 33% |
| High Risk | 272 | 27% |
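As a quick check, the percentages follow directly from the counts in the table. The short sketch below recomputes them and also derives the accuracy of always predicting the largest class, which is a useful reference point when reading the model scores. The variable names are illustrative.

```python
# Class counts copied from the distribution table.
counts = {"low risk": 406, "mid risk": 336, "high risk": 272}

total = sum(counts.values())
shares = {label: 100 * n / total for label, n in counts.items()}

# Accuracy of a trivial model that always predicts the most common class.
majority_baseline = max(counts.values()) / total

for label, share in shares.items():
    print(f"{label}: {share:.1f}%")
print(f"majority-class baseline accuracy: {majority_baseline:.1%}")
```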
Features Used
Original Dataset Features (6)
| Feature | Clinical Significance |
|---|---|
| Age | Teen mothers (<19) and older mothers (>35) are high risk |
| Systolic BP | ≥140 mmHg = pre-eclampsia threshold |
| Diastolic BP | ≥90 mmHg = hypertension in pregnancy |
| Blood Sugar | Gestational diabetes indicator |
| Body Temperature | Infection and fever detection |
| Heart Rate | Cardiovascular stress indicator |
Engineered ASHA Features (7)
| Feature | Clinical Significance | How Added |
|---|---|---|
| Haemoglobin (g/dL) | <7 = severe anaemia, which can be fatal | Domain-guided synthetic |
| Gestational Age (weeks) | Context for all other readings | Domain-guided synthetic |
| Parity | 4+ pregnancies = high risk | Domain-guided synthetic |
| Oedema | Key pre-eclampsia signal | Domain-guided synthetic |
| Previous Complications | C-section, miscarriage, stillbirth | Domain-guided synthetic |
| Distance to Hospital (km) | Changes urgency of referral | Domain-guided synthetic |
| Weight Gain (kg/month) | >3 kg = pre-eclampsia signal | Domain-guided synthetic |
The domain-guided synthetic feature engineering is itself a technical contribution: features are generated using clinical probability distributions validated against medical literature.
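One way such a feature could be generated is to draw it from a distribution whose parameters depend on the risk label. The sketch below does this for haemoglobin using only the standard library. It is an assumption about the general approach, not the notebook's code: the means, standard deviations, clipping range, and the `RiskLevel` column name are placeholders chosen for illustration.

```python
import random

# Hypothetical (mean, standard deviation) of haemoglobin in g/dL per risk label.
# Placeholder numbers for illustration only, not the notebook's parameters.
HB_PARAMS = {
    "low risk": (11.5, 1.0),
    "mid risk": (10.0, 1.2),
    "high risk": (8.5, 1.5),
}
HB_MIN, HB_MAX = 4.0, 16.0  # assumed plausible range used for clipping


def synth_haemoglobin(risk_label, rng):
    """Draw one synthetic haemoglobin value for a record with the given label."""
    mean, sd = HB_PARAMS[risk_label]
    value = rng.gauss(mean, sd)
    return round(min(max(value, HB_MIN), HB_MAX), 1)


rng = random.Random(42)  # fixed seed so the augmentation is reproducible
labels = ["low risk", "mid risk", "high risk"] * 200
rows = [{"RiskLevel": lab, "Haemoglobin": synth_haemoglobin(lab, rng)} for lab in labels]


def mean_hb(label):
    """Average synthetic haemoglobin over all rows with the given label."""
    values = [r["Haemoglobin"] for r in rows if r["RiskLevel"] == label]
    return sum(values) / len(values)


print(round(mean_hb("low risk"), 2), round(mean_hb("high risk"), 2))
```

A feature drawn conditional on the label carries label information by construction, so scores obtained with such features should be read with that in mind.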
Model Performance
Accuracy Comparison
| Model | Accuracy |
|---|---|
| Logistic Regression | ~78% |
| KNN | ~75% |
| SVM | ~80% |
| Random Forest | ~84% |
| Gradient Boosting | ~85% |
| XGBoost | ~86% |
| Ensemble (final) | ~87% |
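The notebook summary lists the final model as a soft voting ensemble. Soft voting averages the class probabilities predicted by each member model and picks the class with the highest average. The sketch below shows the mechanics in plain Python; the three probability vectors are made up, and which models are combined, and with what weights, is not stated in this README. In scikit-learn the same idea is available as `VotingClassifier(voting="soft")`.

```python
CLASSES = ["low risk", "mid risk", "high risk"]


def soft_vote(probabilities, weights=None):
    """Average per-model class probabilities and return (winning index, averages)."""
    n_models = len(probabilities)
    weights = weights or [1.0] * n_models
    total_weight = sum(weights)
    n_classes = len(probabilities[0])
    averaged = [
        sum(w * p[c] for w, p in zip(weights, probabilities)) / total_weight
        for c in range(n_classes)
    ]
    winner = max(range(n_classes), key=averaged.__getitem__)
    return winner, averaged


# Made-up outputs from three hypothetical models for a single patient.
model_outputs = [
    [0.10, 0.30, 0.60],
    [0.05, 0.50, 0.45],
    [0.00, 0.20, 0.80],
]

winner, averaged = soft_vote(model_outputs)
print(CLASSES[winner], [round(p, 3) for p in averaged])
```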
Why 87% is Strong for This Problem
- 3-class medical prediction is inherently harder than binary
- The dataset has only 1,014 samples, which limits what any model can learn
- The high-risk class has the highest F1 score, and it is the most important class clinically
- In medical AI, sensitivity (catching true high risk) matters more than overall accuracy
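To make the distinction between accuracy and sensitivity concrete, the sketch below computes per-class precision, recall (sensitivity), and F1 from a 3-class confusion matrix. The matrix is invented for the example and does not report this model's results.

```python
LABELS = ["low", "mid", "high"]

# Illustrative confusion matrix: rows are true classes, columns are predictions.
# These counts are invented for the example, not measured results.
CONFUSION = [
    [70, 10, 1],
    [9, 55, 3],
    [1, 3, 51],
]


def per_class_metrics(cm, labels):
    """Compute precision, recall and F1 for each class of a square confusion matrix."""
    metrics = {}
    for i, label in enumerate(labels):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                    # true class i, predicted elsewhere
        fp = sum(row[i] for row in cm) - tp     # predicted i, true class elsewhere
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


accuracy = sum(CONFUSION[i][i] for i in range(len(LABELS))) / sum(map(sum, CONFUSION))
metrics = per_class_metrics(CONFUSION, LABELS)

print(f"accuracy: {accuracy:.3f}")
print(f"high-risk recall (sensitivity): {metrics['high']['recall']:.3f}")
```

High-risk recall is the fraction of truly high-risk patients the model flags, which is the number that matters most for referral decisions.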
Top Features by SHAP Importance
1. Blood Sugar (BS): 0.1117
2. Weight Gain: 0.0746
3. Systolic BP: 0.0720
4. Distance to Hospital: 0.0634
5. Haemoglobin: 0.0584
6. Previous Complications: 0.0288
7. Oedema: 0.0284
Sample High Risk Patient β Model Explanation
Age: 40 | BP: 120/95 | Hb: 7.0 | Parity: 3
Oedema: Yes | Prev Complications: Yes | Distance: 40 km
Predicted: HIGH RISK (93.5% confidence)
Low Risk: 0.0%
Mid Risk: 6.5%
High Risk: 93.5%
SHAP Explainability
MaataRaksha uses SHAP to explain every prediction:
- Per-class importance showing which features matter for each risk level
- High-risk specific chart for ASHA understanding
- Per-patient explanation β which values pushed the risk up
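Global SHAP importance is commonly summarised as the mean absolute SHAP value per feature, while a per-patient explanation looks at the signed values for one row. The sketch below shows both computations on a small made-up matrix of SHAP values for the high-risk class. It assumes the values have already been produced by an explainer; the feature names and numbers are illustrative only.

```python
FEATURES = ["BS", "SystolicBP", "Haemoglobin"]

# Made-up SHAP values for the high-risk class: one row per patient.
SHAP_ROWS = [
    [0.20, 0.05, -0.02],
    [-0.10, 0.08, 0.06],
    [0.03, -0.05, 0.04],
]


def mean_abs_importance(shap_rows, feature_names):
    """Rank features by mean absolute SHAP value across all patients."""
    n = len(shap_rows)
    scores = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def top_risk_drivers(shap_row, feature_names, k=3):
    """For one patient, list the features that pushed the risk up the most."""
    positive = [(name, v) for name, v in zip(feature_names, shap_row) if v > 0]
    return sorted(positive, key=lambda kv: kv[1], reverse=True)[:k]


ranking = mean_abs_importance(SHAP_ROWS, FEATURES)
drivers = top_risk_drivers(SHAP_ROWS[1], FEATURES)

print([name for name, _ in ranking])
print(drivers)
```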
Tech Stack
| Category | Tools |
|---|---|
| Language | Python 3.10 |
| ML Models | Scikit-learn, XGBoost |
| Explainability | SHAP |
| Data | Pandas, NumPy |
| Visualization | Matplotlib, Seaborn |
| Web App | Streamlit |
| Dataset | UCI ML Repository (ucimlrepo) |
| Deployment | Streamlit Cloud |
How to Run Locally
Step 1: Clone the repo
git clone https://github.com/Maitry09/maataraksha.git
cd maataraksha
Step 2: Install dependencies
pip install -r requirements.txt
Step 3: Run the app
streamlit run app.py
App opens at http://localhost:8501
Test with high risk patient
Age: 17 (teen mother)
Systolic BP: 145 mmHg (pre-eclampsia threshold)
Haemoglobin: 6.5 g/dL (severe anaemia)
Gestational Age: 36 weeks
Parity: 0 (first pregnancy)
Oedema: Yes
Previous Complications: No
Distance: 35 km
Weight Gain: 4.0 kg this month
Expected output: HIGH RISK
Ethical Considerations
- This tool is for ASHA worker decision support only
- It is not a substitute for medical diagnosis
- Final decisions must always involve a qualified medical professional
- High risk prediction: REFER immediately, do not treat
- Model trained on Bangladesh rural clinic data; may need calibration for specific Indian regional populations
- Engineered features use synthetic augmentation; real-world deployment should use actual measured values
Future Work
- Collect real ASHA worker field data from Indian PHCs
- Add regional language support: Gujarati, Marathi, Telugu
- Build offline Android app for field use without internet
- Integrate with HMIS (Health Management Information System)
- Add voice input for low-literacy ASHA workers
- Alert system: auto SMS to ANM when high risk detected
- Longitudinal tracking: monitor same patient across visits
- Federated learning: train across PHCs without sharing data
Notebook Summary
maternal_health_risk.ipynb
- UCI dataset loading via ucimlrepo
- EDA and distribution analysis
- Domain-guided feature engineering (7 new features)
- Clinical validation of engineered features
- Training and comparison of 5 baseline models
- XGBoost fine-tuning
- Soft voting ensemble
- SHAP TreeExplainer analysis
- Per-patient risk explanation
Outputs: maternal_model.pkl, maternal_scaler.pkl,
feature_cols.pkl, risk_mapping.json, shap_*.png
Author
Maitry
- GitHub: @Maitry09
- Live App: maataraksha.streamlit.app
License
MIT License: free to use with attribution.
Star this repo if you believe AI can save lives in rural India!
Built with the hope that no mother in rural India loses her life because a risk signal went unnoticed.

