GVHD Severity Prediction - Final Results
Best Model: Stacking Ensemble (CatBoost + XGBoost + Neural Net)
| Metric | Value |
|---|---|
| AUC | 0.7083 ± 0.0117 |
| Baseline AUC | 0.7034 |
| Improvement | +0.0049 (+0.7%) |
| Brier (Platt Calibrated) | 0.2019 |
| Optimal Threshold | 0.544 |
| Sensitivity | 68.4% |
| Specificity | 58.8% |
| PPV | 76.9% |
| NPV | 48.3% |
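The threshold-dependent rows above all follow from a confusion matrix at the chosen cutoff. Below is a minimal sketch of that computation, assuming binary labels (1 = severe GVHD) and that a probability at or above the threshold counts as a positive prediction. The function name `threshold_metrics` and the toy data are hypothetical and are not taken from the pipeline.

```python
def threshold_metrics(y_true, y_prob, threshold=0.544):
    """Confusion-matrix metrics at a fixed probability cutoff."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0  # assumption: >= threshold means positive
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num, den):
        # Guard against empty denominators (e.g. no predicted positives).
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy data for illustration only (not study data).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_prob = [0.81, 0.62, 0.40, 0.35, 0.60, 0.70, 0.20, 0.50]
metrics = threshold_metrics(y_true, y_prob)
print(metrics)
```

PPV and NPV depend on class prevalence, so unlike sensitivity and specificity they are specific to the cohort they were measured on.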
Model Comparison
| Model | AUC Mean | AUC Std | Fold1 | Fold2 | Fold3 | Fold4 | Fold5 |
|---|---|---|---|---|---|---|---|
| CatBoost | 0.6963 | ±0.0105 | 0.700 | 0.711 | 0.699 | 0.679 | 0.694 |
| XGBoost | 0.6986 | ±0.0126 | 0.705 | 0.711 | 0.704 | 0.675 | 0.698 |
| NeuralNet | 0.6870 | ±0.0088 | 0.692 | 0.699 | 0.698 | 0.677 | 0.681 |
| Stacking | 0.7083 | ±0.0117 | 0.714 | 0.722 | 0.714 | 0.688 | 0.703 |
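The Mean and Std columns summarize the five fold AUCs. As a worked check on the Stacking row, the sketch below uses only the rounded fold values from the table. The card does not state whether the population or the sample standard deviation was used, so both are computed here.

```python
from statistics import mean, pstdev, stdev

# Rounded per-fold AUCs copied from the Stacking row of the table.
stacking_folds = [0.714, 0.722, 0.714, 0.688, 0.703]

auc_mean = mean(stacking_folds)
auc_pstd = pstdev(stacking_folds)  # population form (divide by n)
auc_sstd = stdev(stacking_folds)   # sample form (divide by n - 1)

print(f"mean={auc_mean:.4f}  pstd={auc_pstd:.4f}  sstd={auc_sstd:.4f}")
```

With these rounded inputs the population form lands closer to the reported ±0.0117. Small differences from the table are expected because the fold values are rounded to three decimals.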
Calibration
| Method | Brier Score |
|---|---|
| Raw | 0.2150 |
| Platt Scaling | 0.2019 |
| Isotonic Regression | 0.2024 |
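To make the Brier comparison concrete, here is a minimal pure-Python sketch of the Brier score and of Platt scaling. It assumes the calibrator is a two-parameter sigmoid fitted on the logit of the raw probability by gradient descent on log-loss. The pipeline's actual implementation is not shown in this card, and the helper names (`brier_score`, `fit_platt`, `apply_platt`) and toy data are hypothetical.

```python
import math


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def logit(p, eps=1e-12):
    p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
    return math.log(p / (1 - p))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_platt(y_true, y_prob, lr=0.1, n_iter=2000):
    """Fit p_cal = sigmoid(a * logit(p_raw) + b) by gradient descent on log-loss."""
    scores = [logit(p) for p in y_prob]
    a, b = 1.0, 0.0
    n = len(y_true)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, y_true):
            err = sigmoid(a * s + b) - y
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def apply_platt(y_prob, a, b):
    return [sigmoid(a * logit(p) + b) for p in y_prob]


# Toy, deliberately overconfident scores (illustration only, not study data).
y_true = [1] * 7 + [0] * 3 + [1] * 3 + [0] * 7
y_raw = [0.95] * 10 + [0.05] * 10

a, b = fit_platt(y_true, y_raw)
y_cal = apply_platt(y_raw, a, b)
raw_brier = brier_score(y_true, y_raw)
cal_brier = brier_score(y_true, y_cal)
print(f"raw={raw_brier:.4f}  platt={cal_brier:.4f}")
```

In practice the calibrator should be fitted on held-out predictions (for example out-of-fold predictions) rather than on the data used to train the model. scikit-learn's `CalibratedClassifierCV` with `method="sigmoid"` provides a library version of the same idea.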
Key Improvements
- Feature Engineering: interactions, polynomials, log transforms, missingness indicators
- GPU Acceleration: Tesla T4 for CatBoost and Neural Net
- Stacking Ensemble: Logistic Regression meta-learner on OOF predictions
- Probability Calibration: Platt scaling for clinical deployment
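The stacking step in the list above can be sketched with scikit-learn. This is an illustration under stated assumptions, not the pipeline's code: the data is synthetic, and `GradientBoostingClassifier` and `RandomForestClassifier` stand in for the CatBoost, XGBoost, and neural-net base learners so that the example has no extra dependencies.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic stand-in data (assumption; not the GVHD cohort).
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=5, random_state=0
)

# Stand-in base learners (assumption; see the note above).
base_models = [
    GradientBoostingClassifier(random_state=0),
    RandomForestClassifier(n_estimators=100, random_state=0),
]

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Out-of-fold (OOF) probabilities: each row is predicted by a model
# that did not see that row during training.
oof = np.column_stack([
    cross_val_predict(m, X, y, cv=cv, method="predict_proba")[:, 1]
    for m in base_models
])

# Logistic-regression meta-learner on the OOF columns, itself scored
# out-of-fold so the reported AUC is not computed on its training rows.
meta = LogisticRegression()
stack_prob = cross_val_predict(meta, oof, y, cv=cv, method="predict_proba")[:, 1]
stack_auc = roc_auc_score(y, stack_prob)
print(f"stacked OOF AUC: {stack_auc:.3f}")
```

Training the meta-learner on OOF predictions rather than in-sample predictions keeps it from simply trusting whichever base model overfits the most.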
Files
- gvhd_gpu_pipeline.py - Complete pipeline code (every line commented)
- result_comparison_final.csv - Model comparison table
- GVHD_Final_Report.ipynb - Jupyter notebook with tables
- calibration_plot.png - Calibration curve
Honest Ceiling
Models that use only pre-transplant features plateau at AUC ≈ 0.71. Any claimed AUC above 0.75 would require post-transplant biomarkers (Day 7-14).
Generated by ML Intern
This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.
- Try ML Intern: https://smolagents-ml-intern.hf.space
- Source code: https://github.com/huggingface/ml-intern
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cuimiandashi/gvhd-analysis"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```
For non-causal architectures, replace AutoModelForCausalLM with the appropriate AutoModel class.