MIMIC Sepsis CQL
This repository contains the final selected Conservative Q-Learning (CQL) checkpoint from an academic offline reinforcement learning study on a MIMIC-IV v3.1 Sepsis-3 ICU cohort. The model card cites the canonical MIMIC-IV, PhysioNet, Sepsis-3, healthcare RL, CQL, and off-policy evaluation references listed below.
Clinical safety warning: This model is a retrospective research artifact. It is not a clinical decision support system and must not be used for patient care.
Repository Contents
| File | Description |
|---|---|
| config.json | Root Hub query/config file for model metadata and download statistics |
| model/cql_epoch0200_step0007000.pt | Final selected CQL checkpoint |
| configs/cql.yaml | Training configuration |
| configs/runtime.mps.yaml | Apple Silicon/MPS runtime configuration |
| evaluation/stage1_evaluation.json | Stage 1 validation screening results |
| evaluation/stage2_evaluation.json | Stage 2 validation evaluation results |
| evaluation/stage2_multiseed_validation.json | Multi-seed validation summary |
| evaluation/final_test_evaluation.json | Final held-out test evaluation |
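To confirm that a local copy of the repository is complete, the listed paths can be checked directly. The following is a minimal sketch: `EXPECTED_FILES` mirrors the listing above, while `missing_files` and its `repo_dir` argument are hypothetical names introduced for illustration and are not part of the project code.

```python
from pathlib import Path

# Paths copied from the repository listing above.
EXPECTED_FILES = [
    "config.json",
    "model/cql_epoch0200_step0007000.pt",
    "configs/cql.yaml",
    "configs/runtime.mps.yaml",
    "evaluation/stage1_evaluation.json",
    "evaluation/stage2_evaluation.json",
    "evaluation/stage2_multiseed_validation.json",
    "evaluation/final_test_evaluation.json",
]


def missing_files(repo_dir):
    """Return the expected relative paths that are absent under repo_dir."""
    root = Path(repo_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # Assumes the script is run from the root of a local copy of the repository.
    absent = missing_files(".")
    print("All files present" if not absent else f"Missing: {absent}")
```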
Intended Use
The checkpoint is provided for reproducibility and academic inspection of the reported offline RL experiment. It is intended for:
- reproducing reported CQL evaluation artifacts,
- inspecting model weights and training configuration,
- comparing validation-only selection and final held-out OPE reporting workflows.
It is not intended for clinical deployment, direct treatment recommendation, or real-time medical decision support.
Dataset
The study uses MIMIC-IV v3.1 data accessed through PhysioNet. Raw MIMIC-IV records and derived replay buffers are not included here because access is credentialed and subject to PhysioNet data use requirements.
Model Details
| Field | Value |
|---|---|
| Algorithm | Conservative Q-Learning |
| State dimension | 62 |
| Action space | 25 discrete actions |
| Decision interval | 4 hours |
| Reward variant | sparse |
| Learning rate | 1e-4 |
| CQL alpha | 0.05 |
| Seed | 1024 |
| Epoch | 200 |
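For readers unfamiliar with the role of the CQL alpha value, the sketch below computes the conservative regularizer for a single transition in a discrete action space, following the CQL(H) form in Kumar et al.: alpha times the difference between the log-sum-exp of all Q-values and the Q-value of the logged action. In training this term is added to the usual temporal-difference loss. The function names are hypothetical, and the project's actual loss lives in the GitHub code and may differ in detail.

```python
import math

CQL_ALPHA = 0.05   # "CQL alpha" from the table above
NUM_ACTIONS = 25   # "Action space" from the table above


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def cql_penalty(q_values, logged_action, alpha=CQL_ALPHA):
    """Conservative term: alpha * (logsumexp_a Q(s, a) - Q(s, a_logged))."""
    if len(q_values) != NUM_ACTIONS:
        raise ValueError(f"expected {NUM_ACTIONS} Q-values, got {len(q_values)}")
    return alpha * (logsumexp(q_values) - q_values[logged_action])


if __name__ == "__main__":
    # Illustrative Q-values only; these are not outputs of the checkpoint.
    uniform_q = [0.0] * NUM_ACTIONS
    print(cql_penalty(uniform_q, logged_action=3))
```

The penalty shrinks as the Q-value of the logged action rises relative to the others, which is how the term discourages overestimating actions that the dataset does not support.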
Evaluation
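The final checkpoint was selected using validation results only. The held-out test split was used once after final selection. In the table, FQE is fitted Q evaluation, WIS is weighted importance sampling, and ESS is effective sample size.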
| Metric | Value |
|---|---|
| FQE mean | 15.689874 |
| FQE 95% CI | [15.616595, 15.755585] |
| WIS mean | 10.018438 |
| WIS 95% CI | [4.121083, 12.658275] |
| ESS | 10.408948 |
| Test episodes | 2585 |
| Bootstrap resamples | 1000 |
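The importance-sampling quantities in the table can be illustrated with a short sketch. It assumes one cumulative importance weight and one return per episode, uses the Kish definition of effective sample size, and builds a percentile bootstrap interval by resampling episodes. The model card does not state which estimator variants, weight clipping, or ESS definition the project used, so all function names and choices here are assumptions.

```python
import random


def wis_estimate(weights, returns):
    """Weighted importance sampling: sum(w_i * G_i) / sum(w_i)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("all importance weights are zero")
    return sum(w * g for w, g in zip(weights, returns)) / total


def effective_sample_size(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    squares = sum(w * w for w in weights)
    if squares == 0:
        raise ValueError("all importance weights are zero")
    return sum(weights) ** 2 / squares


def bootstrap_ci(weights, returns, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the WIS estimate, resampling episodes."""
    rng = random.Random(seed)
    n = len(weights)
    estimates = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        ws = [weights[i] for i in idx]
        if sum(ws) == 0:
            continue  # skip degenerate resamples that carry no weight mass
        estimates.append(wis_estimate(ws, [returns[i] for i in idx]))
    if not estimates:
        raise ValueError("no usable bootstrap resamples")
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * len(estimates))]
    upper = estimates[int((1 + level) / 2 * len(estimates)) - 1]
    return lower, upper
```

With equal weights the ESS equals the number of episodes; when a few episodes carry most of the weight, the ESS falls toward one and the bootstrap interval widens.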
Loading the Checkpoint
The checkpoint is a PyTorch .pt artifact generated by the project training code. Use the GitHub repository code and configuration files to reconstruct the model architecture before loading weights.
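```python
import torch

# Load on CPU so no GPU or MPS device is required.
checkpoint = torch.load("model/cql_epoch0200_step0007000.pt", map_location="cpu")
# The top-level layout depends on the training code, so inspect it before use.
print(checkpoint.keys() if isinstance(checkpoint, dict) else type(checkpoint))
```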
GitHub project: https://github.com/EnesDemir143/mimic-sepsis-drl
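Before reconstructing the architecture, it can help to list what the loaded object contains. The helper below is a hypothetical sketch that walks a nested dictionary and reports tensor shapes rather than values. It makes no assumption about the checkpoint's actual key names, which are defined by the project training code.

```python
def summarize(obj, prefix="", max_depth=3):
    """Return lines describing a nested checkpoint object (hypothetical helper)."""
    lines = []
    name = prefix[:-1] if prefix else "<root>"
    if isinstance(obj, dict) and max_depth > 0:
        # Recurse into nested dictionaries such as state dicts.
        for key, value in obj.items():
            lines.extend(summarize(value, f"{prefix}{key}.", max_depth - 1))
    elif hasattr(obj, "shape"):
        # Tensors and arrays expose .shape; report it instead of the values.
        lines.append(f"{name}: shape={tuple(obj.shape)}")
    else:
        lines.append(f"{name}: {type(obj).__name__}")
    return lines


# Usage, assuming `checkpoint` was loaded with torch.load:
# print("\n".join(summarize(checkpoint)))
```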
Download Statistics
Hugging Face counts model downloads from server-side GET/HEAD requests to query files such as config.json, config.yaml, hyperparams.yaml, or params.json, or to library-specific weight patterns. This repository includes a root config.json so that future requests have a standard query file to be counted against. Counts are not retroactive and may update with a delay.
Limitations
- Retrospective observational EHR data cannot establish prospective clinical benefit.
- Offline policy evaluation is sensitive to support mismatch and modeling assumptions.
- MIMIC-IV access restrictions prevent bundling the replay dataset with the model.
- This artifact has not been prospectively or clinically validated.
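The support-mismatch caveat can be made concrete with the reported numbers. The rough calculation below assumes that the reported ESS was computed over the 2585 test episodes, which the evaluation table suggests but does not state explicitly.

```python
ESS = 10.408948        # reported effective sample size
TEST_EPISODES = 2585   # reported number of test episodes

# Fraction of the test episodes that effectively contribute to the WIS estimate.
ess_fraction = ESS / TEST_EPISODES
print(f"ESS / episodes = {ess_fraction:.4%}")
```

Under that assumption the ratio is roughly 0.4%, so the WIS estimate rests on about ten episodes' worth of information, which is consistent with its confidence interval being much wider than the FQE interval.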
Author
Enes Demir — 230202066
Kocaeli University, Department of Computer Engineering
Citation
Please cite MIMIC-IV, PhysioNet, Conservative Q-Learning, and the project GitHub repository when using this artifact.
Dataset Citation
For MIMIC-IV v3.1, cite the PhysioNet dataset record as:
```bibtex
@article{PhysioNet-mimiciv-3.1,
  author = {Johnson, Alistair and Bulgarelli, Lucas and Pollard, Tom and Gow, Brian and Moody, Benjamin and Horng, Steven and Celi, Leo Anthony and Mark, Roger},
  title = {{MIMIC-IV}},
  journal = {{PhysioNet}},
  year = {2024},
  month = oct,
  note = {Version 3.1},
  doi = {10.13026/kpb9-mt58},
  url = {https://doi.org/10.13026/kpb9-mt58}
}
```
References
- Johnson et al., MIMIC-IV, a freely accessible electronic health record dataset, Scientific Data, 2023. DOI: 10.1038/s41597-022-01899-x.
- Johnson et al., MIMIC-IV (version 3.1), PhysioNet, 2024. DOI: 10.13026/kpb9-mt58.
- Goldberger et al., PhysioBank, PhysioToolkit, and PhysioNet, Circulation, 2000. DOI: 10.1161/01.CIR.101.23.e215.
- Singer et al., The Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3), JAMA, 2016. DOI: 10.1001/jama.2016.0287.
- Gottesman et al., Guidelines for reinforcement learning in healthcare, Nature Medicine, 2019. DOI: 10.1038/s41591-018-0310-5.
- Kumar et al., Conservative Q-Learning for Offline Reinforcement Learning, NeurIPS, 2020. DOI: 10.48550/arXiv.2006.04779.
- Levine et al., Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems, arXiv, 2020. DOI: 10.48550/arXiv.2005.01643.
- Thomas and Brunskill, Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning, ICML, 2016. DOI: 10.48550/arXiv.1604.00923.