spinopelvic-seg-checkpoints
5-fold nnU-Net v2 ResEnc-L ensemble for spine + pelvis CT segmentation,
trained with an LSTV-aware merged-label scheme that collapses L5/L6 into
a single last_lumbar class to handle lumbosacral transitional vertebrae.
Companion code, inference scripts, and evaluation pipeline: anonymous-mlhc/spinopelvic-seg
Training dataset: anonymous-mlhc/CTSpinoPelvic1K
Model summary
| Field | Value |
|---|---|
| Architecture | nnU-Net v2, ResEnc-L (ResEncUNet) 3D |
| Configuration | 3d_fullres |
| Planner | nnUNetResEncUNetPlans_100G (100 GB GPU memory target) |
| Trainer | nnUNetTrainerWandB_500ep_LSTVOversample (custom, in companion repo) |
| Folds | 5-fold cross-validation ensemble |
| Classes | 9 contiguous (labels 0-8): background, L1, L2, L3, L4, last_lumbar, sacrum, left_hip, right_hip, plus an ignore label (9) |
| Training epochs | 500 per fold |
| Training hardware | NVIDIA H200 / A100-80GB |
The last_lumbar class merges what would otherwise be separate L5 and L6
labels. This eliminates the L5↔L6 channel-swap failure mode that affects
fixed-class segmenters on sacralization-count cases (where the lumbar
spine has 4 mobile segments instead of 5).
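The merge described above can be sketched as a simple label remap. This is an illustration only: the source label IDs (L5 = 5, L6 = 6, sacrum = 7, left hip = 8, right hip = 9) are a hypothetical pre-merge annotation scheme assumed for the example, and `merge_labels` is not a function from the companion repo. The target IDs follow the label scheme of this model card.

```python
# Hypothetical source IDs (assumption): 1-4 = L1-L4, 5 = L5, 6 = L6,
# 7 = sacrum, 8 = left_hip, 9 = right_hip.
# Target IDs follow the merged scheme documented in this model card.
SOURCE_TO_MERGED = {
    0: 0, 1: 1, 2: 2, 3: 3, 4: 4,
    5: 5,  # L5 -> last_lumbar
    6: 5,  # L6 -> last_lumbar
    7: 6,  # sacrum
    8: 7,  # left_hip
    9: 8,  # right_hip
}


def merge_labels(voxels):
    """Map a flat sequence of source label IDs to the merged scheme."""
    return [SOURCE_TO_MERGED[v] for v in voxels]


print(merge_labels([0, 4, 5, 6, 7]))  # [0, 4, 5, 5, 6]
```

Because both L5 and L6 map to the same target channel, the network never has to decide which of the two channels a given vertebra belongs to.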
Files
Dataset803_SpineSurgCTFullMerged/
└── nnUNetTrainerWandB_500ep_LSTVOversample__nnUNetResEncUNetPlans_100G__3d_fullres/
    ├── plans.json
    ├── dataset.json
    ├── dataset_fingerprint.json
    ├── fold_0/
    │   ├── checkpoint_best.pth
    │   ├── checkpoint_final.pth
    │   ├── debug.json
    │   ├── progress.png
    │   └── training_log_*.txt
    ├── fold_1/ (same structure)
    ├── fold_2/ (same structure)
    ├── fold_3/ (same structure)
    └── fold_4/ (same structure)
checkpoint_best.pth is the recommended inference checkpoint for each fold
(selected by validation EMA Dice). checkpoint_final.pth is the last
training epoch.
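After downloading, it can be useful to confirm that every fold has its checkpoint before starting inference. The helper below is a minimal sketch using only the standard library; `missing_checkpoints` is a hypothetical name, not part of nnU-Net or the companion repo, and the example path assumes the local layout shown in the tree above.

```python
from pathlib import Path


def missing_checkpoints(model_dir, folds=range(5), name="checkpoint_best.pth"):
    """Return the expected fold checkpoint paths that are absent under model_dir."""
    model_dir = Path(model_dir)
    expected = [model_dir / f"fold_{k}" / name for k in folds]
    return [p for p in expected if not p.is_file()]


if __name__ == "__main__":
    # Assumed local layout; adjust to wherever the checkpoints were downloaded.
    model_dir = (
        Path("nnunet/results/Dataset803_SpineSurgCTFullMerged")
        / "nnUNetTrainerWandB_500ep_LSTVOversample__nnUNetResEncUNetPlans_100G__3d_fullres"
    )
    for path in missing_checkpoints(model_dir):
        print("missing:", path)
```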
Quickstart
1. Install dependencies
pip install nnunetv2 huggingface_hub
git clone https://github.com/anonymous-mlhc/spinopelvic-seg.git
cd spinopelvic-seg
pip install -r requirements.txt
2. Download checkpoints
hf download anonymous-mlhc/spinopelvic-seg-checkpoints \
--repo-type=model \
--local-dir nnunet/results/Dataset803_SpineSurgCTFullMerged
3. Set environment
export nnUNet_raw=$PWD/nnunet/raw
export nnUNet_preprocessed=$PWD/nnunet/preprocessed
export nnUNet_results=$PWD/nnunet/results
export PYTHONPATH=$PWD/tools:$PYTHONPATH # makes the custom trainer importable
4. Run inference
Input files must follow nnU-Net's channel-suffix convention
(CASE_0000.nii.gz):
nnUNetv2_predict \
-i /path/to/input_cts \
-o /path/to/predictions \
-d 803 \
-c 3d_fullres \
-p nnUNetResEncUNetPlans_100G \
-tr nnUNetTrainerWandB_500ep_LSTVOversample \
-f 0 1 2 3 4 \
-chk checkpoint_best.pth
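The channel-suffix convention means a single-channel CT named `CASE.nii.gz` has to be presented as `CASE_0000.nii.gz`. A minimal sketch of that renaming rule follows; `nnunet_input_name` is a hypothetical helper, not part of nnU-Net, and it only computes the target name rather than touching files.

```python
from pathlib import Path


def nnunet_input_name(filename, channel=0):
    """Return the channel-suffixed name, e.g. case7.nii.gz -> case7_0000.nii.gz."""
    name = Path(filename).name
    if not name.endswith(".nii.gz"):
        raise ValueError(f"expected a .nii.gz file, got {name!r}")
    stem = name[: -len(".nii.gz")]
    suffix = f"_{channel:04d}"
    if stem.endswith(suffix):
        return name  # already follows the convention
    return f"{stem}{suffix}.nii.gz"


print(nnunet_input_name("patient12.nii.gz"))       # patient12_0000.nii.gz
print(nnunet_input_name("patient12_0000.nii.gz"))  # patient12_0000.nii.gz
```

One caveat of this sketch: a case identifier that genuinely ends in `_0000` would be treated as already suffixed, so check identifiers before bulk renaming.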
For single-fold inference, replace -f 0 1 2 3 4 with -f 0 (roughly 1 Dice
point lower than the 5-fold ensemble, ~5× faster).
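Conceptually, a fold ensemble combines the per-class scores of each fold for a voxel and then picks the best class, which is why it costs one forward pass per fold. The sketch below illustrates that idea on a single voxel with made-up scores; it is not nnU-Net's implementation, and `ensemble_argmax` is a hypothetical name.

```python
def ensemble_argmax(fold_scores):
    """Average per-class scores over folds and return the winning class index.

    fold_scores[f][c] is the score of class c from fold f for one voxel.
    """
    n_folds = len(fold_scores)
    n_classes = len(fold_scores[0])
    mean = [sum(fold[c] for fold in fold_scores) / n_folds for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)


# Three hypothetical folds scoring one voxel over three candidate classes.
scores = [
    [0.2, 0.5, 0.3],
    [0.6, 0.3, 0.1],
    [0.1, 0.7, 0.2],
]
print(ensemble_argmax(scores))       # 1
print(ensemble_argmax([scores[1]]))  # 0 (this fold alone disagrees)
```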
Label scheme
| Label | Class |
|---|---|
| 0 | background |
| 1 | L1 |
| 2 | L2 |
| 3 | L3 |
| 4 | L4 |
| 5 | last_lumbar (L5 in normals; L5/L6 fused in lumbarization) |
| 6 | sacrum |
| 7 | left_hip |
| 8 | right_hip |
| 9 | ignore (excluded from loss and metrics) |
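The table above can be carried into analysis code as a plain mapping. The sketch below also shows one way to compute per-class Dice while excluding ignore voxels; it assumes the ignore mask is taken from the reference segmentation, which is an assumption for illustration and not necessarily how tools/eval_full.py handles it. `dice_per_class` is a hypothetical helper operating on flat label sequences.

```python
LABELS = {
    0: "background", 1: "L1", 2: "L2", 3: "L3", 4: "L4",
    5: "last_lumbar", 6: "sacrum", 7: "left_hip", 8: "right_hip",
}
IGNORE = 9


def dice_per_class(pred, ref):
    """Per-class Dice over flat label sequences, skipping voxels marked ignore in ref."""
    kept = [(p, r) for p, r in zip(pred, ref) if r != IGNORE]
    scores = {}
    for label, name in LABELS.items():
        if label == 0:
            continue  # background is not scored
        tp = sum(1 for p, r in kept if p == label and r == label)
        n_pred = sum(1 for p, _ in kept if p == label)
        n_ref = sum(1 for _, r in kept if r == label)
        if n_pred + n_ref == 0:
            continue  # class absent from both: Dice is undefined, so skip it
        scores[name] = 2 * tp / (n_pred + n_ref)
    return scores


pred = [5, 5, 6, 6, 6, 0]
ref = [5, 5, 5, 6, 9, 0]  # the fifth voxel is ignore and does not count
print(dice_per_class(pred, ref))  # last_lumbar: 0.8, sacrum: 2/3
```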
Intended use
- Research-grade segmentation of spine and pelvis on CT for ML benchmarking, anatomy analytics, and downstream LSTV-related studies.
- LSTV characterization: the merged-label scheme is designed for studies that need to characterize the lumbosacral junction without forcing a per-case decision on vertebral count.
Out-of-scope use
- Clinical diagnostic use. Not FDA-cleared, not CE-marked, no clinical validation.
- Patient-specific surgical planning without expert review.
- Pediatric CT. Training data is adult.
- Severe pathology (large tumors, hardware artifacts, multi-level fusion). These cases are underrepresented in the training set.
Training
Trained on the CTSpinoPelvic1K dataset (companion Hugging Face dataset repo). The 5-fold cross-validation split is made at the patient-token level, which prevents patient leakage across folds, and is stratified on match-type × LSTV subtype so that each fold sees a similar LSTV-subtype distribution.
The custom trainer (nnUNetTrainerWandB_500ep_LSTVOversample in the
companion code repo) adds: queue-based LSTV-case oversampling, CE
reweighting on the merged-lumbar and sacrum classes, dedicated LSTV
validation passes, and W&B logging with NaN-safe Dice aggregation and
offline fallback.
See the companion repo
for the full training pipeline (make preprocess, make train-array),
the trainer source (tools/nnunet_wandb_variant.py), and ablation
configurations.
Evaluation
tools/eval_full.py in the companion repo computes per-case Dice,
junction-DSC over a 40 mm L5/S1 window, voxel confusion blocks for the
L4 ↔ last_lumbar ↔ sacrum boundary classes, and last_lumbar specificity
on sacralization-count cases.
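To give a sense of what a voxel confusion block over the boundary classes looks like, the sketch below counts (reference, predicted) pairs restricted to L4, last_lumbar, and sacrum. It is an illustration under assumptions, not the logic of tools/eval_full.py: `boundary_confusion` is a hypothetical helper, it works on flat label sequences, and it skips voxels whose reference label is ignore.

```python
from collections import Counter

BOUNDARY = {4: "L4", 5: "last_lumbar", 6: "sacrum"}


def boundary_confusion(pred, ref, ignore=9):
    """Count (reference, predicted) class pairs among the boundary classes."""
    counts = Counter()
    for p, r in zip(pred, ref):
        if r == ignore:
            continue  # ignore voxels do not contribute
        if r in BOUNDARY and p in BOUNDARY:
            counts[(BOUNDARY[r], BOUNDARY[p])] += 1
    return counts


ref = [4, 4, 5, 5, 6, 9]
pred = [4, 5, 5, 6, 6, 6]
for (ref_name, pred_name), n in sorted(boundary_confusion(pred, ref).items()):
    print(f"ref={ref_name:<12} pred={pred_name:<12} voxels={n}")
```

Off-diagonal entries such as (L4, last_lumbar) or (last_lumbar, sacrum) are the ones that indicate a level shift at the lumbosacral junction.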
License
Apache-2.0. See the companion repo.