S23DR 2026 Winning Solution 🔥

This repository contains the winning solution 🏆 for the S23DR 2026 roof wireframe reconstruction challenge. The task is to recover a structured 3D roof wireframe from scene reconstruction inputs: COLMAP geometry, camera information, depth-derived cues, and semantic/gestalt image signals. The final submission predicts the wireframe vertices, edges, and classifications expected by the Hugging Face competition interface.

The full technical writeup is included as S23DR_2026_writeup_final.pdf.

Challenge 🏠

S23DR 2026 evaluates how well a system can reconstruct clean, metric 3D roof structure from noisy multi-view scene data. This is harder than ordinary dense reconstruction because the output is not just a point cloud or mesh: it must be a compact graph of meaningful roof vertices and edges. Good submissions need both geometric accuracy and topological consistency.

The competition score is a custom metric reported as HSS; F1 and IoU serve as supporting diagnostics during local validation. In practice, strong performance depends on recovering the important roof junctions and line segments while avoiding spurious vertices and broken topology.
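To make the vertex side of these diagnostics concrete, here is a minimal sketch of a thresholded vertex F1. It is not the official HSS implementation: the function name `vertex_f1`, the greedy closest-first matching, and the `thresh` default are all assumptions chosen for illustration.

```python
import math

def vertex_f1(pred, gt, thresh=0.5):
    """Illustrative vertex F1: greedy one-to-one matching within `thresh`.

    `pred` and `gt` are lists of (x, y, z) tuples. The threshold and the
    matching rule are assumptions, not the competition's definition.
    """
    if not pred or not gt:
        return 0.0
    # All predicted/ground-truth pairs, closest first.
    pairs = sorted(
        (math.dist(p, g), i, j)
        for i, p in enumerate(pred)
        for j, g in enumerate(gt)
    )
    used_pred, used_gt = set(), set()
    for d, i, j in pairs:
        if d > thresh:
            break  # every remaining pair is farther apart than the threshold
        if i in used_pred or j in used_gt:
            continue  # each vertex may be matched at most once
        used_pred.add(i)
        used_gt.add(j)
    tp = len(used_pred)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(gt)
    return 2 * precision * recall / (precision + recall)
```

A spurious extra vertex lowers precision and a missed junction lowers recall, which is why both duplicate vertices and missing roof evidence hurt this kind of score.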

Approach ⚙️

The solution treats roof wireframe recovery as a generative structured prediction problem. It builds a fixed-size scene representation from COLMAP points, camera tokens, depth-unprojected points, RGB, and semantic/gestalt cues, then predicts the roof graph with a trained diffusion model.
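One common way to obtain a fixed-size point input is to subsample large point sets and pad small ones with a validity mask. The sketch below illustrates that idea only; the helper name `to_fixed_size`, the zero padding, and the uniform random sampling are assumptions and may differ from the preprocessing in data/.

```python
import random

def to_fixed_size(points, n, seed=0):
    """Return exactly `n` points plus a mask marking which entries are real.

    Hypothetical helper: subsamples uniformly without replacement when there
    are too many points and zero-pads when there are too few.
    """
    rng = random.Random(seed)
    points = list(points)
    if len(points) >= n:
        kept = rng.sample(points, n)
        mask = [True] * n
    else:
        pad = n - len(points)
        kept = points + [(0.0, 0.0, 0.0)] * pad
        mask = [True] * len(points) + [False] * pad
    return kept, mask
```

The mask lets a downstream model ignore padded entries instead of treating them as points at the origin.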

The final system uses:

A first-stage wireframe diffusion model for coarse vertex and edge prediction.
A second-stage refinement model focused around the first-stage roof hull.
Online preprocessing that mirrors the training-time scene construction.
Inference-time ensembling over multiple random trajectories.
Lightweight candidate ranking and medoid/confidence selection to improve stability.

This combination was designed to handle the main failure modes of the challenge: noisy reconstruction points, missing roof evidence, duplicate or unstable vertices, and topology that changes under small sampling perturbations.
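Medoid selection over an ensemble can be sketched as follows: given several candidate vertex sets from different random trajectories, pick the one with the smallest total distance to all the others. This is a sketch under stated assumptions, not the repository's selector; the symmetric Chamfer distance and the names `chamfer` and `medoid_index` are illustrative choices, and edges and confidences are ignored here.

```python
import math

def chamfer(a, b):
    """Symmetric Chamfer distance between two non-empty lists of (x, y, z)."""
    a_to_b = sum(min(math.dist(p, q) for q in b) for p in a) / len(a)
    b_to_a = sum(min(math.dist(q, p) for p in a) for q in b) / len(b)
    return a_to_b + b_to_a

def medoid_index(candidates):
    """Index of the candidate with the lowest summed distance to the rest."""
    totals = [
        sum(chamfer(c, other) for other in candidates)
        for c in candidates
    ]
    return min(range(len(candidates)), key=totals.__getitem__)
```

Because the medoid is always one of the sampled candidates, an outlier trajectory cannot drag the result toward an implausible average. That is one way to damp topology that changes under small sampling perturbations.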

Repository Contents 📦

S23DR_2026_writeup_final.pdf - final solution writeup.
script.py - competition entry point used for local sanity checks and submission inference.
models/ - diffusion, denoising, scene encoding, and stage-2 model components.
data/ - preprocessing and dataset utilities for scene and stage-2 inputs.
test_checkpoint*.pth - trained checkpoints used by the submitted system.
ensemble_ranker*.npz - saved lightweight ensemble selector/ranker artifacts.
eval.slurm and eval_cpu.slurm - SLURM validation scripts for GPU and CPU runs, respectively.
DIARY_2026_05.md - experiment notes and debugging history.

Running 🚀

For a quick local sanity check on validation samples:

python script.py --sanity --sanity_n 1

For the normal competition-style inference path:

python script.py --params params.json --output submission.json

The script resolves checkpoints from CLI flags, environment variables, params.json, or the committed defaults:

python script.py --ckpt test_checkpoint.pth --stage2_ckpt test_checkpoint_stage2.pth
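The fallback chain described above can be sketched as below, assuming the sources are tried in the order listed (CLI flag, then environment variable, then params.json, then the committed default). The function `resolve_ckpt`, the environment variable name `S23DR_CKPT`, and the params key `ckpt` are hypothetical; the names used inside script.py may differ.

```python
import os

def resolve_ckpt(cli_value, env_var, params, params_key, default):
    """Return the first checkpoint path found, in assumed priority order."""
    if cli_value:                      # 1. explicit CLI flag
        return cli_value
    env_value = os.environ.get(env_var)
    if env_value:                      # 2. environment variable
        return env_value
    if params.get(params_key):         # 3. entry loaded from params.json
        return params[params_key]
    return default                     # 4. committed default

# Hypothetical usage: no CLI flag and no params entry, so either the
# environment variable or the default checkpoint name is returned.
path = resolve_ckpt(None, "S23DR_CKPT", {}, "ckpt", "test_checkpoint.pth")
```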

On SLURM:

sbatch eval.slurm
sbatch eval_cpu.slurm

Output ✅

The generated submission.json contains one prediction per sample with:

order_id
wf_vertices
wf_edges
wf_classifications

These fields match the S23DR 2026 submission schema in params.json.
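As a rough illustration of one prediction record, the sketch below assembles the four fields and checks that every edge refers to valid vertex indices. The value layouts are assumptions (vertices as xyz triples, edges as index pairs, classifications passed through unchanged), and `make_entry` is a hypothetical helper, not a function from script.py.

```python
import json

def make_entry(order_id, vertices, edges, classifications):
    """Build one prediction dict; field value layouts are assumed."""
    n = len(vertices)
    for a, b in edges:
        # Reject self-loops and indices that point outside the vertex list.
        if a == b or not (0 <= a < n and 0 <= b < n):
            raise ValueError(f"invalid edge ({a}, {b}) for {n} vertices")
    return {
        "order_id": order_id,
        "wf_vertices": [[float(c) for c in v] for v in vertices],
        "wf_edges": [[int(a), int(b)] for a, b in edges],
        "wf_classifications": list(classifications),
    }

# Hypothetical sample: a single ridge segment with a placeholder class label.
entry = make_entry("sample_0", [(0, 0, 3), (4, 0, 3)], [(0, 1)], [0])
payload = json.dumps([entry])
```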

License

Code in this repository is licensed under Apache-2.0.

Model weights/checkpoints are licensed under CC-BY-NC-4.0 and are provided for non-commercial academic research and S23DR/HoHo challenge reproducibility.

These weights were trained using the gated HoHo/S23DR dataset. This repository does not redistribute the dataset, dataset links, cached features, labels, samples, or artifacts intended to reconstruct the dataset. Access to and use of the original dataset remain governed by the official S23DR/HoHo Data License Agreement.

