Pixel-Perfect Depth with Semantics-Prompted Diffusion Transformers
Paper: arXiv:2510.07316
This is an unmodified mirror of the ppd_moge.pth checkpoint from
Pixel-Perfect Depth (NeurIPS 2025),
the PPD variant that uses MoGe2 semantics and delivers a ~20–30% improvement on
zero-shot benchmarks over the DA2 variant.
It is rehosted here only because the original file is distributed via Google Drive, which is unreliable for automated downloads in the ComfyUI-PixelPerfectDepth integration. All credit belongs to the original authors.
Original Google Drive file ID: 1tabmcsbRVDKDfmO4KU1vOjurzN-wp0HV
(linked from the upstream README, "PPD / MoGe2" row)
This mirror is unmodified and redistributed under the upstream Apache-2.0 license
(see LICENSE). No endorsement by the original authors is implied.
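Anyone who wants to check the "unmodified" claim independently can compare a SHA-256 digest of this file with one computed from a copy fetched from the original Google Drive link. No reference digest is published in this card, so the sketch below only compares two local copies. The helper name `sha256_of` and the example paths are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so a large checkpoint never sits fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Hypothetical usage: both paths are placeholders for your own local copies.
# same = sha256_of("mirror/ppd_moge.pth") == sha256_of("drive/ppd_moge.pth")
```

Matching digests mean the two copies are byte-for-byte identical.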
This checkpoint requires the MoGe2 encoder weights
(moge2.pt) at load time, as in
the upstream run.py --semantics_model MoGe2.
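Because loading fails without both files, a quick pre-flight check can save a confusing error later. This is a minimal sketch that assumes both weight files sit in one directory. That layout, the directory name, and the helper `missing_weights` are illustrative assumptions, not part of the upstream code.

```python
from pathlib import Path

# File names taken from this card; the single-directory layout is an assumption.
REQUIRED_WEIGHTS = ("ppd_moge.pth", "moge2.pt")


def missing_weights(checkpoint_dir, required=REQUIRED_WEIGHTS):
    """Return the names of required weight files not found in checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [name for name in required if not (root / name).is_file()]


# Hypothetical usage: "checkpoints" is a placeholder directory name.
# absent = missing_weights("checkpoints")
# if absent:
#     raise FileNotFoundError(f"Missing weight files: {', '.join(absent)}")
```

An empty list means both the PPD checkpoint and the MoGe2 encoder weights are present.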
@article{xu2025pixel,
title={Pixel-perfect depth with semantics-prompted diffusion transformers},
author={Xu, Gangwei and Lin, Haotong and Luo, Hongcheng and others},
journal={arXiv preprint arXiv:2510.07316},
year={2025}
}