MDA: Multi-view depth & geometry checkpoints

These are the official model checkpoints for the paper "Modeling Depth Ambiguity: A Mixture-Density Representation for Flying-Point-Free Depth Estimation" (MDA).

📄 arXiv  |  🌐 Project page

MDA is a mixture-density depth representation that predicts several depth hypotheses (with their probabilities) at every pixel instead of forcing a single depth, which largely removes the flying-point artifacts at object boundaries that plague feed-forward depth estimators. See the Citation section to cite this work.
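
To see why keeping several hypotheses helps at object boundaries, here is a toy illustration for a single edge pixel. It is only a sketch: the depth values, the probabilities, and both function names are made up for illustration, and picking the most probable hypothesis is an assumed decoding rule, not necessarily the one used in the paper.

```python
# Toy example: one pixel on an object boundary with two depth hypotheses.
# All numbers and function names here are illustrative assumptions.

def expected_depth(depths, probs):
    """Probability-weighted average, i.e. what a single-valued regressor tends toward."""
    return sum(d * p for d, p in zip(depths, probs))

def most_likely_depth(depths, probs):
    """Depth of the highest-probability hypothesis (assumed decoding rule)."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return depths[best]

depths = [1.0, 5.0]   # foreground surface at 1 m, background at 5 m
probs = [0.55, 0.45]  # the pixel is slightly more likely to be foreground

print(expected_depth(depths, probs))     # about 2.8 m: lies on neither surface
print(most_likely_depth(depths, probs))  # 1.0 m: lies on the foreground surface
```

The averaged value falls in empty space between the two surfaces, which is what shows up as a flying point in the reconstructed point cloud; selecting one hypothesis keeps the pixel on a real surface.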

These two checkpoints are used for multi-view geometry prediction: spatially consistent depth and camera pose from a set of input images. They are built on two different backbones and trained with a Mixture-of-Gaussians (MoG) depth head and a logl2 objective.

| File | Backbone | Wrapper | model_choice.py name | Params |
|---|---|---|---|---|
| DA3_MOG_Sky_LogL2.ckpt | DA3 Giant | DA3Wrapper | mda_mog_sky_l2 | ~1.36 B |
| VGGT_MOG_LogL2.ckpt | VGGT-1B | VGGTWrapper | vggt_mog_l2 | ~1.16 B |

Both are PyTorch Lightning checkpoints (save_weights_only=True, Lightning 2.5.6). State-dict keys are prefixed with net.net.* because the network is wrapped by a Lightning module, so strip the prefix and load the weights into the bare net. These are research checkpoints and cannot be loaded through the standard DepthAnything3.from_pretrained HF API.
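
The prefix stripping can be sketched as follows. This is a minimal example, not the repository's own loading code: the helper names `strip_prefix` and `load_bare_net` are hypothetical, `net` stands for a bare network you have already constructed from the MDA code base, and dropping keys that lack the prefix is an assumption. It relies on Lightning storing weights under the `state_dict` key of the checkpoint.

```python
from collections import OrderedDict

PREFIX = "net.net."  # prefix added by the Lightning module wrapping the network


def strip_prefix(state_dict, prefix=PREFIX):
    """Return a new dict with `prefix` removed from matching keys.

    Keys that do not start with `prefix` are dropped (an assumption:
    they are taken not to belong to the bare network).
    """
    stripped = OrderedDict()
    for key, value in state_dict.items():
        if key.startswith(prefix):
            stripped[key[len(prefix):]] = value
    return stripped


def load_bare_net(net, ckpt_path):
    """Load checkpoint weights into an already-constructed bare network."""
    import torch  # imported here so strip_prefix works without PyTorch installed

    # weights_only=False unpickles arbitrary objects; only use it on files you trust.
    ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)
    weights = strip_prefix(ckpt["state_dict"])
    # strict=False reports mismatches instead of raising; inspect the result.
    return net.load_state_dict(weights, strict=False)
```

A call would look like `result = load_bare_net(net, "VGGT_MOG_LogL2.ckpt")`; checking that `result.missing_keys` and `result.unexpected_keys` are both empty is a quick way to confirm the prefix and the network definition match.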

Citation

If you build on MDA, please cite:

@misc{bian2026modeling,
  title         = {Modeling Depth Ambiguity: A Mixture-Density Representation for Flying-Point-Free Depth Estimation},
  author        = {Siyuan Bian and Congrong Xu and Jun Gao},
  year          = {2026},
  eprint        = {2606.02552},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV},
  url           = {https://arxiv.org/abs/2606.02552}
}