Align3R: Aligned Monocular Depth Estimation for Dynamic Videos
Jiahao Lu*, Tianyu Huang*, Peng Li, Zhiyang Dou, Cheng Lin, Zhiming Cui, Zhen Dong, Sai-Kit Yeung, Wenping Wang, Yuan Liu
arXiv, 2024.

Align3R estimates temporally consistent video depth, dynamic point clouds, and camera poses from monocular videos.

@article{lu2024align3r,
  title={Align3R: Aligned Monocular Depth Estimation for Dynamic Videos},
  author={Lu, Jiahao and Huang, Tianyu and Li, Peng and Dou, Zhiyang and Lin, Cheng and Cui, Zhiming and Dong, Zhen and Yeung, Sai-Kit and Wang, Wenping and Liu, Yuan},
  journal={arXiv preprint arXiv:2412.03079},
  year={2024}
}

How to use

First, install Align3R following the instructions in the repository. Then load the model:

import torch
from dust3r.model import AsymmetricCroCo3DStereo

# Download the pretrained Align3R checkpoint from the Hugging Face Hub
model = AsymmetricCroCo3DStereo.from_pretrained("cyun9286/Align3R_DepthAnythingV2_ViTLarge_BaseDecoder_512_dpt")

# Move the model to the GPU if one is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
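Once the model is loaded, video frames can be paired and passed through the network. The snippet below is a minimal sketch assuming the DUSt3R-style helper API that Align3R builds on; the frame paths are placeholders, and the exact module paths and arguments may differ in the Align3R repository:

# Sketch of pairwise inference with the loaded model, assuming the
# DUSt3R-style helpers (dust3r.inference, dust3r.utils.image,
# dust3r.image_pairs) are available; details may differ in Align3R.
from dust3r.inference import inference
from dust3r.utils.image import load_images
from dust3r.image_pairs import make_pairs

# Placeholder paths to consecutive frames of a monocular video
frame_paths = ["frames/000000.png", "frames/000001.png"]

# Load and resize the frames to the model's input resolution
images = load_images(frame_paths, size=512)

# Build symmetric image pairs for the two-view network
pairs = make_pairs(images, scene_graph="complete", prefilter=None, symmetrize=True)

# Run the model on all pairs; the output contains per-pair pointmaps
# from which depth, point clouds, and camera poses can be recovered
output = inference(pairs, model, device, batch_size=1)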
Model size: 603M parameters (Safetensors, F32 tensors)
