GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors

Tian-Xing Xu1, Xiangjun Gao3, Wenbo Hu2 †, Xiaoyu Li2, Song-Hai Zhang1 †, Ying Shan2
1Tsinghua University 2ARC Lab, Tencent PCG 3HKUST


🔆 Notice

GeometryCrafter is still under active development!

We recommend that everyone use English to communicate on issues, as this helps developers from around the world discuss, share experiences, and answer questions together. For further implementation details, please contact xutx21@mails.tsinghua.edu.cn. For business licensing and other related inquiries, don't hesitate to contact wbhu@tencent.com.

If you find GeometryCrafter useful, please ⭐ this repo. Stars are important to open-source projects. Thanks!

📝 Introduction

We present GeometryCrafter, a novel approach that estimates temporally consistent, high-quality point maps from open-world videos, facilitating downstream applications such as 3D/4D reconstruction and depth-based video editing or generation.

Release Notes:

  • [01/04/2025] 🔥🔥🔥 GeometryCrafter is released now, have fun!

🚀 Quick Start

Installation

  1. Clone this repo:
git clone --recursive https://github.com/TencentARC/GeometryCrafter
  2. Install dependencies (see requirements.txt):
pip install -r requirements.txt

Inference

Run inference on the provided demo videos. Processing runs at 1.27 FPS and requires a GPU with ~40 GB of memory for 110 frames at 1024x576 resolution:

python run.py \
  --video_path examples/video1.mp4 \
  --save_folder workspace/examples_output \
  --height 576 --width 1024
  # the input video is resized to the target resolution for processing; height and width should be divisible by 64
  # the output point maps are restored to the original resolution before saving
  # use --downsample_ratio to downsample the input video, or reduce --decode_chunk_size, to lower memory usage
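
The divisibility requirement above can be checked before launching a run. The following is a minimal sketch, not part of the repository: the helper name snap_to_multiple is hypothetical, and rounding to the nearest multiple is only one reasonable policy for picking --height and --width values.

```python
def snap_to_multiple(value: int, multiple: int = 64) -> int:
    """Round `value` to the nearest positive multiple of `multiple`.

    Hypothetical helper for choosing --height / --width values;
    exact halves round up.
    """
    if value <= 0 or multiple <= 0:
        raise ValueError("value and multiple must be positive")
    snapped = (value + multiple // 2) // multiple * multiple
    # Never return 0 for very small inputs.
    return max(multiple, snapped)


# Example: a 1920x1080 source is not divisible by 64 in height.
height = snap_to_multiple(1080)
width = snap_to_multiple(1920)
print(f"--height {height} --width {width}")
```

Because the script resizes the input to the requested resolution, snapping one dimension changes the aspect ratio slightly during processing; the saved point maps are restored to the original resolution as noted above.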

Run inference with the deterministic variant at 1.50 FPS:

python run.py \
  --video_path examples/video1.mp4 \
  --save_folder workspace/examples_output \
  --height 576 --width 1024 \
  --model_type determ

Run low-resolution processing at 2.49 FPS, which requires a GPU with ~22 GB of memory:

python run.py \
  --video_path examples/video1.mp4 \
  --save_folder workspace/examples_output \
  --height 384 --width 640
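
The FPS figures quoted for the three configurations translate directly into a rough wall-clock estimate. This is a back-of-the-envelope sketch under two assumptions: that each quoted FPS is end-to-end processing throughput, and that the 110-frame clip length stated for the first configuration is used for all three. The function name is hypothetical.

```python
def estimated_seconds(num_frames: int, fps: float) -> float:
    """Rough processing time, assuming `fps` is end-to-end throughput."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


# FPS values are the ones quoted in this README; 110 frames is assumed throughout.
for label, fps in [("default", 1.27), ("determ", 1.50), ("low-res", 2.49)]:
    print(f"{label}: ~{estimated_seconds(110, fps):.1f} s for 110 frames")
```

Actual timings depend on the GPU, the decode chunk size, and video I/O, so these numbers are only a planning aid.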

Visualization

Visualize the predicted point maps with Viser:

python visualize/vis_point_maps.py \
  --video_path examples/video1.mp4 \
  --data_path workspace/examples_output/video1.npz
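
Before visualizing, it can help to inspect what the saved npz file contains. The sketch below deliberately assumes nothing about the array names inside the file: it lists whatever keys are present together with their shapes and dtypes. The function name summarize_npz is hypothetical and not part of the repository.

```python
from typing import Dict, Tuple

import numpy as np


def summarize_npz(path: str) -> Dict[str, Tuple[Tuple[int, ...], str]]:
    """Map each array name in an .npz archive to its (shape, dtype)."""
    summary = {}
    # NpzFile supports the context-manager protocol and closes the archive on exit.
    with np.load(path) as data:
        for name in data.files:
            array = data[name]
            summary[name] = (array.shape, str(array.dtype))
    return summary


# Usage, with the output path from the commands above:
# for name, (shape, dtype) in summarize_npz("workspace/examples_output/video1.npz").items():
#     print(name, shape, dtype)
```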

🤖 Gradio Demo

📊 Dataset Evaluation

Please check the evaluation folder.

  • To create the datasets used in the paper, run evaluation/preprocess/gen_{dataset_name}.py.
  • First change DATA_DIR and OUTPUT_DIR according to your working environment.
  • You will then get the preprocessed datasets, containing the extracted RGB videos and point map npz files. We also provide the catalog of these files.
  • Run inference on all datasets:
    bash evaluation/run_batch.sh
    
    (Remember to replace data_root_dir and save_root_dir with your own paths.)
  • Run evaluation on all datasets (scale-invariant point map estimation):
    bash evaluation/eval.sh
    
    (Remember to replace pred_data_root_dir and gt_data_root_dir with your own paths.)
  • Run evaluation on all datasets (affine-invariant depth estimation):
    bash evaluation/eval_depth.sh
    
    (Remember to replace pred_data_root_dir and gt_data_root_dir with your own paths.)
  • We also provide the comparison results of MoGe and the deterministic variant of our method. You can evaluate these methods under the same protocol by uncommenting the corresponding lines in evaluation/run.sh, evaluation/eval.sh, evaluation/run_batch.sh, and evaluation/eval_depth.sh.
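
The two protocols named above differ in which ambiguity is removed before errors are computed: a single global scale for point maps, or a scale plus a shift for depth. The following is a minimal sketch of both alignments using ordinary least squares. The function names are hypothetical, validity masks are not handled, and the repository's evaluation scripts may align predictions differently (for example with robust fitting).

```python
import numpy as np


def align_scale(pred_points, gt_points) -> float:
    """Scale s minimizing ||s * pred - gt||^2 over all coordinates."""
    pred = np.asarray(pred_points, dtype=np.float64).ravel()
    gt = np.asarray(gt_points, dtype=np.float64).ravel()
    return float(np.dot(pred, gt) / np.dot(pred, pred))


def align_scale_shift(pred_depth, gt_depth):
    """Scale s and shift t minimizing ||s * pred + t - gt||^2."""
    pred = np.asarray(pred_depth, dtype=np.float64).ravel()
    gt = np.asarray(gt_depth, dtype=np.float64).ravel()
    # Design matrix with columns [pred, 1] for the affine fit.
    design = np.stack([pred, np.ones_like(pred)], axis=1)
    solution, *_ = np.linalg.lstsq(design, gt, rcond=None)
    scale, shift = solution
    return float(scale), float(shift)


# Synthetic check: recover a known scale and a known affine transform.
rng = np.random.default_rng(0)
points = rng.normal(size=(5, 4, 3))
print(align_scale(points, 2.5 * points))
depth = rng.uniform(1.0, 10.0, size=(4, 6))
print(align_scale_shift(depth, 2.0 * depth + 3.0))
```

After alignment, an error metric would be computed between the aligned prediction and the ground truth; which metrics are used is defined by the evaluation scripts themselves.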

🤝 Contributing

  • Issues and pull requests are welcome.
  • Contributions that optimize inference speed and memory usage are welcome, e.g., through model quantization, distillation, or other acceleration techniques.