L4GM: Large 4D Gaussian Reconstruction Model

Paper | Project Page | Model Weights

We present L4GM, the first 4D Large Reconstruction Model that produces animated objects from a single-view video input -- in a single feed-forward pass that takes only a second.


Install

conda env create -f environment.yml
conda activate l4gm

Inference

Download the pretrained L4GM model and the 4D interpolation model to pretrained/recon.safetensors and pretrained/interp.safetensors, respectively.
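Before running inference, it can help to confirm that both checkpoints are where the commands below expect them. The following is a minimal sketch, not part of the repository: the helper name missing_weights is hypothetical, and it only checks that the files exist, not that they are valid.

```python
from pathlib import Path

# Checkpoint locations named in the download step above.
EXPECTED_WEIGHTS = (
    "pretrained/recon.safetensors",
    "pretrained/interp.safetensors",
)


def missing_weights(root=".", expected=EXPECTED_WEIGHTS):
    """Return the expected checkpoint paths that are absent under `root`.

    Hypothetical helper for illustration; it does not validate file contents.
    """
    root = Path(root)
    return [p for p in expected if not (root / p).is_file()]


if __name__ == "__main__":
    missing = missing_weights()
    if missing:
        print("Missing checkpoints:", ", ".join(missing))
    else:
        print("Both checkpoints found.")
```

Run it from the repository root so the relative paths resolve.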

Select an input video. Remove its background and crop it to 256x256 with third-party tools. We provide some processed examples in the data_test folder.
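The README leaves the choice of cropping tool open. As one possibility, the sketch below builds an ffmpeg command that center-crops a video to a square and scales it to 256x256 using ffmpeg's crop and scale filters. It assumes ffmpeg is installed, that a center crop keeps the object in frame, and that the background has already been removed by some other tool. The function names and file names are hypothetical.

```python
def square_crop_filter(width, height, size=256):
    """Build an ffmpeg -vf string: center-crop to a square, then scale.

    `width` and `height` are the source video dimensions in pixels.
    """
    side = min(width, height)
    x = (width - side) // 2   # left edge of the centered square
    y = (height - side) // 2  # top edge of the centered square
    return f"crop={side}:{side}:{x}:{y},scale={size}:{size}"


def ffmpeg_command(src, dst, width, height, size=256):
    """Return the argv list for an ffmpeg call that applies the filter."""
    return ["ffmpeg", "-i", src, "-vf", square_crop_filter(width, height, size), dst]


if __name__ == "__main__":
    # Hypothetical file names; background removal is not handled here.
    print(" ".join(ffmpeg_command("input_fg.mp4", "input_fg_256.mp4", 640, 480)))
```

If the object is off-center, adjust the crop offsets by hand instead of relying on the centered square.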

  1. Generate 3D by:
python infer_3d.py big --workspace results --resume pretrained/recon.safetensors --num_frames 1 --test_path data_test/otter-on-surfboard_fg.mp4
  2. Generate 4D by:
python infer_4d.py big --workspace results --resume pretrained/recon.safetensors --interpresume pretrained/interp.safetensors --num_frames 16 --test_path data_test/otter-on-surfboard_fg.mp4
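To process every example in a folder rather than a single file, the 4D command above can be wrapped in a small loop. This is a sketch under assumptions: it reuses only the flags shown above, treats every .mp4 in the folder as an already processed input, and sends all runs to the same workspace. How infer_4d.py names its outputs is not stated here, so check whether later runs overwrite earlier ones. The helper names are hypothetical.

```python
import subprocess
from pathlib import Path

RECON = "pretrained/recon.safetensors"
INTERP = "pretrained/interp.safetensors"


def infer_4d_command(video, workspace="results", num_frames=16):
    """Return the argv list for the 4D inference command shown above."""
    return [
        "python", "infer_4d.py", "big",
        "--workspace", workspace,
        "--resume", RECON,
        "--interpresume", INTERP,
        "--num_frames", str(num_frames),
        "--test_path", str(video),
    ]


def run_folder(folder="data_test", dry_run=True):
    """Build (and optionally run) one command per .mp4 file in `folder`."""
    commands = [infer_4d_command(v) for v in sorted(Path(folder).glob("*.mp4"))]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))  # show the command without executing it
        else:
            subprocess.run(cmd, check=True)  # stop on the first failure
    return commands


if __name__ == "__main__":
    run_folder(dry_run=True)
```

With dry_run=True the script only prints the commands, which makes it easy to review them before launching anything.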

Training

First, render Objaverse with the Blender scripts in the blender_scripts folder.

Download the pretrained LGM checkpoint to pretrained/model_fixrot.safetensors.

L4GM model training:

accelerate launch \
    --config_file acc_configs/gpu8.yaml \
    main.py big \
    --workspace workspace_recon \
    --resume pretrained/model_fixrot.safetensors \
    --data_mode 4d \
    --num_epochs 200 \
    --prob_cam_jitter 0 \
    --datalist data_train/datalist_8fps.txt

Our released checkpoint uses --num_epochs 500.

4D Interpolation model training:

accelerate launch \
    --config_file acc_configs/gpu8.yaml \
    main.py big \
    --workspace workspace_interp \
    --resume workspace_recon/model.safetensors \
    --data_mode 4d_interp \
    --num_frames 4 \
    --num_epochs 200 \
    --prob_cam_jitter 0 \
    --prob_grid_distortion 0 \
    --datalist data_train/datalist_24fps.txt
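The two training commands share most of their structure, and the interpolation stage resumes from the reconstruction workspace. The sketch below records the flags of both commands as dictionaries, rebuilds the accelerate launch argument lists, and lists the flags that differ between stages. Flag names and values are copied from the commands above. The helper names are hypothetical, and None marks a flag that one command does not set (its default is not stated here).

```python
RECON_FLAGS = {
    "workspace": "workspace_recon",
    "resume": "pretrained/model_fixrot.safetensors",
    "data_mode": "4d",
    "num_epochs": 200,
    "prob_cam_jitter": 0,
    "datalist": "data_train/datalist_8fps.txt",
}

INTERP_FLAGS = {
    "workspace": "workspace_interp",
    "resume": "workspace_recon/model.safetensors",  # output of the first stage
    "data_mode": "4d_interp",
    "num_frames": 4,
    "num_epochs": 200,
    "prob_cam_jitter": 0,
    "prob_grid_distortion": 0,
    "datalist": "data_train/datalist_24fps.txt",
}


def launch_command(flags, config="acc_configs/gpu8.yaml"):
    """Return the `accelerate launch` argv for one training stage."""
    cmd = ["accelerate", "launch", "--config_file", config, "main.py", "big"]
    for name, value in flags.items():
        cmd += [f"--{name}", str(value)]
    return cmd


def flag_diff(a, b):
    """Map each flag that differs to its (first stage, second stage) values."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    print(" ".join(launch_command(RECON_FLAGS)))
    print(" ".join(launch_command(INTERP_FLAGS)))
    for name, (recon, interp) in flag_diff(RECON_FLAGS, INTERP_FLAGS).items():
        print(f"--{name}: {recon} -> {interp}")
```

Because the second stage reads workspace_recon/model.safetensors, it should only be launched after the first stage has finished and written that file.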

Citation

@inproceedings{ren2024l4gm,
    title={L4GM: Large 4D Gaussian Reconstruction Model}, 
    author={Jiawei Ren and Kevin Xie and Ashkan Mirzaei and Hanxue Liang and Xiaohui Zeng and Karsten Kreis and Ziwei Liu and Antonio Torralba and Sanja Fidler and Seung Wook Kim and Huan Ling},
    booktitle={Proceedings of Neural Information Processing Systems (NeurIPS)},
    month = {Dec},
    year={2024}
}