VERA checkpoints

Hosted weights for VERA (Turning Video Models into Generalist Robot Policies), a two-stage system: a video planner that "dreams" the future, and a Jacobian inverse-dynamics model (IDM) that turns the dream into robot actions. VERA hosts only the trained artifacts; frozen upstream pieces (the Wan2.1 text encoder and VAE, and VGGT) are pulled from their original homes and filtered out at inference.
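The two-stage split can be pictured with a toy sketch. This is not VERA's API: `run_two_stage`, `toy_planner`, and `toy_idm` are hypothetical names, frames are plain lists of floats rather than images, and the choice to run the IDM on consecutive frame pairs is a generic inverse-dynamics assumption rather than something this card states.

```python
from typing import Callable, List

Frame = List[float]   # toy stand-in for an image; real frames would be arrays
Action = List[float]
Planner = Callable[[Frame, str, int], List[Frame]]
InverseDynamics = Callable[[Frame, Frame], Action]


def run_two_stage(planner: Planner, idm: InverseDynamics,
                  observation: Frame, instruction: str,
                  horizon: int) -> List[Action]:
    """Stage 1 predicts future frames; stage 2 maps frame pairs to actions."""
    dreamed = planner(observation, instruction, horizon)
    frames = [observation] + dreamed
    # One action per transition between consecutive frames (assumption).
    return [idm(cur, nxt) for cur, nxt in zip(frames, frames[1:])]


def toy_planner(observation: Frame, instruction: str, horizon: int) -> List[Frame]:
    # Hypothetical planner: shifts every value by +1 per step, ignores the text.
    return [[x + float(step) for x in observation]
            for step in range(1, horizon + 1)]


def toy_idm(cur: Frame, nxt: Frame) -> Action:
    # Hypothetical IDM: the "action" is the per-element frame difference.
    return [b - a for a, b in zip(cur, nxt)]


actions = run_two_stage(toy_planner, toy_idm, [0.0, 0.0], "stack the cube", 3)
print(actions)  # [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
```

Keeping the planner and the IDM behind separate callables mirrors how the release groups below pair one planner checkpoint with one IDM checkpoint.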

Code + usage: see the VERA GitHub repo.

Release groups

1 · Panda-sim (MimicGen)

| dir | what |
| --- | --- |
| mimicgen-wan-1.3b/ | MimicGen-specialist WAN video planner (1.3B, t2v→v2v), DiT-only bf16 (~2.8 GB, de-bloated from 17.5 GB) + flow_decoder.ckpt + algo_config.yaml. Pull the Wan-AI/Wan2.1-T2V-1.3B base from upstream. |
| idm-mimicgen-37oa162u/ | The proven-recipe Jacobian IDM (run 37oa162u) behind the 83–94% stack_d0 result. |

To reproduce stack_d0, pair these two checkpoints with CoTracker and the gated/adaptive gripper recipe.
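As a rough sanity check on download sizes: bf16 stores each parameter in 2 bytes, so a weights-only checkpoint is about twice its parameter count in bytes. The sketch below applies that to the nominal "1.3B" and "14B" planner sizes; the helper name `bf16_size_gb` is hypothetical. The nominal counts give figures a little below the listed ~2.8 GB and ~33 GB. This card does not explain the gap; one plausible reason is that the model names round the true parameter counts.

```python
BYTES_PER_BF16 = 2  # bfloat16 uses 16 bits per parameter


def bf16_size_gb(num_params: float) -> float:
    """Approximate weights-only size in GB (10^9 bytes) for bf16 storage."""
    return num_params * BYTES_PER_BF16 / 1e9


print(bf16_size_gb(1.3e9))  # nominal 1.3B planner
print(bf16_size_gb(14e9))   # nominal 14B planner
```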

2 · PushT — reproduces the two-stage DFoT→Jacobian solve

| dir | what |
| --- | --- |
| pusht-dfot/ | DFoT video planner (run dvxixf6d): a U-Net3D flow predictor (~2.4M params), not WAN. + run_config.yaml. |
| pusht-idm/ | PushT Jacobian IDM (run j1j59qzz, ~34.8M params, 2-DOF) + config.yaml. |

3 · OMNI — the cross-embodiment WAN planner

| dir | what |
| --- | --- |
| omni-wan/ | OMNI WAN planner (14B i2v→v2v, combined_4env), DiT-only bf16 (~33 GB). Pull the Wan-AI/Wan2.1-I2V-14B-480P base from upstream. |
| idm-mimicgen/ | MimicGen IDM (x21o0cwe), the OMNI-default pairing. |
| idm-droid/ | DROID FR3 IDM (7wohna95, SE3-delta 7-DOF). |
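Fetching one release group could look roughly like the sketch below, which uses `snapshot_download` from `huggingface_hub` with `allow_patterns` to pull only the listed directories. The directory names and upstream base repos come from the tables above. `VERA_REPO_ID` is a placeholder, since this card does not state its own repo id, and `RELEASE_GROUPS`, `allow_patterns`, and `fetch_group` are hypothetical helper names, not part of the VERA code.

```python
from typing import Dict, List, Optional, Tuple

# Placeholder: replace with the actual Hub repo id of these checkpoints.
VERA_REPO_ID = "<org>/<vera-checkpoints>"

# group -> (directories in this repo, upstream base repo or None)
RELEASE_GROUPS: Dict[str, Tuple[List[str], Optional[str]]] = {
    "panda-sim": (["mimicgen-wan-1.3b", "idm-mimicgen-37oa162u"],
                  "Wan-AI/Wan2.1-T2V-1.3B"),
    "pusht": (["pusht-dfot", "pusht-idm"], None),
    "omni": (["omni-wan", "idm-mimicgen", "idm-droid"],
             "Wan-AI/Wan2.1-I2V-14B-480P"),
}


def allow_patterns(group: str) -> List[str]:
    """Glob patterns selecting only the directories of one release group."""
    dirs, _ = RELEASE_GROUPS[group]
    return [f"{d}/*" for d in dirs]


def fetch_group(group: str, repo_id: str = VERA_REPO_ID):
    """Download one group's directories, plus its upstream base if it has one."""
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    _, upstream = RELEASE_GROUPS[group]
    weights_dir = snapshot_download(repo_id=repo_id,
                                    allow_patterns=allow_patterns(group))
    base_dir = snapshot_download(repo_id=upstream) if upstream else None
    return weights_dir, base_dir
```

Calling `fetch_group("pusht")` would download only `pusht-dfot/` and `pusht-idm/`, with no upstream base. For the WAN groups the sketch pulls the whole upstream repo for simplicity; narrowing that download would need the upstream file names, which are not listed here.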