VERA checkpoints

Hosted weights for VERA (Turning Video Models into Generalist Robot Policies), a two-stage system: a video planner that "dreams" the future, and a Jacobian inverse-dynamics model (IDM) that turns the dream into robot actions. This repository hosts only the trained artifacts; the frozen upstream components (the Wan2.1 text encoder and VAE, and VGGT) are pulled from their original repositories and filtered out at inference.
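As a rough illustration of what "trained artifacts only" can mean in practice, the sketch below drops frozen upstream modules from a checkpoint's state dict by key prefix. The prefixes (`text_encoder.`, `vae.`, `vggt.`) and the helper name `drop_frozen` are assumptions made for illustration; the real checkpoint layout and the filtering used by VERA may differ.

```python
from typing import Dict, Iterable, TypeVar

T = TypeVar("T")

# Hypothetical key prefixes for the frozen upstream modules.
# The actual VERA checkpoints may name these differently.
FROZEN_PREFIXES = ("text_encoder.", "vae.", "vggt.")


def drop_frozen(
    state_dict: Dict[str, T],
    frozen_prefixes: Iterable[str] = FROZEN_PREFIXES,
) -> Dict[str, T]:
    """Return a copy of state_dict without entries under frozen prefixes."""
    prefixes = tuple(frozen_prefixes)
    return {k: v for k, v in state_dict.items() if not k.startswith(prefixes)}


# Toy state dict with placeholder values standing in for tensors.
fake_state = {
    "dit.blocks.0.weight": 1,
    "vae.encoder.conv.weight": 2,
    "text_encoder.embed.weight": 3,
}
kept = drop_frozen(fake_state)
```

Here `kept` retains only the `dit.` entry, which matches the "DiT-only" description of the hosted WAN planners.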

Code + usage: see the VERA GitHub repo.

Release groups

1 · Panda-sim (MimicGen)

| dir | what |
|---|---|
| `mimicgen-wan-1.3b/` | MimicGen-specialist WAN video planner (1.3B, t2v→v2v), DiT-only bf16 (~2.8 GB, de-bloated from 17.5 GB) + `flow_decoder.ckpt` + `algo_config.yaml`. Pull `Wan-AI/Wan2.1-T2V-1.3B` base upstream. |
| `idm-mimicgen-37oa162u/` | The proven-recipe Jacobian IDM (run `37oa162u`) behind the 83–94% `stack_d0` result. |

Pair these two checkpoints with CoTracker and the gated/adaptive gripper recipe to reproduce the `stack_d0` result.
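A minimal sketch of fetching only the Panda-sim group, assuming the `huggingface_hub` package is installed. The repository id of this checkpoint repo is not stated here, so `VERA_REPO_ID` is a placeholder you must fill in; the function names and the local directory layout are likewise illustrative. CoTracker and the gripper recipe are not covered by this snippet.

```python
from typing import Dict, List

# Placeholder: the repository id is not given in this card. Replace before use.
VERA_REPO_ID = "<org>/<vera-checkpoints-repo>"

# Directory names and the upstream base come from the table above.
PANDA_SIM_DIRS = ["mimicgen-wan-1.3b/", "idm-mimicgen-37oa162u/"]
WAN_BASE_REPO = "Wan-AI/Wan2.1-T2V-1.3B"


def dir_patterns(dirs: List[str]) -> List[str]:
    """Turn directory names into glob patterns matching everything inside."""
    return [f"{d.rstrip('/')}/*" for d in dirs]


def fetch_panda_sim(local_root: str = "checkpoints") -> Dict[str, str]:
    """Download the two Panda-sim dirs plus the upstream WAN base model."""
    # Imported lazily so the helper above works without the package installed.
    from huggingface_hub import snapshot_download

    vera_path = snapshot_download(
        repo_id=VERA_REPO_ID,
        allow_patterns=dir_patterns(PANDA_SIM_DIRS),
        local_dir=f"{local_root}/vera",
    )
    base_path = snapshot_download(
        repo_id=WAN_BASE_REPO,
        local_dir=f"{local_root}/wan-base",
    )
    return {"vera": vera_path, "wan_base": base_path}
```

Restricting the download with `allow_patterns` avoids pulling the much larger OMNI planner when only the MimicGen pairing is needed.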

2 · PushT — reproduces the two-stage DFoT→Jacobian solve

| dir | what |
|---|---|
| `pusht-dfot/` | DFoT video planner (run `dvxixf6d`): a U-Net3D flow predictor (~2.4M params), not WAN. + `run_config.yaml`. |
| `pusht-idm/` | PushT Jacobian IDM (run `j1j59qzz`, ~34.8M params, 2-DOF) + `config.yaml`. |

3 · OMNI — the cross-embodiment WAN planner

| dir | what |
|---|---|
| `omni-wan/` | OMNI WAN planner (14B i2v→v2v, `combined_4env`), DiT-only bf16 (~33 GB). Pull `Wan-AI/Wan2.1-I2V-14B-480P` base upstream. |
| `idm-mimicgen/` | MimicGen IDM (`x21o0cwe`), the OMNI-default pairing. |
| `idm-droid/` | DROID FR3 IDM (`7wohna95`, SE3-delta 7-DOF). |
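The three release groups can be summarized as a small manifest that maps each group to its planner directory, its IDM directories, and the upstream base model, if any. This is only a sketch: the group keys (`panda-sim`, `pusht`, `omni`), the `ReleaseGroup` class, and `required_dirs` are hypothetical names, while the directory and base-model names are taken from the tables above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ReleaseGroup:
    planner_dir: str               # video planner checkpoint directory
    idm_dirs: Tuple[str, ...]      # one or more IDM checkpoint directories
    upstream_base: Optional[str]   # upstream base repo, or None if not needed


# Group keys are illustrative labels, not identifiers used by VERA itself.
RELEASE_GROUPS: Dict[str, ReleaseGroup] = {
    "panda-sim": ReleaseGroup(
        "mimicgen-wan-1.3b", ("idm-mimicgen-37oa162u",), "Wan-AI/Wan2.1-T2V-1.3B"
    ),
    "pusht": ReleaseGroup("pusht-dfot", ("pusht-idm",), None),
    "omni": ReleaseGroup(
        "omni-wan", ("idm-mimicgen", "idm-droid"), "Wan-AI/Wan2.1-I2V-14B-480P"
    ),
}


def required_dirs(group: str) -> List[str]:
    """List every directory of this repo that belongs to a release group."""
    g = RELEASE_GROUPS[group]
    return [g.planner_dir, *g.idm_dirs]
```

Keeping the planner/IDM pairing in one place makes it harder to mix, for example, the PushT IDM with a WAN planner by accident.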

