VERA checkpoints
Hosted weights for VERA (Turning Video Models into Generalist Robot Policies), a two-stage system: a video planner that "dreams" the future, and a Jacobian inverse-dynamics model (IDM) that turns the dream into robot actions. This repository hosts only the trained artifacts; the frozen upstream pieces (Wan2.1 text encoder and VAE, VGGT) are pulled from their original homes and filtered out at inference.
Code + usage: see the VERA GitHub repo.
Release groups
1 · Panda-sim (MimicGen)
| dir | what |
|---|---|
| mimicgen-wan-1.3b/ | MimicGen-specialist WAN video planner (1.3B, t2v→v2v), DiT-only bf16 (~2.8 GB, de-bloated from 17.5 GB) + flow_decoder.ckpt + algo_config.yaml. Pull Wan-AI/Wan2.1-T2V-1.3B base upstream. |
| idm-mimicgen-37oa162u/ | The proven-recipe Jacobian IDM (run 37oa162u) behind the 83–94% stack_d0 result. |
Pair these two with cotracker and the gated/adaptive gripper recipe to reproduce stack_d0.
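As a rough illustration of pulling only the Panda-sim pairing, the sketch below assumes the weights live on the Hugging Face Hub and uses `huggingface_hub.snapshot_download` with `allow_patterns`. The repo id `VERA_REPO_ID` is a hypothetical placeholder (the actual id is not stated here), as are the helper names and the local directory layout; cotracker and the gripper recipe are not covered.

```python
from pathlib import Path

# Hypothetical placeholder: substitute the real repo id of these checkpoints.
VERA_REPO_ID = "<org>/<vera-checkpoints>"
# Upstream base named in the Panda-sim table.
WAN_BASE_REPO_ID = "Wan-AI/Wan2.1-T2V-1.3B"

# The two directories listed for the Panda-sim (MimicGen) group.
PANDA_SIM_DIRS = ["mimicgen-wan-1.3b", "idm-mimicgen-37oa162u"]


def patterns_for(dirs):
    """Turn directory names into glob patterns matching everything inside them."""
    return [f"{d.rstrip('/')}/*" for d in dirs]


def fetch_panda_sim(target="checkpoints"):
    """Download the two Panda-sim directories plus the upstream Wan base.

    The local layout under `target` is an assumption made for this sketch.
    """
    # Imported lazily so patterns_for() works without the dependency installed.
    from huggingface_hub import snapshot_download

    target = Path(target)
    vera_dir = snapshot_download(
        repo_id=VERA_REPO_ID,
        allow_patterns=patterns_for(PANDA_SIM_DIRS),
        local_dir=target / "vera",
    )
    base_dir = snapshot_download(
        repo_id=WAN_BASE_REPO_ID,
        local_dir=target / "wan2.1-t2v-1.3b",
    )
    return vera_dir, base_dir
```

Filtering by directory avoids downloading the much larger OMNI planner when only the stack_d0 setup is needed.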
2 · PushT — reproduces the two-stage DFoT→Jacobian solve
| dir | what |
|---|---|
| pusht-dfot/ | DFoT video planner (run dvxixf6d): a U-Net3D flow predictor (~2.4M params), not WAN. + run_config.yaml. |
| pusht-idm/ | PushT Jacobian IDM (run j1j59qzz, ~34.8M params, 2-DOF) + config.yaml. |
3 · OMNI — the cross-embodiment WAN planner
| dir | what |
|---|---|
| omni-wan/ | OMNI WAN planner (14B i2v→v2v, combined_4env), DiT-only bf16 (~33 GB). Pull Wan-AI/Wan2.1-I2V-14B-480P base upstream. |
| idm-mimicgen/ | MimicGen IDM (x21o0cwe), the OMNI-default pairing. |
| idm-droid/ | DROID FR3 IDM (7wohna95, SE3-delta 7-DOF). |
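The planner/IDM pairings across the three release groups can be summarized as a small lookup table. This is only a sketch of the layout described in the tables: the group keys (`panda-sim`, `pusht`, `omni`), the `ReleaseGroup` class, and the `pairing` helper are hypothetical names introduced for illustration and are not part of the VERA code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ReleaseGroup:
    planner_dir: str
    idm_dirs: Tuple[str, ...]       # first entry is treated as the default pairing
    upstream_base: Optional[str]    # None where the tables list no upstream base


RELEASE_GROUPS = {
    "panda-sim": ReleaseGroup(
        "mimicgen-wan-1.3b", ("idm-mimicgen-37oa162u",), "Wan-AI/Wan2.1-T2V-1.3B"
    ),
    "pusht": ReleaseGroup("pusht-dfot", ("pusht-idm",), None),
    "omni": ReleaseGroup(
        "omni-wan", ("idm-mimicgen", "idm-droid"), "Wan-AI/Wan2.1-I2V-14B-480P"
    ),
}


def pairing(group, idm=None):
    """Return (planner_dir, idm_dir) for a group, defaulting to its first IDM."""
    entry = RELEASE_GROUPS[group]
    chosen = entry.idm_dirs[0] if idm is None else idm
    if chosen not in entry.idm_dirs:
        raise ValueError(f"{chosen!r} is not listed for group {group!r}")
    return entry.planner_dir, chosen
```

For example, `pairing("omni")` selects the MimicGen IDM that the table marks as the OMNI default, while `pairing("omni", "idm-droid")` selects the DROID FR3 IDM instead.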