tau-0-wm — UR3e closebox fine-tune (action branch)

Fine-tune of sii-research/tau-0-wm (τ0-WM, a Wan2.2-TI2V-5B-based video-action world model) on the UR3e dual-arm closebox task (closebox_all3_lossless, 417 episodes, "close the box").

What was trained

  • Action branch flow-matching fine-tune. The video backbone, VAE and T5 are frozen; only the ~0.5B action stream (action_* modules) is trained. Objective: rectified flow / flow-matching MSE on the 20-dim relative-EEF-6D action chunk (v = ε − x0, σ = t/1000). This matches the action path of pipeline.infer used at deployment and Eq. 2 of the paper (arXiv:2606.01027).
  • Action space: relative end-effector 6D, 20-dim = per-arm [rel_xyz(3) + rel_rot6d(6) + gripper(1)], built via utils.action_space_utils.abs_eef_to_rela.
  • Normalization: closebox-specific stats (statistics_closebox_all3.json), openpi RunningStats convention, over the relative-EEF-6D targets and 20-dim states.
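The flow-matching objective above can be sketched in NumPy as follows. This is a minimal illustration, not the repository's training code: the linear interpolation path x_t = (1 − σ)·x0 + σ·ε, the uniform timestep sampling, and the function names are assumptions; only v = ε − x0 and σ = t/1000 come from this card.

```python
import numpy as np


def flow_matching_batch(x0, rng, num_timesteps=1000):
    """Build noisy inputs and velocity targets for a batch of action chunks.

    x0: normalized actions of shape (batch, horizon, 20).
    Returns (x_t, t, v_target).
    """
    batch = x0.shape[0]
    # Assumption: one integer timestep per sample, drawn uniformly from 1..1000.
    t = rng.integers(1, num_timesteps + 1, size=batch)
    sigma = (t / num_timesteps).reshape(batch, 1, 1)  # sigma = t / 1000
    eps = rng.standard_normal(x0.shape)
    # Assumption: straight-line path from data (sigma=0) to noise (sigma=1).
    x_t = (1.0 - sigma) * x0 + sigma * eps
    # Velocity target stated in the card: v = eps - x0.
    v_target = eps - x0
    return x_t, t, v_target


def flow_matching_mse(v_pred, v_target):
    """Mean squared error between predicted and target velocity."""
    return float(np.mean((v_pred - v_target) ** 2))
```

In training, the action stream would receive x_t and t (plus conditioning) and its output would be passed to flow_matching_mse as v_pred. The chunk horizon is not stated in this card, so any value used with this sketch is hypothetical.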

Training config

  • 8× A100 80GB, DDP. Per-GPU batch 4 × 8 GPUs = global batch 32.
  • AdamW lr 5e-5 (warmup 1000 + cosine), weight decay 1e-4, grad-clip 1.0.
  • 12 epochs ≈ 110k steps. Checkpoints every ~5.5k steps.
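The learning-rate schedule listed above (1000 warmup steps followed by cosine decay) could look like the sketch below. The linear warmup shape, the decay to zero, and the use of 110k as the total step count are assumptions; the card states only the peak rate, the warmup length, and the approximate number of steps.

```python
import math


def lr_at_step(step, base_lr=5e-5, warmup_steps=1000, total_steps=110_000):
    """Learning rate at a given optimizer step (0-indexed)."""
    if step < warmup_steps:
        # Assumption: linear ramp that reaches base_lr on the last warmup step.
        return base_lr * (step + 1) / warmup_steps
    # Assumption: cosine decay from base_lr to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```

A function like this can be wrapped in a per-step scheduler by dividing its output by base_lr to obtain a multiplier.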

Files

  • checkpoints/ckpt_XXXXXX.pt: a dict with keys {"step", "model"}; model is the FULL WanModel state dict (bf16, ~11GB), i.e. base backbone + fine-tuned action stream, directly loadable for deployment.
  • wan_pretrain_rela_eef6d.yaml — model/inference config.
  • statistics_closebox_all3.json — action/state mean-std used for (de)normalization at inference.
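As an illustration of how the statistics file might be applied at inference, the sketch below de-normalizes a predicted action and splits it into the per-arm fields described earlier. The JSON layout (an "action" entry holding "mean" and "std" lists), the small epsilon, and the helper names are hypothetical; the actual file and the repository's utilities may differ. The contiguous 2 × 10 per-arm layout is a reading of the action-space description in this card.

```python
import json

import numpy as np


def load_stats(path, key="action"):
    """Read mean/std vectors. Hypothetical layout: {key: {"mean": [...], "std": [...]}}."""
    with open(path) as f:
        entry = json.load(f)[key]
    return np.asarray(entry["mean"], dtype=float), np.asarray(entry["std"], dtype=float)


def denormalize(x_norm, mean, std, eps=1e-6):
    """Invert mean-std normalization; eps is an assumed guard against zero std."""
    return x_norm * (std + eps) + mean


def split_arms(action):
    """Split a (..., 20) action into per-arm fields, assuming two contiguous 10-dim arms."""
    per_arm = action.reshape(*action.shape[:-1], 2, 10)
    return {
        "rel_xyz": per_arm[..., 0:3],    # relative translation
        "rel_rot6d": per_arm[..., 3:9],  # relative rotation, 6D representation
        "gripper": per_arm[..., 9:10],   # gripper command
    }
```

Which arm comes first in the 20-dim vector is not stated in this card, so the two arms are indexed 0 and 1 here rather than named.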

Use

Load the state dict into the tau-0-wm WanModel (see the tau-0-wm repo) and deploy via TauPolicy, pointing the statistics file at statistics_closebox_all3.json.
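A minimal sketch of the loading step, assuming only the checkpoint layout described under Files (a dict with "step" and "model"). The helper name is hypothetical, and a tiny torch.nn.Linear stands in for WanModel so the example runs on its own; constructing the real WanModel and TauPolicy is covered by the tau-0-wm repo and is not shown here.

```python
import os
import tempfile

import torch


def load_finetuned_weights(model, ckpt_path):
    """Load ckpt["model"] into `model` and return the stored training step."""
    # weights_only=True restricts unpickling to tensors and plain containers.
    ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=True)
    # strict=True raises if any key is missing or unexpected.
    model.load_state_dict(ckpt["model"], strict=True)
    return ckpt["step"]


# Round-trip demo with a stand-in module (NOT the real WanModel).
source = torch.nn.Linear(4, 2)
target = torch.nn.Linear(4, 2)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ckpt_000001.pt")
    torch.save({"step": 1, "model": source.state_dict()}, path)
    step = load_finetuned_weights(target, path)
```

Because the checkpoint holds the full state dict, strict loading is a reasonable default here: a key mismatch would indicate that the model was built from a different config than wan_pretrain_rela_eef6d.yaml.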
