UnitreeG1_putawaytoolsV2_rndchnk_4000step: LingBot-VA G1 post-trained transformer

Fine-tuned transformer for LingBot-VA on Unitree G1 (Dex1), task XiaoweiLinXL/pi05-unitree-g1-put-away-tools-v2.1 (cleaner recollection of the original put_away_tools task): "Put the battery on the shelf labeled 'battery' and put the screwdriver on the shelf labeled 'Philips'."

  • Base: robbyant/lingbot-va-base
  • Post-training: 70 demos (43,851 frames), cleaner than the v1 release (48 demos / 24,271 frames). Single-task, lr 1e-5, FDM v2 recipe: a mutually exclusive per-microstep regime chosen by a rank-synced coin flip (fdm_prob=0.5), either the FDM video-only loss L_fdm (Eq. 13, lambda_fdm=1.0) or the standard IDM loss L_dyn + L_inv, with one forward and one backward pass. chunk_size ∈ {1,2,3,4} and window_size ∈ {4..64} are randomized per step so the deployed model handles any chunk/window setting at inference (the "rndchnk" in the repo name).
  • 4 GPUs × grad_accum=4 = effective batch 16, optimizer step 4000 of a 5000-step schedule.
  • Action range under quantile normalization: ±1.26 (vs. ±2.0 for the v1 data). The cleaner demos hug the q01/q99 quantile range much more tightly, removing the out-of-range tail that hurt v1 deployment.
  • This repo contains only transformer/. The vae/, text_encoder/, and tokenizer/ directories are unchanged from robbyant/lingbot-va-base.
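
The batch and schedule figures in the list above can be sanity-checked with plain shell arithmetic. This is only a sketch: the per-GPU micro-batch of 1 is not stated in this card and is inferred from 16 / (4 × 4), and the variable names are illustrative.

```shell
# Sanity check of the training numbers quoted above.
gpus=4
grad_accum=4
micro_batch=1   # assumption: inferred from 16 / (4 * 4), not stated in the card
effective_batch=$(( gpus * grad_accum * micro_batch ))

step=4000
schedule=5000
progress_pct=$(( 100 * step / schedule ))   # integer percentage of the schedule

echo "effective batch: ${effective_batch}"
echo "schedule progress: ${progress_pct}%"
```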

Assemble an eval-ready checkpoint

hf download robbyant/lingbot-va-base                              --local-dir lingbot-va-base
hf download EmbodyX/UnitreeG1_putawaytoolsV2_rndchnk_4000step      --local-dir g1_pat_v2_4000_dl

mkdir -p g1_pat_v2_4000
ln -sfn "$(realpath g1_pat_v2_4000_dl/transformer)"  g1_pat_v2_4000/transformer
ln -sfn "$(realpath lingbot-va-base/vae)"            g1_pat_v2_4000/vae
ln -sfn "$(realpath lingbot-va-base/text_encoder)"   g1_pat_v2_4000/text_encoder
ln -sfn "$(realpath lingbot-va-base/tokenizer)"      g1_pat_v2_4000/tokenizer
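
After linking, it may be worth confirming that all four components resolve to real directories, since a symlink created against a failed or partial download would dangle. The helper below is a minimal sketch; the name check_checkpoint is hypothetical and not part of any LingBot-VA tooling.

```shell
# check_checkpoint DIR: succeed only if DIR holds all four component
# directories. [ -d ] follows symlinks, so a dangling link counts as missing.
check_checkpoint() {
    ckpt_dir=$1
    missing=0
    for part in transformer vae text_encoder tokenizer; do
        if [ ! -d "$ckpt_dir/$part" ]; then
            echo "missing or dangling: $ckpt_dir/$part" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

With the layout above, it would be called as: check_checkpoint g1_pat_v2_4000 && echo "layout OK"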

Serve with CONFIG_NAME=g1_putawaytools_v21 and MODEL_PATH=g1_pat_v2_4000. The shipped transformer/config.json sets attn_mode: torch, so it is inference-ready.
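
Before serving, the attention setting can be confirmed with a quick pattern match. This sketch assumes config.json stores the key as a JSON string ("attn_mode": "torch"); it uses grep rather than a JSON parser, so it is a smoke test and not a validation. The function name has_torch_attn is hypothetical.

```shell
# has_torch_attn CONFIG: succeed if CONFIG contains "attn_mode": "torch"
# (whitespace around the colon is tolerated).
has_torch_attn() {
    grep -Eq '"attn_mode"[[:space:]]*:[[:space:]]*"torch"' "$1"
}
```

With the assembled checkpoint, it would be called as: has_torch_attn g1_pat_v2_4000/transformer/config.json && echo "attn_mode is torch"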
