UnitreeG1_putawaytoolsV2_rndchnk_500step: LingBot-VA G1 post-trained transformer
Fine-tuned transformer for LingBot-VA on the Unitree G1 (Dex1), task
XiaoweiLinXL/pi05-unitree-g1-put-away-tools-v2.1 (a cleaner re-collection
of the original put_away_tools task):
"Put the battery on the shelf labeled 'battery' and put the screwdriver
on the shelf labeled 'Philips'."
- Base: robbyant/lingbot-va-base
- Post-training: 70 demos (43,851 frames), cleaner than v1 (48 demos).
Single-task, lr 1e-5, FDM v2 recipe: mutually exclusive per-microstep
regime (rank-synced coin, fdm_prob=0.5: FDM video-only L_fdm (Eq. 13), lambda_fdm=1.0, OR standard IDM L_dyn+L_inv; one forward, one backward). Per-step randomized chunk_size ∈ {1,2,3,4} and window_size ∈ {4..64} (the "rndchnk" in the repo name).
- 4 GPUs × grad_accum=4 = effective batch 16, optimizer step 500 of a 5000-step schedule. This is very early in training; the checkpoint was uploaded specifically to A/B test the overfitting hypothesis: later checkpoints converged to suspiciously low loss [vid=0.0072 at step 5000, vs cup_broccoli baseline 0.052], while this 500 ckpt is still at vid=0.130, comparable to the working v1 step-500.
- Action range under quantile normalization: ±1.26 (vs v1's ±2.0; v21 has a much tighter quantile distribution).
- This repo contains only transformer/; vae/, text_encoder/, and tokenizer/ are unchanged from robbyant/lingbot-va-base.
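The per-microstep regime described in the list above can be illustrated with a minimal sketch. This is not the training code behind this checkpoint. The names plan_microstep, microstep_loss, and MicrostepPlan are hypothetical, and so is the seeding scheme. The sketch also assumes chunk_size and window_size are drawn at the same granularity as the coin, which the card does not state. The only point is that seeding from values every rank shares (never from the rank id) makes all GPUs pick the same regime and sizes.

```python
import random
from dataclasses import dataclass

# Values taken from the card; everything else below is an illustrative assumption.
FDM_PROB = 0.5
LAMBDA_FDM = 1.0
CHUNK_SIZES = (1, 2, 3, 4)
WINDOW_RANGE = (4, 64)  # treated as inclusive on both ends


@dataclass(frozen=True)
class MicrostepPlan:
    regime: str        # "fdm" (video-only) or "idm" (L_dyn + L_inv)
    chunk_size: int
    window_size: int


def plan_microstep(base_seed: int, microstep: int) -> MicrostepPlan:
    """Draw the regime and sizes for one microstep.

    The RNG is seeded only from values shared by every rank, so each rank
    computes the identical plan without any communication (hypothetical scheme).
    """
    rng = random.Random(base_seed * 1_000_003 + microstep)
    regime = "fdm" if rng.random() < FDM_PROB else "idm"
    chunk_size = rng.choice(CHUNK_SIZES)
    window_size = rng.randint(*WINDOW_RANGE)
    return MicrostepPlan(regime, chunk_size, window_size)


def microstep_loss(plan: MicrostepPlan, l_fdm: float, l_dyn: float, l_inv: float) -> float:
    """Mutually exclusive regimes: exactly one loss family per microstep."""
    if plan.regime == "fdm":
        return LAMBDA_FDM * l_fdm
    return l_dyn + l_inv
```

Because exactly one regime is active per microstep, each microstep still costs one forward and one backward pass, which matches the "one forward, one backward" note above.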
A/B testing context
The v21 5000-step checkpoint (EmbodyX/UnitreeG1_putawaytoolsV2_rndchnk_5000step)
underperforms the matching-recipe v1 checkpoint
(EmbodyX/g1_putawaytools_rndchnk_4000step) at deployment, despite v21
having more demos + cleaner data + lower training loss. This 500-step
checkpoint exists to test whether late-stage overfitting on v21's narrower
action distribution is the cause.
Assemble an eval-ready checkpoint
hf download robbyant/lingbot-va-base --local-dir lingbot-va-base
hf download EmbodyX/UnitreeG1_putawaytoolsV2_rndchnk_500step --local-dir g1_pat_v2_500_dl
mkdir -p g1_pat_v2_500
ln -sf $(realpath g1_pat_v2_500_dl/transformer) g1_pat_v2_500/transformer
ln -sf $(realpath lingbot-va-base/vae) g1_pat_v2_500/vae
ln -sf $(realpath lingbot-va-base/text_encoder) g1_pat_v2_500/text_encoder
ln -sf $(realpath lingbot-va-base/tokenizer) g1_pat_v2_500/tokenizer
Serve with CONFIG_NAME=g1_putawaytools_v21 MODEL_PATH=g1_pat_v2_500.
transformer/config.json has attn_mode: torch (inference-ready).
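For scripted setups, the same layout can be assembled from Python. The following is a minimal sketch using only the standard library; assemble_checkpoint is a hypothetical helper, not part of LingBot-VA, and it assumes both repositories have already been downloaded to local directories as in the commands above.

```python
from pathlib import Path

# Components reused unchanged from the base model, per the card.
BASE_COMPONENTS = ("vae", "text_encoder", "tokenizer")


def assemble_checkpoint(base_dir, finetuned_dir, out_dir) -> Path:
    """Symlink transformer/ from the fine-tuned download and the remaining
    components from the base download into a single eval-ready directory."""
    base_dir = Path(base_dir).resolve()
    finetuned_dir = Path(finetuned_dir).resolve()
    out_dir = Path(out_dir).resolve()
    out_dir.mkdir(parents=True, exist_ok=True)

    sources = {"transformer": finetuned_dir / "transformer"}
    sources.update({name: base_dir / name for name in BASE_COMPONENTS})

    for name, src in sources.items():
        if not src.is_dir():
            raise FileNotFoundError(f"missing component directory: {src}")
        link = out_dir / name
        if link.is_symlink():
            # Replace a stale link instead of following it into its target.
            link.unlink()
        link.symlink_to(src, target_is_directory=True)
    return out_dir


# Example call, using the directory names from the shell commands above:
# assemble_checkpoint("lingbot-va-base", "g1_pat_v2_500_dl", "g1_pat_v2_500")
```

The helper fails early if a component directory is missing, and it can be re-run safely because an existing symlink is replaced rather than followed.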