MolmoAct2: Bimanual YAM cube-stacking (action-expert, chunk_size=30)
Fine-tune of allenai/MolmoAct2-BimanualYAM
on the atharva-pantheon/yam-stack-cube dataset (44 bimanual-YAM demos, 3 cameras, 14-dim
absolute joint pose, 10 fps). Task instruction: "stack the cubes".
Trained with LeRobot (molmoact2-policy branch) for a smooth, RL-ready continuous policy.
Recipe
- Action-expert-only, continuous (VLM frozen; ~578M trainable / 5.4B total). No LoRA on the action expert.
- chunk_size=30, n_action_steps=30 (3 s lookahead at 10 fps, for long, smooth motion).
- setup_type="bimanual yam robotic arms in molmoact2", control_mode="absolute joint pose".
- bf16, 8 flow timesteps, action-expert lr 5e-5, cosine schedule (200 warmup steps), batch_size 8.
- Normalization reused from the base checkpoint's
yam_dual_molmoact2 tag (not recomputed on the 44-demo set, whose joint range is much narrower) for scale-consistent, smooth actions.
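The timing and schedule numbers in the recipe can be checked with a short sketch. It assumes linear warmup followed by cosine decay to zero over the 20k run; the card only states a cosine schedule with 200 warmup steps, so the exact shape used in training may differ. The names lookahead_seconds and lr_at are hypothetical helpers, not LeRobot APIs.

```python
import math

FPS = 10              # dataset frame rate (from the card)
CHUNK_SIZE = 30       # actions predicted per chunk (from the card)
PEAK_LR = 5e-5        # action-expert learning rate (from the card)
WARMUP_STEPS = 200    # warmup length (from the card)
TOTAL_STEPS = 20_000  # length of the 20k run (from the card)


def lookahead_seconds(chunk_size: int = CHUNK_SIZE, fps: int = FPS) -> float:
    """Time horizon covered by one action chunk."""
    return chunk_size / fps


def lr_at(step: int,
          peak_lr: float = PEAK_LR,
          warmup_steps: int = WARMUP_STEPS,
          total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at a step. ASSUMPTION: linear warmup, cosine decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    # Fraction of the post-warmup phase that has elapsed, clamped to [0, 1].
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("lookahead (s):", lookahead_seconds())
    for s in (0, 100, 200, 10_000, 20_000):
        print(s, lr_at(s))
```

With n_action_steps equal to chunk_size, each predicted chunk is executed in full before the next prediction, so the policy replans once every 3 s at 10 fps.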
Checkpoints
- 20k_run/checkpoint_020000/: 20,000-step run, final loss ≈ 0.009.
- 40k_run/checkpoint_005000 … checkpoint_040000/: 40,000-step run, saved every 5k steps.
Each folder is a LeRobot pretrained_model (weights + pre/post processors). There is no
simulator for YAM, so evaluation is on hardware: pick the best checkpoint on the physical robot.
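A minimal sketch for enumerating the 40k-run folders and fetching a single one for evaluation. It assumes the folders follow the zero-padded checkpoint_{step:06d} pattern suggested by the names above, and it uses snapshot_download from huggingface_hub; checkpoint_dirs_40k and download_checkpoint are hypothetical helper names. Loading the folder into a policy depends on the molmoact2-policy branch and is not shown here.

```python
import os

REPO_ID = "atharva-pantheon/MolmoAct2-BimanualYAM-stackcube"


def checkpoint_dirs_40k(first: int = 5_000, last: int = 40_000, every: int = 5_000):
    """Folder names for the 40k run. ASSUMPTION: zero-padded 6-digit step names."""
    return [f"40k_run/checkpoint_{step:06d}" for step in range(first, last + 1, every)]


def download_checkpoint(subdir: str, repo_id: str = REPO_ID) -> str:
    """Download only one checkpoint folder and return its local path."""
    # Imported lazily so that listing folders does not require huggingface_hub.
    from huggingface_hub import snapshot_download

    # allow_patterns restricts the download to files under the chosen folder.
    root = snapshot_download(repo_id=repo_id, allow_patterns=[f"{subdir}/*"])
    return os.path.join(root, subdir)


if __name__ == "__main__":
    for name in checkpoint_dirs_40k():
        print(name)
    # Example (needs network access):
    # local_dir = download_checkpoint("20k_run/checkpoint_020000")
```

Fetching one folder at a time keeps disk use down when comparing several checkpoints on the robot.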
Training loss (20k run)
Flow-matching loss decays smoothly from 0.093 to 0.009 (log scale), flattening after ~15k steps,
with no instability or spikes. The 40k-run curve will be added under assets/ when that run completes.