Vision: SO-ARM 101 Toy-Sorting Pipeline
End Goal
Train a manipulation policy that picks up colored toy objects and drops them into matching colored trays, using imitation learning from teleoperated demos recorded in Isaac Sim.
Pipeline
┌─────────────────────────────────────────────────────────────────────┐
│ 1. Simulate                                                         │
│    Isaac Lab ToySortingEnv (Python 3.11 / Isaac Lab 2.3.2)          │
│    • Wooden table + SO-ARM 101                                      │
│    • 3 colored trays (red | green | blue)                           │
│    • 9 colored toys (3 per color, randomized each episode)          │
└─────────────────┬───────────────────────────────────────────────────┘
                  │  Phase 2: ZMQ REP :5555
                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│ 2. Collect Demos                                                    │
│    LeRobot / SpaceMouse teleop client (Python 3.12)                 │
│    • Sends joint targets to sim via ZMQ                             │
│    • Streams obs/actions into LeRobot Dataset v3 format             │
│    • Pushes dataset to HuggingFace Hub                              │
└─────────────────┬───────────────────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│ 3. Augment Dataset (optional)                                       │
│    • Background swap, color jitter, domain randomization            │
│    • Re-label with reward signal for RL fine-tuning                 │
└─────────────────┬───────────────────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│ 4. Train Policy                                                     │
│    lerobot train policy=act (or diffusion_policy)                   │
│    • Loads dataset from HuggingFace Hub                             │
│    • Saves checkpoint to Hub                                        │
└─────────────────┬───────────────────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│ 5. Evaluate in Sim                                                  │
│    • Roll out policy in ToySortingEnv                               │
│    • Log success rate (sort 3 toys correctly in <60 s)              │
│    • Push eval metrics to HuggingFace Hub                           │
└─────────────────────────────────────────────────────────────────────┘
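The per-episode randomization in step 1 (9 toys, 3 per color) could be sketched roughly as below. This is a standalone illustration, not the actual `ToySortingEnv` code: the function names, the toy naming scheme, and the spawn ranges are all hypothetical, and the real environment would place USD prims through Isaac Lab rather than return plain dictionaries.

```python
import random

COLORS = ("red", "green", "blue")
TOYS_PER_COLOR = 3

# Hypothetical spawn region on the table top, in metres (not from the project).
X_RANGE = (0.10, 0.40)
Y_RANGE = (-0.25, 0.25)


def sample_toy_layout(seed: int) -> list[dict]:
    """Return 9 toys (3 per color) with seeded, randomized order and positions."""
    rng = random.Random(seed)  # one RNG per episode keeps layouts reproducible
    toys = [
        {"name": f"{color}_toy_{i}", "color": color}
        for color in COLORS
        for i in range(TOYS_PER_COLOR)
    ]
    rng.shuffle(toys)
    for toy in toys:
        toy["xy"] = (rng.uniform(*X_RANGE), rng.uniform(*Y_RANGE))
    return toys


def is_sorted(toy_to_tray: dict[str, str], toys: list[dict]) -> bool:
    """True when every toy sits in the tray whose color matches its own."""
    return all(toy_to_tray.get(t["name"]) == t["color"] for t in toys)
```

Seeding each episode separately means an evaluation run over a fixed list of seeds sees the same layouts every time, which makes success rates comparable across checkpoints.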
Container Architecture
┌──────────────────────────────────────┐     ┌──────────────────────────────────────┐
│ sim (isaac-lab:2.3.2, Python 3.11)   │     │ train (python:3.12-slim + lerobot)   │
│ Isaac Lab ToySortingEnv              │◄───►│ LeRobot training / data collection   │
│ ZMQ REP server :5555 (Phase 2)       │ ZMQ │ IsaacGymClient gymnasium wrapper     │
│ X11 GUI for visualization            │     │ HuggingFace dataset push             │
└──────────────────────────────────────┘     └──────────────────────────────────────┘
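Because the two containers run different Python versions, the bridge between them needs a wire format that does not depend on shared Python objects. The sketch below shows one possible JSON framing for a single request/reply exchange. The message fields (`cmd`, `joint_targets`, `obs`), the function names, and the joint count of 6 are assumptions made for illustration, not the project's actual protocol. The transport itself is left out so the example runs with the standard library only.

```python
import json

NUM_JOINTS = 6  # assumption: 5 arm joints plus gripper


def encode_step_request(joint_targets: list[float]) -> bytes:
    """Client side: pack one set of joint targets into a request frame."""
    if len(joint_targets) != NUM_JOINTS:
        raise ValueError(f"expected {NUM_JOINTS} joint targets")
    payload = {"cmd": "step", "joint_targets": joint_targets}
    return json.dumps(payload).encode("utf-8")


def handle_request(raw: bytes, step_fn) -> bytes:
    """Sim side: decode one request, step the env, and build exactly one reply.

    step_fn is a stand-in for whatever advances the simulation and returns
    a JSON-serializable observation.
    """
    msg = json.loads(raw.decode("utf-8"))
    if msg.get("cmd") != "step":
        return json.dumps({"ok": False, "error": "unknown cmd"}).encode("utf-8")
    obs = step_fn(msg["joint_targets"])
    return json.dumps({"ok": True, "obs": obs}).encode("utf-8")
```

With pyzmq, the client would send these bytes over a `zmq.REQ` socket connected to port 5555 and the sim would receive them on the bound `zmq.REP` socket. The REQ/REP pattern enforces strict alternation, so the sim must answer every request (including malformed ones) before it can receive the next, which is why `handle_request` always returns a reply.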
Phases
| Phase | Status | Description |
|---|---|---|
| 1 | Done | Isaac Lab env with real USD assets; X11 visualization |
| 2 | Scaffolded | ZMQ bridge + LeRobot demo collection client |
| 3 | Planned | Dataset augmentation pipeline |
| 4 | Planned | ACT / Diffusion Policy training with LeRobot |
| 5 | Planned | Closed-loop evaluation + HF metrics push |
Asset Strategy
Assets live outside git (large binary files). Two distribution paths:
- Developer machine: `python assets/download.py --extract` copies the needed USD files from the local `Lightwheel_Xx8T7EPOMd_KitchenRoom/` pack.
- Docker / CI: `python assets/download.py --download` fetches the pre-extracted subset from HuggingFace Hub (`HF_ASSET_REPO` env var).
Neither the git repo nor the Docker image contains asset files directly.
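The two modes of `assets/download.py` could be wired up along the lines below. This is a guess at the CLI shape based only on the flags and the `HF_ASSET_REPO` variable named above; the helper names and the returned source strings are hypothetical, and the actual copy and download logic is omitted.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """CLI with exactly one of --extract or --download required."""
    parser = argparse.ArgumentParser(prog="assets/download.py")
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--extract", action="store_true",
                      help="copy needed USD files from the local asset pack")
    mode.add_argument("--download", action="store_true",
                      help="fetch the pre-extracted subset from the Hub")
    return parser


def resolve_source(args: argparse.Namespace, env: dict) -> str:
    """Decide where assets come from; env is passed in to keep this testable."""
    if args.extract:
        return "local-pack"
    repo = env.get("HF_ASSET_REPO")
    if not repo:
        # Fail early in Docker / CI instead of partway through a build.
        raise SystemExit("HF_ASSET_REPO must be set for --download")
    return f"hub:{repo}"
```

In a real script, `resolve_source(build_parser().parse_args(), os.environ)` would run at startup. Making the two flags mutually exclusive keeps a CI job from silently falling back to a local pack that does not exist in the image.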
Success Metric (Phase 5 target)
Place all 9 toys into their correct color-matched tray within 60 seconds, measured over 50 random seeds. Target success rate ≥ 80%.
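As a worked illustration of this metric, the check below counts an episode as a success only if all 9 toys are correct and the episode finished within 60 seconds, then compares the fraction of successful seeds to the 80% target. The `EpisodeResult` record and function names are hypothetical, since the evaluation harness is still planned.

```python
from dataclasses import dataclass

NUM_TOYS = 9
TIME_LIMIT_S = 60.0
TARGET_RATE = 0.80


@dataclass(frozen=True)
class EpisodeResult:
    seed: int
    toys_correct: int   # toys resting in their color-matched tray
    elapsed_s: float    # time taken by the episode, in seconds


def episode_success(r: EpisodeResult) -> bool:
    """All toys sorted correctly and within the time limit."""
    return r.toys_correct == NUM_TOYS and r.elapsed_s <= TIME_LIMIT_S


def success_rate(results: list[EpisodeResult]) -> float:
    """Fraction of episodes that count as a success."""
    if not results:
        raise ValueError("no episodes to evaluate")
    return sum(episode_success(r) for r in results) / len(results)


def meets_target(results: list[EpisodeResult]) -> bool:
    return success_rate(results) >= TARGET_RATE
```

Over 50 seeds, the target corresponds to at least 40 fully successful episodes. A partially sorted episode (for example 8 of 9 toys) counts as a failure under this definition.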