Maniskill_gen_new: Data Collection for Compositional Generalization
This repository contains the data collection codebase used to generate training data for studying compositional language grounding in robotic manipulation. It is built on top of ManiSkill and provides a suite of pairwise conflict experiments designed to measure which linguistic factors a policy prioritizes when given conflicting instructions.
Design Overview
Core Concept: Pairwise Conflict Experiments
Each experiment places two linguistic factors in conflict. For example, in verb_size the instruction might be "Push the smaller cube", where the verb (push) and the size (smaller) cue different target objects, so the robot must pick one factor to follow. By varying which factor combinations are seen during training (via the data collection strategy), we can bias the policy toward different grounding behaviors.
The custom ManiSkill environment VerbObjectColor-v1 (in mani_skill/envs/tasks/tabletop/verb_object_color_env.py) implements all 10 pairwise experiments with a five-factor vocabulary (up to six values per factor):
| Factor | Values |
|---|---|
| Verb | lift, grasp, push, pull, rotate, slide |
| Color | red, yellow, blue, orange, green, black |
| Shape | cube, sphere, cup, car, pyramid, star |
| Spatial | left, right, middle, front, behind |
| Size | small, large, smaller, larger, smallest, largest |
10 Pairwise Experiment Types
| Experiment | Factor 1 | Factor 2 | Fixed |
|---|---|---|---|
| verb_color | verb | color | shape=cube |
| verb_object | verb | shape | color=red |
| color_object | color | shape | verb=lift |
| verb_size | verb | size | color=red, shape=cube |
| verb_spatial | verb | spatial | color=red, shape=cube |
| color_size | color | size | verb=lift, shape=cube |
| color_spatial | color | spatial | verb=lift, shape=cube |
| spatial_size | spatial | size | verb=lift, color=red, shape=cube |
| size_object | size | shape | verb=lift, color=red |
| spatial_object | spatial | shape | verb=lift, color=red |
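The experiment table can be mirrored as a small lookup structure, which also makes it easy to check that the 10 experiments are exactly the 10 unordered pairs of the five factors. This is a minimal sketch: `PAIRWISE_EXPERIMENTS`, `FACTORS`, and `covers_all_pairs` are hypothetical names for illustration, not identifiers from the repository.

```python
from itertools import combinations

# Hypothetical mirror of the experiment table; not the repository's own data structure.
FACTORS = ("verb", "color", "shape", "spatial", "size")

PAIRWISE_EXPERIMENTS = {
    "verb_color": {"pair": ("verb", "color"), "fixed": {"shape": "cube"}},
    "verb_object": {"pair": ("verb", "shape"), "fixed": {"color": "red"}},
    "color_object": {"pair": ("color", "shape"), "fixed": {"verb": "lift"}},
    "verb_size": {"pair": ("verb", "size"), "fixed": {"color": "red", "shape": "cube"}},
    "verb_spatial": {"pair": ("verb", "spatial"), "fixed": {"color": "red", "shape": "cube"}},
    "color_size": {"pair": ("color", "size"), "fixed": {"verb": "lift", "shape": "cube"}},
    "color_spatial": {"pair": ("color", "spatial"), "fixed": {"verb": "lift", "shape": "cube"}},
    "spatial_size": {
        "pair": ("spatial", "size"),
        "fixed": {"verb": "lift", "color": "red", "shape": "cube"},
    },
    "size_object": {"pair": ("size", "shape"), "fixed": {"verb": "lift", "color": "red"}},
    "spatial_object": {"pair": ("spatial", "shape"), "fixed": {"verb": "lift", "color": "red"}},
}


def covers_all_pairs(experiments):
    """Return True if the experiments cover every unordered pair of FACTORS exactly."""
    expected = {frozenset(pair) for pair in combinations(FACTORS, 2)}
    actual = {frozenset(spec["pair"]) for spec in experiments.values()}
    return actual == expected
```

With five factors there are C(5, 2) = 10 unordered pairs, which matches the number of experiments listed.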
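The factor grid is a 6×6 matrix. Training coverage is controlled by --factor-count in {6, 12, 18}, which selects the set of (factor1_idx, factor2_idx) cells included in training demos. The number of cells each setting yields depends on the strategy; see the factor counts listed under Factor Grid Convention.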
All-Factor Experiment
collection_strategy/all_factor/ implements the full 5-factor experiment: 6 verbs × 6 colors × 6 shapes × 5 spatials × 4 sizes = 4320 cells (sampled via f50 strategies covering 50 cells).
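The cell count can be reproduced directly from the axis sizes quoted above. The sketch below is only arithmetic; `AXIS_SIZES` is a hypothetical name, and the repository's own enumeration may order or index cells differently.

```python
from itertools import product
from math import prod

# Axis sizes as stated for the all-factor experiment (hypothetical variable name).
AXIS_SIZES = {"verb": 6, "color": 6, "shape": 6, "spatial": 5, "size": 4}

# Total number of cells in the five-dimensional factor grid.
total_cells = prod(AXIS_SIZES.values())

# Every cell as a tuple of per-factor indices, in the key order of AXIS_SIZES.
all_cells = list(product(*(range(n) for n in AXIS_SIZES.values())))

# Fraction of the grid covered by an f50 strategy (50 cells).
f50_coverage = 50 / total_cells
```

An f50 strategy therefore touches a little over 1% of the full grid.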
Data Collection Strategies
Three families of strategies, plus one experimental hybrid, control which cells in the factor grid are included in training data:
Stair (Ours)
A staircase pattern that systematically traverses the diagonal of the factor grid. Each new cell maximally increases one factor's breadth while maintaining full coverage of the other. This provides structured curriculum-style coverage with high compositional exposure per episode.
- stair (Maniskill_gen_new convention, verb-first): diagonal staircase
- stair1: alternative anchor (shape/color-first anchor)
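To make the shape of a diagonal staircase concrete, here is a minimal sketch that walks a grid in alternating right and down steps. It is an illustration under assumptions, not the repository's STAIR index definition (that lives in lib/pairwise_factor_patterns.py): `staircase_cells` is a hypothetical helper, and a single-width staircase like this one yields at most 11 cells on a 6×6 grid.

```python
def staircase_cells(n_cells, grid_size=6):
    """Return up to n_cells (row, col) cells along a single-width diagonal staircase.

    The walk starts at (0, 0) and alternates one step right, one step down:
    (0, 0), (0, 1), (1, 1), (1, 2), (2, 2), ...
    Illustrative only; not the repository's pattern definition.
    """
    cells = []
    row, col = 0, 0
    step_right = True
    while len(cells) < n_cells and row < grid_size and col < grid_size:
        cells.append((row, col))
        if step_right:
            col += 1
        else:
            row += 1
        step_right = not step_right
    return cells
```

For example, `staircase_cells(5)` returns `[(0, 0), (0, 1), (1, 1), (1, 2), (2, 2)]`: each new cell adds one new row value or one new column value.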
L-Random (Lrandom)
A hybrid strategy combining a fixed L-shaped spine (high-information anchor cells) with random fills. Provides structured core coverage with randomized diversity.
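The spine-plus-fill idea can be sketched as follows. The spine geometry here (a short arm along the first row and first column) is an assumption made for illustration; the repository's actual LRANDOM indices are defined in lib/pairwise_factor_patterns.py, and `l_random_cells`, `arm`, and `seed` are hypothetical names.

```python
import random


def l_random_cells(n_cells, arm=2, grid_size=6, seed=0):
    """Return n_cells cells: a fixed L-shaped spine followed by random fills.

    The spine is assumed to be the first `arm` cells of row 0 plus the first
    `arm` cells of column 0 (sharing the corner). Illustrative only.
    """
    spine = [(0, c) for c in range(arm)] + [(r, 0) for r in range(1, arm)]
    if n_cells < len(spine):
        raise ValueError("n_cells must be at least the spine length")
    rng = random.Random(seed)  # seeded so the fill cells are reproducible
    remaining = [
        (r, c)
        for r in range(grid_size)
        for c in range(grid_size)
        if (r, c) not in spine
    ]
    fills = rng.sample(remaining, n_cells - len(spine))
    return spine + fills
```

Keeping the spine fixed and seeding only the fills means two runs with different seeds share the same anchor cells and differ only in the random part.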
Random
Uniform random sampling of cells from the 6×6 grid. Baseline strategy.
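A minimal sketch of uniform sampling without replacement from the grid is shown below. `random_cells` and its `seed` parameter are hypothetical; any sub-grid restriction the repository applies for small factor counts is not modelled here.

```python
import random


def random_cells(n_cells, grid_size=6, seed=0):
    """Sample n_cells distinct (row, col) cells uniformly from a square grid."""
    rng = random.Random(seed)  # seeded for reproducible baselines
    grid = [(r, c) for r in range(grid_size) for c in range(grid_size)]
    return sorted(rng.sample(grid, n_cells))
```

Because `random.Random.sample` draws without replacement, the returned cells are always distinct.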
StairRandom
Staircase cells combined with additional random cells. Experimental hybrid.
Repository Structure
collection_strategy/                          <- Core data collection code
├── lib/
│   ├── pairwise_factor_patterns.py           # STAIR, LRANDOM, L1 index definitions
│   ├── pairwise_strategies.py                # Strategy type enum
│   ├── pairwise_task_language.py             # Instruction string generation
│   ├── all_factor_support_f50.py             # All-factor f50 strategy definitions
│   └── all_factor_scene.py                   # All-factor scene setup
├── collect_pairwise_attribute.py             # Main collector: verb_size, verb_spatial,
│                                             #   color_size, spatial_size
├── collect_pairwise.py                       # Legacy: verb_color, verb_object, color_object
├── collect_verb_color_object.py              # VerbObjectColor triple-factor collector
├── collect_all_factor.py                     # All-5-factor collector
├── convert_maniskill_to_lerobot.py           # Convert H5 demos -> LeRobot dataset
├── build_attribute_task_map.py               # Build task language map from filenames
├── {verb_color, verb_object, color_object, ...}/
│   ├── collect.py                            # Thin wrapper calling main collector
│   └── collect_convert.sh                    # Full pipeline: collect -> replay -> LeRobot
└── all_factor/
    ├── collect.py
    └── collect_convert.sh
conflict_experiment/                          <- Conflict-eval specific collection
├── collect_conflict.py                       # Conflict eval data collector
├── collect_conflict_attribute.py             # Attribute-version collector
└── lib/conflict_sampling.py                  # Conflict pair sampling logic
scripts/
├── run_verb_color_shape_motion_planning.py   # Core motion planning runner
└── ...                                       # Utility scripts
asset/                                        <- Custom 3D object meshes
├── cup.obj                                   # Cup mesh
├── car.obj                                   # Car mesh
└── can.obj                                   # Can mesh
mani_skill/                                   <- Modified ManiSkill framework
└── envs/tasks/tabletop/
    └── verb_object_color_env.py              # Custom VerbObjectColor-v1 environment
Pipeline
Each experiment follows a 3-stage pipeline:
1. COLLECT -> Motion-planning demos saved as .h5 (no RGB obs)
2. REPLAY -> Replay trajectories with RGB rendering -> *_rgb.h5
3. CONVERT -> Convert to LeRobot parquet format -> HuggingFace dataset
The collect_convert.sh in each experiment folder automates all three stages.
Quick Start
cd /path/to/Maniskill_gen_new
# Collect 200 demos for verb_size with Staircase-f18 strategy
STRATEGY=stair FACTOR_COUNT=18 NUM_DEMOS=200 \
bash collection_strategy/verb_size/collect_convert.sh /path/to/output
# Collect only (no replay/convert):
python -m collection_strategy.collect_pairwise_attribute \
--experiment verb_size \
--strategy stair \
--factor-count 18 \
--num-demos 200 \
--record-base /path/to/output
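When sweeping several settings, the collect-only command above can be assembled programmatically. This is a sketch that only builds and prints the command lines using the flags shown above; `build_collect_command` is a hypothetical helper, and nothing is executed.

```python
import shlex


def build_collect_command(experiment, strategy, factor_count, num_demos, record_base):
    """Build the argv list for the collect-only invocation shown above."""
    return [
        "python", "-m", "collection_strategy.collect_pairwise_attribute",
        "--experiment", experiment,
        "--strategy", strategy,
        "--factor-count", str(factor_count),
        "--num-demos", str(num_demos),
        "--record-base", record_base,
    ]


# Print one command per factor-count setting; pass each list to subprocess.run to execute.
for factor_count in (6, 12, 18):
    cmd = build_collect_command("verb_size", "stair", factor_count, 200, "/path/to/output")
    print(shlex.join(cmd))
```

Building an argv list rather than a shell string avoids quoting problems if the output path contains spaces.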
All-Factor Collection
STRATEGY=Lrandom_pure NUM_DEMOS=800 \
bash collection_strategy/all_factor/collect_convert.sh /path/to/output
Factor Grid Convention (Maniskill_gen_new)
Index convention for pairwise experiments:
| Experiment | Row (factor1_idx) | Col (factor2_idx) |
|---|---|---|
| verb_color | color_idx | verb_idx |
| verb_object | shape_idx | verb_idx |
| color_object | color_idx | shape_idx |
| verb_size | size_idx | verb_idx |
| verb_spatial | spatial_idx | verb_idx |
| color_size | color_idx | size_idx |
| spatial_size | spatial_idx | size_idx |
Factor counts (Maniskill_gen_new convention):
- stair: f6 = 5 cells, f12 = 10 cells, f18 = 15 cells
- Lrandom: f6 = 4 cells, f12 = 9 cells, f18 = 14 cells
- random: f6 = 3 cells (2x2 subgrid), f12 = 6 cells, f18 = 9 cells
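The index convention and the factor counts above can be captured as two lookup tables. The sketch below only transcribes the values listed in this section; `GRID_AXES`, `CELLS_PER_SETTING`, `grid_cell`, and `coverage_fraction` are hypothetical names, and the 36-cell default assumes the 6×6 grid.

```python
# Row and column factor for each experiment, transcribed from the table above.
GRID_AXES = {
    "verb_color": ("color", "verb"),
    "verb_object": ("shape", "verb"),
    "color_object": ("color", "shape"),
    "verb_size": ("size", "verb"),
    "verb_spatial": ("spatial", "verb"),
    "color_size": ("color", "size"),
    "spatial_size": ("spatial", "size"),
}

# Number of grid cells per strategy and --factor-count, transcribed from the list above.
CELLS_PER_SETTING = {
    "stair": {6: 5, 12: 10, 18: 15},
    "Lrandom": {6: 4, 12: 9, 18: 14},
    "random": {6: 3, 12: 6, 18: 9},
}


def grid_cell(experiment, indices):
    """Map per-factor indices, e.g. {"verb": 2, "size": 1}, to a (row, col) cell."""
    row_factor, col_factor = GRID_AXES[experiment]
    return indices[row_factor], indices[col_factor]


def coverage_fraction(strategy, factor_count, grid_cells=36):
    """Fraction of the grid covered by a given strategy and factor count."""
    return CELLS_PER_SETTING[strategy][factor_count] / grid_cells
```

For verb_size, `grid_cell("verb_size", {"verb": 2, "size": 1})` gives `(1, 2)`, since the row index is the size index and the column index is the verb index.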
Setup
# Install ManiSkill (modified version)
conda create -n maniskill39 python=3.9
conda activate maniskill39
pip install -e .
# For LeRobot conversion
conda activate openpi # or any env with lerobot
pip install lerobot
Related
- Evaluation code: yqi19/evaluation_pi0_pi05
- Training datasets: yqi19/data_05_17
- ManiSkill: haosulab/ManiSkill