Genie-Envisioner Inference: ManiSkill Conflict Experiments
This repository contains the inference code and environment for evaluating Genie-Envisioner (GE-Act) on OOD conflict experiments in the ManiSkill simulation framework.
In a conflict experiment the robot receives an instruction that names two different object attributes (e.g., "Lift the red cube"), but the scene contains two objects that each satisfy only one attribute (one is red but not a cube; the other is a cube but not red). By recording which object the robot lifts across many such trials we can measure the model's Factor Dominance Rate (FDR), a behavioural bias metric for language-conditioned robot manipulation.
Repository Structure
genie-inference-maniskill/
├── genie_envisioner/                    # GE-Act inference code
│   ├── models/                          # MVActorModel architecture
│   ├── runner/                          # Inference runner (rollout loop)
│   ├── utils/                           # Shared utilities
│   ├── configs/
│   │   └── ltx_model/conflict/          # Per-experiment configs + action stats
│   ├── conflict_main.py                 # Main rollout script (single pair or batch)
│   ├── run_ood_experiment_inference.sh  # Batch OOD evaluation script
│   ├── setup_maniskill_env.sh           # Conda environment setup
│   ├── requirements.txt                 # Python dependencies
│   └── eval_conflict.md                 # Detailed evaluation guide
│
└── maniskill_conflict/                  # ManiSkill conflict environment
    ├── mani_skill/                      # Modified ManiSkill package
    │   ├── envs/tasks/                  # VerbObjectColor-v1 conflict task
    │   └── assets/                      # Robot and scene assets
    ├── conflict_experiment/             # Experiment utilities (pair generation, etc.)
    ├── setup.py
    └── pyproject.toml
Quick Start
1. Clone this repository
git clone https://huggingface.co/yqi19/genie-inference-maniskill
cd genie-inference-maniskill
2. Set up the conda environment
bash genie_envisioner/setup_maniskill_env.sh
conda activate genie_envisioner
3. Download LTX-Video (required backbone)
GE-Act uses LTX-Video as its video generation backbone:
git clone https://huggingface.co/Lightricks/LTX-Video /path/to/LTX-Video
4. Obtain a GE-Act checkpoint
Checkpoints are structured as <experiment>/step_<N>/ directories containing
config.json and diffusion_pytorch_model.safetensors. For example:
checkpoints/
└── color_object/
    └── step_30000/
        ├── config.json
        └── diffusion_pytorch_model.safetensors
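If you have several training snapshots per experiment, a small helper can pick the latest one to pass as `WEIGHT`. This is a sketch of our own (the helper name is ours, not part of the repo; the scripts themselves just take the step directory path directly):

```python
from pathlib import Path

def latest_step_dir(experiment_dir: str) -> Path:
    """Return the step_<N> subdirectory with the highest step count.

    Illustrative helper only: the batch script expects the step directory
    itself via the WEIGHT variable, e.g. checkpoints/color_object/step_30000.
    """
    root = Path(experiment_dir)
    steps = [d for d in root.iterdir()
             if d.is_dir() and d.name.startswith("step_")]
    if not steps:
        raise FileNotFoundError(f"no step_<N> directories under {root}")
    # Sort numerically on the suffix, not lexicographically on the name.
    return max(steps, key=lambda d: int(d.name.split("_")[1]))
```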
5. Run an OOD conflict evaluation
WEIGHT=/path/to/checkpoints/color_object/step_30000 \
LTX_MODEL=/path/to/LTX-Video \
conda run -n genie_envisioner \
bash genie_envisioner/run_ood_experiment_inference.sh \
color_object \
42 \
200 \
results/color_object_ood.txt
Supported Experiments
| Experiment | Factor A | Factor B | Description |
|---|---|---|---|
| color_object | color | shape | Red object vs. cube: which does the model lift? |
| color_size | color | size | Coloured vs. sized object |
| color_spatial | color | spatial position | Coloured vs. positioned object |
| size_object | size | shape | Sized vs. shaped object |
| spatial_object | spatial position | shape | Positioned vs. shaped object |
| spatial_size | spatial position | size | Positioned vs. sized object |
| verb_color | verb | color | Verb-defined vs. coloured target |
| verb_object | verb | shape | Verb-defined vs. shaped target |
| verb_size | verb | size | Verb-defined vs. sized target |
| verb_spatial | verb | spatial position | Verb-defined vs. positioned target |
Factor Dominance Rate (FDR)
FDR measures how strongly a model is biased toward one factor over another:
FDR(f1, f2) = (S_f1 - S_f2) / (S_f1 + S_f2 + ε) ∈ [-1, +1]
where S_f1 and S_f2 are success rates on factor-1-instruction runs and factor-2-instruction runs respectively, and ε is a small constant for numerical stability.
A positive FDR indicates f1 dominance; negative indicates f2 dominance; 0 indicates no bias.
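The formula translates directly to code. A minimal sketch (the function name and the default ε value are our choices, not fixed by the repo):

```python
def factor_dominance_rate(s_f1: float, s_f2: float, eps: float = 1e-6) -> float:
    """FDR(f1, f2) = (S_f1 - S_f2) / (S_f1 + S_f2 + eps), in [-1, +1].

    s_f1 and s_f2 are success rates in [0, 1]; eps prevents division
    by zero when both success rates are 0.
    """
    return (s_f1 - s_f2) / (s_f1 + s_f2 + eps)
```

For example, a model that lifts the factor-1 target in 80% of factor-1 runs but the factor-2 target in only 20% of factor-2 runs yields an FDR of about +0.6, a strong factor-1 bias.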
Environment Details
VerbObjectColor-v1
The conflict environment (mani_skill/envs/tasks/) is a modified version of ManiSkill's
tabletop manipulation task. Key properties:
- Two objects placed at fixed or randomised positions; each satisfies one factor
- Language instruction generated from the experiment's factor pair
- Success criterion: robot lifts the target object above a threshold height
- Dual success tracking: separate success signals for factor-A-target and factor-B-target objects per episode
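The dual success signals are what feed the FDR computation: each episode ends with the robot having lifted the factor-A target, the factor-B target, or neither. A sketch of the tallying, assuming per-episode boolean flags (the field names here are hypothetical; the environment exposes its own per-factor success signals):

```python
from collections import Counter

def tally_outcomes(episodes) -> Counter:
    """Count which object was lifted across a list of episode records.

    Each record is a dict with hypothetical boolean fields
    'success_factor_a' and 'success_factor_b'.
    """
    counts = Counter()
    for ep in episodes:
        if ep["success_factor_a"]:
            counts["factor_a"] += 1
        elif ep["success_factor_b"]:
            counts["factor_b"] += 1
        else:
            counts["neither"] += 1  # lifted nothing, or failed the height check
    return counts
```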
Observation space
- agent/qpos, agent/qvel: proprioceptive joint state
- sensor_data/base_camera/rgb: 256×256 RGB camera image
- sensor_data/base_camera/depth: depth image
- Language instruction string
Action space
8-dimensional joint position control (7 DOF robot arm joints + gripper).
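As a concrete illustration of the action layout, a minimal sketch (the helper and the clipping bounds are ours; the environment's action space defines the real limits):

```python
import numpy as np

def make_action(joint_targets, gripper, low=-1.0, high=1.0) -> np.ndarray:
    """Assemble an 8-D action: 7 arm joint position targets + 1 gripper command.

    low/high are placeholder bounds; clip against the environment's
    actual action space limits in practice.
    """
    action = np.concatenate([
        np.asarray(joint_targets, dtype=np.float32),   # 7 arm joints
        np.array([gripper], dtype=np.float32),         # gripper open/close
    ])
    assert action.shape == (8,), "expected 7 joint targets + 1 gripper value"
    return np.clip(action, low, high)
```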
Running Multiple Experiments
WEIGHT_ROOT=/path/to/checkpoints
LTX_MODEL=/path/to/LTX-Video
for EXP in color_object color_size color_spatial size_object spatial_object \
spatial_size verb_color verb_object verb_size verb_spatial; do
WEIGHT="${WEIGHT_ROOT}/${EXP}/step_30000" \
LTX_MODEL="${LTX_MODEL}" \
conda run -n genie_envisioner \
bash genie_envisioner/run_ood_experiment_inference.sh \
"${EXP}" 42 200 "results/genie_${EXP}_seed42.txt"
done
Detailed Documentation
See genie_envisioner/eval_conflict.md for:
- Full environment variable reference
- Single-pair debugging commands
- Output file format description
- Manual checkpoint loading example
- Config file overview
- GPU memory requirements
Requirements
- Python 3.10
- CUDA 12.4 compatible GPU (≥16 GB VRAM recommended; RTX 4090 tested)
- Conda
- ~15 GB disk space for dependencies + LTX-Video backbone
Key Python packages:
- torch==2.6.0+cu124
- diffusers==0.32.0
- transformers==4.51.3
- safetensors==0.6.2
- mani_skill (from maniskill_conflict/ in this repo)
Citation
If you use this code or the conflict experiment framework, please cite:
@inproceedings{genie_envisioner,
title = {Genie-Envisioner: ...},
author = {...},
booktitle = {...},
year = {2025},
}