
Genie-Envisioner Inference: ManiSkill Conflict Experiments

This repository contains the inference code and environment for evaluating Genie-Envisioner (GE-Act) on out-of-distribution (OOD) conflict experiments in the ManiSkill simulation framework.

In a conflict experiment the robot receives an instruction that names two different object attributes (e.g., "Lift the red cube"), but the scene contains two objects that each satisfy only one attribute: one is red but not a cube; the other is a cube but not red. By recording which object the robot lifts across many such trials, we can measure the model's Factor Dominance Rate (FDR), a behavioural bias metric for language-conditioned robot manipulation.
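Concretely, each trial ends in one of three outcomes, and per-factor success rates are tallied over a batch of trials. A minimal sketch of that bookkeeping (the outcome labels here are illustrative, not the repository's actual logging format):

```python
from collections import Counter

def success_rates(outcomes):
    """Tally conflict-trial outcomes into per-factor success rates.

    Each outcome is one of (illustrative labels):
      "factor_a" - the robot lifted the object matching attribute A (e.g. the red non-cube)
      "factor_b" - the robot lifted the object matching attribute B (e.g. the non-red cube)
      "none"     - the robot lifted neither object
    """
    counts = Counter(outcomes)
    n = len(outcomes)
    return counts["factor_a"] / n, counts["factor_b"] / n

s_a, s_b = success_rates(["factor_a", "factor_a", "factor_b", "none"])
# s_a = 0.5, s_b = 0.25
```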


Repository Structure

genie-inference-maniskill/
├── genie_envisioner/               # GE-Act inference code
│   ├── models/                     # MVActorModel architecture
│   ├── runner/                     # Inference runner (rollout loop)
│   ├── utils/                      # Shared utilities
│   ├── configs/
│   │   └── ltx_model/conflict/     # Per-experiment configs + action stats
│   ├── conflict_main.py            # Main rollout script (single pair or batch)
│   ├── run_ood_experiment_inference.sh  # Batch OOD evaluation script
│   ├── setup_maniskill_env.sh      # Conda environment setup
│   ├── requirements.txt            # Python dependencies
│   └── eval_conflict.md            # Detailed evaluation guide
│
└── maniskill_conflict/             # ManiSkill conflict environment
    ├── mani_skill/                 # Modified ManiSkill package
    │   ├── envs/tasks/             # VerbObjectColor-v1 conflict task
    │   └── assets/                 # Robot and scene assets
    ├── conflict_experiment/        # Experiment utilities (pair generation, etc.)
    ├── setup.py
    └── pyproject.toml

Quick Start

1. Clone this repository

git clone https://huggingface.co/yqi19/genie-inference-maniskill
cd genie-inference-maniskill

2. Set up the conda environment

bash genie_envisioner/setup_maniskill_env.sh
conda activate genie_envisioner

3. Download LTX-Video (required backbone)

GE-Act uses LTX-Video as its video generation backbone:

git clone https://huggingface.co/Lightricks/LTX-Video /path/to/LTX-Video

4. Obtain a GE-Act checkpoint

Checkpoints are structured as <experiment>/step_<N>/ directories containing config.json and diffusion_pytorch_model.safetensors. For example:

checkpoints/
└── color_object/
    └── step_30000/
        β”œβ”€β”€ config.json
        └── diffusion_pytorch_model.safetensors
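Before launching a rollout it is worth checking that the checkpoint directory actually has this shape. A small helper sketch (hypothetical, not part of the repository):

```python
from pathlib import Path

# Files a GE-Act step_<N>/ directory is expected to contain.
REQUIRED_FILES = ("config.json", "diffusion_pytorch_model.safetensors")

def is_valid_checkpoint(step_dir: str) -> bool:
    """Return True if step_dir contains the files a GE-Act checkpoint needs."""
    d = Path(step_dir)
    return d.is_dir() and all((d / f).is_file() for f in REQUIRED_FILES)
```

Running this before a long batch job catches a mistyped `WEIGHT` path early.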

5. Run an OOD conflict evaluation

WEIGHT=/path/to/checkpoints/color_object/step_30000 \
LTX_MODEL=/path/to/LTX-Video \
conda run -n genie_envisioner \
    bash genie_envisioner/run_ood_experiment_inference.sh \
        color_object \
        42 \
        200 \
        results/color_object_ood.txt

Supported Experiments

Experiment       Factor A          Factor B          Description
color_object     color             shape             Red object vs. cube: which does the model lift?
color_size       color             size              Coloured vs. sized object
color_spatial    color             spatial position  Coloured vs. positioned object
size_object      size              shape             Sized vs. shaped object
spatial_object   spatial position  shape             Positioned vs. shaped object
spatial_size     spatial position  size              Positioned vs. sized object
verb_color       verb              color             Verb-defined vs. coloured target
verb_object      verb              shape             Verb-defined vs. shaped target
verb_size        verb              size              Verb-defined vs. sized target
verb_spatial     verb              spatial position  Verb-defined vs. positioned target
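The ten experiment names follow a `<factorA>_<factorB>` pattern, so a batch driver can be written against a simple mapping. A sketch of that mapping (the pairs mirror the table above; the dict itself is illustrative, not an artifact of the repo):

```python
# Experiment name -> (factor A, factor B), mirroring the table above.
EXPERIMENTS = {
    "color_object":   ("color", "shape"),
    "color_size":     ("color", "size"),
    "color_spatial":  ("color", "spatial position"),
    "size_object":    ("size", "shape"),
    "spatial_object": ("spatial position", "shape"),
    "spatial_size":   ("spatial position", "size"),
    "verb_color":     ("verb", "color"),
    "verb_object":    ("verb", "shape"),
    "verb_size":      ("verb", "size"),
    "verb_spatial":   ("verb", "spatial position"),
}
```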

Factor Dominance Rate (FDR)

FDR measures how strongly a model is biased toward one factor over another:

FDR(f1, f2) = (S_f1 - S_f2) / (S_f1 + S_f2 + ε)  ∈ [-1, +1]

where S_f1 and S_f2 are success rates on factor-1-instruction runs and factor-2-instruction runs respectively, and ε is a small constant for numerical stability.

A positive FDR indicates f1 dominance; negative indicates f2 dominance; 0 indicates no bias.
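The formula translates directly into code; a minimal implementation with the same ε convention:

```python
def fdr(s_f1: float, s_f2: float, eps: float = 1e-8) -> float:
    """Factor Dominance Rate in [-1, +1].

    s_f1, s_f2: success rates on factor-1 and factor-2 instruction runs.
    eps keeps the ratio defined when both success rates are zero.
    """
    return (s_f1 - s_f2) / (s_f1 + s_f2 + eps)

fdr(0.8, 0.2)   # strong f1 dominance, close to +0.6
fdr(0.5, 0.5)   # no bias, close to 0.0
```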


Environment Details

VerbObjectColor-v1

The conflict environment (mani_skill/envs/tasks/) is a modified version of ManiSkill's tabletop manipulation task. Key properties:

  • Two objects placed at fixed or randomised positions; each satisfies one factor
  • Language instruction generated from the experiment's factor pair
  • Success criterion: robot lifts the target object above a threshold height
  • Dual success tracking: separate success signals for factor-A-target and factor-B-target objects per episode
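The success criterion and dual tracking can be sketched as a pair of height checks (the threshold value below is a placeholder; the real value is defined in the task code):

```python
def lifted(obj_z: float, rest_z: float, threshold: float = 0.04) -> bool:
    """Success check: object centre rose above its rest height by a margin.

    threshold (metres) is a placeholder; the actual value lives in the task.
    """
    return (obj_z - rest_z) > threshold

def episode_success(obj_a_z: float, obj_b_z: float, rest_z: float) -> dict:
    """Dual success tracking: evaluate the same check for both candidate objects."""
    return {"factor_a": lifted(obj_a_z, rest_z),
            "factor_b": lifted(obj_b_z, rest_z)}
```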

Observation space

  • agent/qpos, agent/qvel: proprioceptive joint state
  • sensor_data/base_camera/rgb: 256×256 RGB camera image
  • sensor_data/base_camera/depth: depth image
  • Language instruction string

Action space

8-dimensional joint position control: 7 position targets for the robot arm joints plus 1 gripper dimension.
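In other words, each action is a flat 8-vector. A sketch of assembling one (function and argument names are hypothetical):

```python
def make_action(arm_joint_targets, gripper_target):
    """Pack 7 arm joint position targets and 1 gripper target into one action.

    Both arguments are in the controller's joint-position units; the exact
    ranges are defined by the ManiSkill controller config, not here.
    """
    assert len(arm_joint_targets) == 7, "expected 7 arm joint targets"
    return list(arm_joint_targets) + [float(gripper_target)]

action = make_action([0.0] * 7, 1.0)   # an 8-element action vector
```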


Running Multiple Experiments

WEIGHT_ROOT=/path/to/checkpoints
LTX_MODEL=/path/to/LTX-Video

for EXP in color_object color_size color_spatial size_object spatial_object \
           spatial_size verb_color verb_object verb_size verb_spatial; do
    WEIGHT="${WEIGHT_ROOT}/${EXP}/step_30000" \
    LTX_MODEL="${LTX_MODEL}" \
    conda run -n genie_envisioner \
        bash genie_envisioner/run_ood_experiment_inference.sh \
            "${EXP}" 42 200 "results/genie_${EXP}_seed42.txt"
done

Detailed Documentation

See genie_envisioner/eval_conflict.md for:

  • Full environment variable reference
  • Single-pair debugging commands
  • Output file format description
  • Manual checkpoint loading example
  • Config file overview
  • GPU memory requirements

Requirements

  • Python 3.10
  • CUDA 12.4-compatible GPU (≥16 GB VRAM recommended; RTX 4090 tested)
  • Conda
  • ~15 GB disk space for dependencies + LTX-Video backbone

Key Python packages:

  • torch==2.6.0+cu124
  • diffusers==0.32.0
  • transformers==4.51.3
  • safetensors==0.6.2
  • mani_skill (from maniskill_conflict/ in this repo)

Citation

If you use this code or the conflict experiment framework, please cite:

@inproceedings{genie_envisioner,
  title     = {Genie-Envisioner: ...},
  author    = {...},
  booktitle = {...},
  year      = {2025},
}