GE-Sim 2.0: A Roadmap Towards Comprehensive Closed-loop Video World Simulators for Robotic Manipulation
Introduction
This repository provides the pretrained weights of GE-Sim 2.0 (Genie Envisioner World Simulator 2.0), a closed-loop video world simulator for robotic manipulation.
GE-Sim 2.0 builds on the action-conditioned video generation framework of Genie Envisioner and is retrained on large-scale real-world robot data spanning teleoperation, contact-rich interaction, and on-robot policy deployment. Given multi-view history frames and an action trajectory, the model generates multi-view future rollouts that follow the specified robot behavior, enabling scalable policy evaluation and closed-loop policy learning.
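The closed-loop use described above alternates between a policy that proposes an action chunk and a simulator that renders the resulting frames. The sketch below illustrates only that control flow. `closed_loop_rollout`, `toy_policy`, `toy_simulator`, the view names, and the 7-dimensional action are hypothetical stand-ins, not the GE-Sim 2.0 or openpi API; see the GitHub repository for the actual inference entry points.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical types: a frame maps a camera view name to an observation
# (an integer timestep stands in for image data), an action is a float vector.
Frame = Dict[str, int]
Action = List[float]
Policy = Callable[[Sequence[Frame]], List[Action]]
Simulator = Callable[[Sequence[Frame], Sequence[Action]], List[Frame]]


def closed_loop_rollout(
    policy: Policy,
    simulator: Simulator,
    history: Sequence[Frame],
    num_chunks: int,
    context_len: int = 4,
) -> Tuple[List[Frame], List[Action]]:
    """Alternate policy inference and simulated rollout for num_chunks rounds."""
    frames = list(history)
    actions: List[Action] = []
    for _ in range(num_chunks):
        context = frames[-context_len:]      # most recent multi-view frames
        chunk = policy(context)              # policy proposes the next action chunk
        future = simulator(context, chunk)   # simulator renders frames for that chunk
        frames.extend(future)                # generated frames become new history
        actions.extend(chunk)
    return frames, actions


def toy_policy(context: Sequence[Frame]) -> List[Action]:
    # Stand-in policy: always emits a chunk of two zero actions.
    return [[0.0] * 7 for _ in range(2)]


def toy_simulator(context: Sequence[Frame], chunk: Sequence[Action]) -> List[Frame]:
    # Stand-in simulator: produces one frame per action, advancing the timestep.
    last_t = context[-1]["head"]
    return [{"head": last_t + i + 1, "wrist": last_t + i + 1} for i in range(len(chunk))]


history = [{"head": 0, "wrist": 0}, {"head": 1, "wrist": 1}]
frames, actions = closed_loop_rollout(toy_policy, toy_simulator, history, num_chunks=3)
print(len(frames), len(actions))  # 8 6
```

Because generated frames are fed back as history, errors in the simulator or policy compound over rounds, which is why a closed-loop setup can expose failures that single-step evaluation misses.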
Please refer to our project page, arXiv paper, and GitHub repository for more details.
Released Weights
This repository provides the following checkpoints:
- gesim_community_v2.0.1_g01op_distill_2B: The released GE-Sim 2.0 world simulator checkpoint. It is post-trained on Genie-01 (G01) + OmniPicker (OP) data, building on internal pretrained weights, and distilled into a 2B-parameter few-step model for efficient closed-loop rollout.
- pi05_gesim_g01op_test: A pi05 policy checkpoint fine-tuned on demo tasks, provided for demo-task policy rollout and closed-loop evaluation with the released world simulator.
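One way to fetch only the checkpoints you need is `snapshot_download` from `huggingface_hub` with `allow_patterns`. The sketch below assumes that each checkpoint name listed above is a top-level file or folder prefix in this repository; that layout is an assumption, so check the repository file listing before relying on it. `checkpoint_patterns` and `download_checkpoints` are illustrative helper names.

```python
from typing import List

REPO_ID = "agibot-world/Genie-Envisioner-Sim-v2.0"
CHECKPOINTS = [
    "gesim_community_v2.0.1_g01op_distill_2B",  # world simulator
    "pi05_gesim_g01op_test",                    # pi05 demo policy
]


def checkpoint_patterns(names: List[str]) -> List[str]:
    # Assumption: each checkpoint name is a top-level path prefix in the repo,
    # so "<name>*" matches either a single file or everything under a folder.
    return [f"{name}*" for name in names]


def download_checkpoints(names: List[str], local_dir: str = "checkpoints") -> str:
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=checkpoint_patterns(names),
        local_dir=local_dir,
    )


if __name__ == "__main__":
    print(checkpoint_patterns(CHECKPOINTS))
    # Uncomment to download (requires network access and `pip install huggingface_hub`):
    # download_checkpoints(CHECKPOINTS)
```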
Model Highlights
- Action-conditioned multi-view world simulation: GE-Sim 2.0 generates robot rollouts from visual history and calibrated action trajectories, supporting head-view and wrist-view observations for embodied manipulation.
- Large-scale real-world training data: The simulator is trained with diverse robot episodes from teleoperation, real-world deployment, rich object interaction, successful executions, and failure trajectories, improving trajectory coverage and reducing hallucinated interactions.
- Proprioceptive state expert: A lightweight state expert decodes dual-arm joint angles and gripper states from video latents, providing downstream policy models with both visual and proprioceptive observations for next-chunk prediction.
- World Judge (coming soon): The paper introduces a VLM-based World Judge that scores generated rollouts against task instructions.
- Efficient rollout: The acceleration framework generates a 25-frame rollout in 2.3 seconds on a single H100 and supports up to 4x frame skipping for faster evaluation.
- WorldArena CVPR Challenge champion: GE-Sim 2.0 won the WorldArena CVPR Challenge with a 2B-parameter model, outperforming both robotic world models and closed-source general video generators reported in the paper.
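The rollout figures quoted above (25 frames in 2.3 seconds, up to 4x frame skipping) translate into a rough throughput estimate. The arithmetic below assumes that frame skipping means each generated frame stands for `frame_skip` consecutive source timesteps and that generation time per rollout is unchanged; both are interpretations, not statements from the paper. `rollout_throughput` is an illustrative helper.

```python
from typing import Tuple


def rollout_throughput(
    frames_per_rollout: int = 25,
    seconds_per_rollout: float = 2.3,
    frame_skip: int = 1,
) -> Tuple[float, int, float]:
    """Return (generated frames/s, source timesteps covered, covered timesteps/s)."""
    generated_fps = frames_per_rollout / seconds_per_rollout
    # Assumption: each generated frame represents `frame_skip` source timesteps.
    covered_steps = frames_per_rollout * frame_skip
    covered_steps_per_second = covered_steps / seconds_per_rollout
    return generated_fps, covered_steps, covered_steps_per_second


fps, steps, steps_per_s = rollout_throughput(frame_skip=1)
print(round(fps, 2), steps, round(steps_per_s, 2))  # 10.87 25 10.87

fps4, steps4, steps_per_s4 = rollout_throughput(frame_skip=4)
print(round(fps4, 2), steps4, round(steps_per_s4, 2))  # 10.87 100 43.48
```

Under these assumptions, 4x skipping lets one rollout span about 100 source timesteps in the same 2.3 seconds, at the cost of coarser temporal resolution.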
Intended Use
GE-Sim 2.0 is intended for research on robotic manipulation world models, policy evaluation, closed-loop simulation, and world-model-based policy learning. It can be used to generate action-conditioned rollouts and evaluate policy behavior in a learned world.
Citation
If you find GE-Sim 2.0 useful for your research, please consider starring the GitHub repository and citing our paper:
@article{qiu2026gesim2,
title={GE-Sim 2.0: A Roadmap Towards Comprehensive Closed-loop Video World Simulators for Robotic Manipulation},
author={Qiu, Boxiang and Chen, Liliang and Liao, Yue and Wang, Nan and Wang, Lintao and Luo, Jiayi and Zhao, Wenzhi and Chen, Shengcong and Chen, Di and Li, Ye and Gao, Chen and Yan, Shuicheng and Liu, Si and Yao, Maoqing and Ren, Guanghui},
journal={arXiv preprint arXiv:2605.27491},
year={2026}
}
License
Code adapted from upstream projects such as Diffusers and Cosmos is released under the Apache License 2.0.
The pi05 policy is served through openpi (Physical Intelligence), included as a submodule under third_party/openpi and licensed under Apache License 2.0. Because pi05 builds on PaliGemma, its use is additionally subject to the Gemma Terms of Use (see third_party/openpi/LICENSE_GEMMA.txt).
All other data and code in this repository are released under CC BY-NC-SA 4.0.
