World Pilot: Steering Vision-Language-Action Models with World-Action Priors

This repository hosts the released WorldPilot checkpoint for the LIBERO setting from the open-source WorldPilot project.

Resources:

Overview

WorldPilot steers a vision-language-action policy with priors from a world-action model. This release provides the public LIBERO checkpoint and its accompanying dataset statistics file for use with the public WorldPilot codebase.

Included Files

checkpoints/steps_50000_pytorch_model.pt
dataset_statistics.json
  • checkpoints/steps_50000_pytorch_model.pt: released WorldPilot checkpoint.
  • dataset_statistics.json: dataset statistics file saved with the WorldPilot run.

Intended Use

This release is intended for:

  1. loading the released WorldPilot checkpoint with the public WorldPilot codebase;
  2. using the accompanying dataset statistics file expected by the public WorldPilot code;
  3. reproducing or evaluating the released LIBERO setting from the main project documentation.

This repository is not a standalone full dependency bundle. It does not include training configs, run scripts, logs, source snapshots, upstream checkpoints, datasets, or third-party dependencies. Public training and evaluation still require the upstream assets and environments documented in the main project repository.

Using This Checkpoint

Download this repository and pass checkpoints/steps_50000_pytorch_model.pt and dataset_statistics.json to the relevant configuration fields in the public WorldPilot codebase.
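Before wiring the two files into a configuration, it can help to confirm that the download is complete. The sketch below is a minimal example under stated assumptions, not part of the WorldPilot codebase: the helper names locate_release_files and load_dataset_statistics are hypothetical, and release_dir is whatever local directory you downloaded this repository into. It only resolves the two paths listed under Included Files and parses the statistics file as JSON, without assuming anything about its schema or about the configuration field names used by the public code.

```python
import json
from pathlib import Path

# Relative paths exactly as listed under "Included Files".
CHECKPOINT_RELPATH = Path("checkpoints") / "steps_50000_pytorch_model.pt"
STATS_RELPATH = Path("dataset_statistics.json")


def locate_release_files(release_dir):
    """Return absolute paths to the checkpoint and the statistics file.

    `release_dir` is the local directory holding the downloaded release
    (a hypothetical location chosen by the user). Raises FileNotFoundError
    if either file is missing.
    """
    root = Path(release_dir).expanduser().resolve()
    paths = {
        "checkpoint": root / CHECKPOINT_RELPATH,
        "dataset_statistics": root / STATS_RELPATH,
    }
    missing = [str(p) for p in paths.values() if not p.is_file()]
    if missing:
        raise FileNotFoundError("missing release files: " + ", ".join(missing))
    return paths


def load_dataset_statistics(stats_path):
    """Parse the statistics file as JSON; its schema is not assumed here."""
    with open(stats_path, "r", encoding="utf-8") as f:
        return json.load(f)
```

The returned paths can then be copied into whichever configuration fields the public WorldPilot documentation names for the checkpoint and the statistics file.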

For the full setup, follow the public project documentation:

Training Reference

To train, use a checkout of the public repository and its current training docs. Training configuration and launch scripts are maintained in the code repository and are not bundled in this model release.

Training also depends on the upstream assets listed in the public docs, including:

  • nvidia/Cosmos-Policy-LIBERO-Predict2-2B
  • facebook/VGGT-1B
  • StarVLA/Qwen3-VL-4B-Instruct-Action
  • amap_cvlab/ABot-M0-Pretrain

Training reads the precomputed Cosmos cache from the path set in:

datasets.vla_data.cosmos_cache_dir
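The dotted name above reads like a path into a nested configuration. As an illustration only, and assuming the training config behaves like a nested mapping (which this release does not confirm), the sketch below shows where such a key would sit. The helper set_by_dotted_key and the placeholder path /path/to/cosmos_cache are hypothetical; the actual config format and override mechanism are defined in the public training docs.

```python
def set_by_dotted_key(config, dotted_key, value):
    """Set `value` in a nested dict at the location named by `dotted_key`.

    Intermediate dictionaries are created when they do not exist yet.
    """
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = value
    return config


# Placeholder path: replace with the directory that holds your Cosmos cache.
config = set_by_dotted_key(
    {}, "datasets.vla_data.cosmos_cache_dir", "/path/to/cosmos_cache"
)
```

Under this assumption the value ends up at config["datasets"]["vla_data"]["cosmos_cache_dir"].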

You can reuse the published cache here:

Provenance and Terms

The main WorldPilot code repository is released under Apache-2.0:

This release also depends on upstream projects and assets that keep their own licenses and usage terms. Upstream checkpoints, datasets, and third-party components are not relicensed by this model card.

Citation

If WorldPilot helps your research, we would appreciate a citation using the BibTeX entry below.

@article{worldpilot2026,
  title={World Pilot: Steering Vision-Language-Action Models with World-Action Priors},
  author={Zefu Lin and Rongxu Cui and Junjia Xu and Xiaojuan Jin and Wenling Li and Lue Fan and Zhaoxiang Zhang},
  journal={Coming Soon.},
  year={2026}
}

Acknowledgements

We sincerely thank the teams behind ABot-Manipulation, cosmos-policy, LIBERO, LIBERO-plus, and LeRobot for their outstanding work.
