AHA-WAM RoboTwin2.0 Checkpoint

This repository provides a RoboTwin2.0 checkpoint for AHA-WAM, together with the dataset normalization statistics required for reproducible evaluation.

AHA-WAM is a Wan2.2-based world-action model for robot policy learning. This checkpoint was trained with the robotwin_ahawam task configuration from the AHA-WAM codebase:

configs/task/robotwin_ahawam.yaml

The checkpoint is intended for direct loading, reproduction, and RoboTwin evaluation with the public AHA-WAM code release.

Repository Files

AHA-WAM-RoboTwin2.0/
├── README.md
├── dataset_stats.json
├── robotwin_ahawam.pt
└── robotwin_ahawam-flash.pt
  • robotwin_ahawam.pt: AHA-WAM checkpoint trained on RoboTwin2.0-format data.
  • robotwin_ahawam-flash.pt: AHA-WAM-Flash checkpoint for RoboTwin2.0-format data, obtained by ODE distillation from AHA-WAM.
  • dataset_stats.json: normalization statistics used by the AHA-WAM processor for action/state normalization and denormalization.
  • README.md: this model card.

Model Details

  • Model type: AHA-WAM robot world-action model
  • Backbone: Wan2.2-TI2V-5B components with an ActionDiT action branch
  • Training task config: configs/task/robotwin_ahawam.yaml
  • RoboTwin action dimension: 14
  • RoboTwin proprioception dimension: 14
  • Observation cameras: cam_high, cam_left_wrist, cam_right_wrist
  • Training/evaluation frame setup: 65-frame trajectories with action_video_freq_ratio=8
  • Default evaluation task config: RoboTwin demo_randomized
  • Default evaluation instruction split: unseen

Installation

First install the public AHA-WAM codebase:

# Clone the AHA-WAM GitHub repository, then enter the codebase.
cd AHA-WAM-release

conda create -n ahawam python=3.10 -y
conda activate ahawam
pip install -U pip
pip install torch==2.7.1+cu128 torchvision==0.22.1+cu128 \
  --extra-index-url https://download.pytorch.org/whl/cu128
pip install -e .
pip install huggingface_hub

AHA-WAM also uses Wan model assets. Put the Wan assets in your local model directory and point DiffSynth to them:

export DIFFSYNTH_MODEL_BASE_PATH=/path/to/wan_models

Please follow the AHA-WAM repository README for the full environment and RoboTwin setup.
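
Before loading the model, it can help to confirm that the environment variable above points at a real directory. The following is a minimal sketch, not part of the AHA-WAM codebase: the helper name resolve_wan_model_dir is hypothetical, and it only checks that the directory exists, not that the expected Wan files are inside it.

```python
import os
from pathlib import Path


def resolve_wan_model_dir(env=None):
    """Return the directory named by DIFFSYNTH_MODEL_BASE_PATH, or raise.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    raw = env.get("DIFFSYNTH_MODEL_BASE_PATH")
    if not raw:
        raise RuntimeError("DIFFSYNTH_MODEL_BASE_PATH is not set")
    path = Path(raw).expanduser()
    if not path.is_dir():
        raise FileNotFoundError(f"Wan model directory not found: {path}")
    return path
```

Calling resolve_wan_model_dir() with no arguments reads the current process environment, so it should be run in the same shell session where the export was issued.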

Download

Download the checkpoint and normalization stats with huggingface_hub:

from huggingface_hub import hf_hub_download

repo_id = "SereneC/AHA-WAM-RoboTwin2.0"

ckpt_path = hf_hub_download(
    repo_id=repo_id,
    filename="robotwin_ahawam.pt",
    local_dir="checkpoints/AHA-WAM-RoboTwin2.0",
)
stats_path = hf_hub_download(
    repo_id=repo_id,
    filename="dataset_stats.json",
    local_dir="checkpoints/AHA-WAM-RoboTwin2.0",
)

print("checkpoint:", ckpt_path)
print("dataset stats:", stats_path)

Or use the CLI:

huggingface-cli download SereneC/AHA-WAM-RoboTwin2.0 \
  robotwin_ahawam.pt dataset_stats.json \
  --local-dir checkpoints/AHA-WAM-RoboTwin2.0
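
After either download route, a quick check that both files landed in the target directory can catch path mistakes early. The sketch below is an illustration only: summarize_download is a hypothetical helper, and it makes no assumption about the internal layout of dataset_stats.json beyond it being valid JSON. It reports the top-level keys if the payload is a JSON object.

```python
import json
from pathlib import Path


def summarize_download(local_dir):
    """Verify the checkpoint and stats files exist and summarize them.

    Hypothetical helper; file names are the ones listed in this model card.
    """
    local_dir = Path(local_dir)
    ckpt = local_dir / "robotwin_ahawam.pt"
    stats = local_dir / "dataset_stats.json"

    missing = [p.name for p in (ckpt, stats) if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"missing files in {local_dir}: {missing}")

    # Parse the stats file to confirm it is valid JSON.
    with stats.open("r", encoding="utf-8") as f:
        payload = json.load(f)

    # Only report top-level keys; the nested schema is not assumed here.
    keys = sorted(payload) if isinstance(payload, dict) else []
    return {"checkpoint_bytes": ckpt.stat().st_size, "stats_keys": keys}
```

For the paths used above, this would be called as summarize_download("checkpoints/AHA-WAM-RoboTwin2.0").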

Load the Model

This checkpoint cannot be loaded with transformers' from_pretrained(). Load it through the AHA-WAM codebase by instantiating the Hydra model config and calling load_checkpoint().

import sys
from pathlib import Path

import torch
from hydra import compose, initialize_config_dir
from hydra.core.global_hydra import GlobalHydra
from hydra.utils import instantiate

PROJECT_ROOT = Path("/path/to/AHA-WAM-release").resolve()
sys.path.insert(0, str(PROJECT_ROOT))
sys.path.insert(0, str(PROJECT_ROOT / "src"))

ckpt_path = PROJECT_ROOT / "checkpoints/AHA-WAM-RoboTwin2.0/robotwin_ahawam.pt"

if GlobalHydra.instance().is_initialized():
    GlobalHydra.instance().clear()

with initialize_config_dir(
    version_base="1.3",
    config_dir=str(PROJECT_ROOT / "configs"),
):
    cfg = compose(
        config_name="sim_robotwin.yaml",
        overrides=[
            "task=robotwin_ahawam",
            "model.load_text_encoder=true",
            "model.skip_dit_load_from_pretrain=true",
            "model.action_dit_pretrained_path=null",
        ],
    )

device = "cuda"
model = instantiate(cfg.model, model_dtype=torch.bfloat16, device=device)
model.load_checkpoint(str(ckpt_path))
model = model.to(device).eval()

print("Loaded AHA-WAM checkpoint:", ckpt_path)

For RoboTwin evaluation, the policy adapter loads the same checkpoint internally, so most users can use the evaluation command below directly.

RoboTwin Evaluation

Install and prepare RoboTwin following the upstream RoboTwin instructions. Then, from the AHA-WAM repository root, link the AHA-WAM policy adapter into your RoboTwin checkout:

ROBOTWIN_ROOT=/path/to/RoboTwin
ln -sfn "$(pwd)/experiments/robotwin/ahawam_policy" \
  "$ROBOTWIN_ROOT/policy/ahawam_policy"

Run the multi-GPU RoboTwin evaluation manager:

python experiments/robotwin/run_robotwin_manager.py \
  task=robotwin_ahawam \
  ckpt=checkpoints/AHA-WAM-RoboTwin2.0/robotwin_ahawam.pt \
  EVALUATION.dataset_stats_path=checkpoints/AHA-WAM-RoboTwin2.0/dataset_stats.json \
  EVALUATION.robotwin_root=/path/to/RoboTwin \
  MULTIRUN.num_gpus=8

Useful overrides:

# Change the number of GPUs used by the evaluation manager.
MULTIRUN.num_gpus=<N>

# Select a specific RoboTwin task.
EVALUATION.task_name=<task_name>

# Change the number of evaluation episodes.
EVALUATION.eval_num_episodes=<N>
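
When sweeping over tasks or GPU counts, it may be convenient to assemble the command line in Python rather than editing it by hand. The sketch below is one possible way to do that. The function build_eval_command is hypothetical and simply concatenates the script path and override keys shown in this section; it does not validate them against the Hydra config.

```python
import shlex


def build_eval_command(ckpt, stats_path, robotwin_root,
                       num_gpus=8, task_name=None, eval_num_episodes=None):
    """Build the argv list for the RoboTwin evaluation manager.

    Hypothetical helper; override names are copied from this model card.
    """
    argv = [
        "python",
        "experiments/robotwin/run_robotwin_manager.py",
        "task=robotwin_ahawam",
        f"ckpt={ckpt}",
        f"EVALUATION.dataset_stats_path={stats_path}",
        f"EVALUATION.robotwin_root={robotwin_root}",
        f"MULTIRUN.num_gpus={num_gpus}",
    ]
    # Optional overrides are appended only when explicitly requested.
    if task_name is not None:
        argv.append(f"EVALUATION.task_name={task_name}")
    if eval_num_episodes is not None:
        argv.append(f"EVALUATION.eval_num_episodes={eval_num_episodes}")
    return argv


def format_command(argv):
    """Render argv as a shell-safe string for logging or copy-paste."""
    return shlex.join(argv)
```

The returned list can be passed to subprocess.run(argv, check=True) from the AHA-WAM repository root, or printed with format_command for inspection first.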

Training Configuration

The released checkpoint corresponds to:

task: robotwin_ahawam
batch_size: 6
num_epochs: 5
learning_rate: 5e-5
weight_decay: 1e-2
gradient_accumulation_steps: 1
model:
  action_horizon: 64
  action_chunk_size: 16
  num_history_frames: 6
  action_video_read_mode: current_only
  chunk_kv_editor_num_queries: 32
  loss:
    lambda_action_prior: 1.0
data:
  train:
    pretrained_norm_stats: dataset_stats.json
  val:
    pretrained_norm_stats: dataset_stats.json

For the complete configuration, see configs/task/robotwin_ahawam.yaml in the AHA-WAM codebase.

Limitations

  • The checkpoint is trained for the RoboTwin2.0 setup used by the AHA-WAM release and expects matching observation/action conventions.
  • dataset_stats.json must be used with the checkpoint for correct action and state normalization.
  • Evaluation results can vary with RoboTwin version, simulator assets, task split, random seed, GPU precision, and local environment setup.
  • The checkpoint should be used together with the AHA-WAM codebase; it is not a standalone Hugging Face Transformers model.
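
To illustrate why the normalization statistics matter, the sketch below shows a generic per-dimension mean/std normalization and its inverse for a 14-dimensional action vector. This is an assumption for illustration only: the actual AHA-WAM processor may use a different scheme (for example min/max or quantile scaling), and the schema of dataset_stats.json is not described here. The function names and the eps parameter are hypothetical.

```python
def _check_lengths(values, mean, std):
    """Raise if the vector and its statistics disagree in dimension."""
    if not (len(values) == len(mean) == len(std)):
        raise ValueError("values, mean, and std must have the same length")


def normalize(values, mean, std, eps=1e-8):
    """Map raw values to zero-mean, unit-scale values (assumed scheme)."""
    _check_lengths(values, mean, std)
    return [(v - m) / (s + eps) for v, m, s in zip(values, mean, std)]


def denormalize(values, mean, std, eps=1e-8):
    """Invert normalize() using the same statistics."""
    _check_lengths(values, mean, std)
    return [v * (s + eps) + m for v, m, s in zip(values, mean, std)]
```

Under this scheme, a policy output denormalized with statistics from a different dataset would be shifted and rescaled incorrectly in every dimension, which is why the checkpoint and its statistics file should be kept together.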

License

The AHA-WAM code release is distributed under the MIT License unless otherwise noted. Please also follow the licenses and usage terms of upstream dependencies and assets, including Wan2.2 and RoboTwin.

Citation

If you use this checkpoint, please cite the AHA-WAM project and any upstream datasets or models used in your work.

@article{cai2026ahawam,
  title={AHA-WAM: Asynchronous Horizon-Adaptive World-Action Modeling with Observation-Guided Context Routing},
  author={Cai, Jisong and Ling, Long and Chu, Shiwei and Liu, Zhongshan and Kang, Jiayue and Liang, Zhixuan and Xu, Wenjie and Mao, Yinan and Zhang, Weinan and Yang, Xiaokang and Ying, Ru and Zheng, Ran and Mu, Yao},
  journal={arXiv preprint arXiv:2606.09811},
  year={2026}
}