ImageWAM-FLUX.2-4B-LIBERO
This repository contains the ImageWAM FLUX.2 4B checkpoint for LIBERO, released with the paper "ImageWAM: Do World Action Models Really Need Video Generation, or Just Image Editing?"
ImageWAM is a family of world action models built on image-editing foundation models. This checkpoint is intended for evaluation and research use with the accompanying ImageWAM codebase.
Model Details
- Model family: ImageWAM
- Image-editing backbone: FLUX.2 [klein] base
- Variant: FLUX.2 klein-base-4B
- Benchmark: LIBERO
- Training code: yuyangalin/ImageWAM
- Base model weights: Users must separately prepare the FLUX.2 klein-base-4B weights and FLUX.2 autoencoder as described in the ImageWAM README.
Files
Expected file layout:
.
├── model.pt
├── dataset_stats.json
└── config.yaml
- model.pt: ImageWAM checkpoint used by the evaluation scripts.
- dataset_stats.json: normalization statistics required for policy evaluation.
- config.yaml: original training configuration for provenance and reproducibility.
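Once the repository has been downloaded, the layout above can be checked before running anything heavier. The following is a minimal sketch, not part of the ImageWAM codebase: `check_ckpt_layout` is a hypothetical helper name, and the only assumption is that the three files listed above sit directly in the download directory.

```shell
# Hypothetical helper (not from the ImageWAM repository):
# report which of the expected files are missing from a checkpoint directory.
check_ckpt_layout() {
  dir="$1"
  missing=0
  for f in model.pt dataset_stats.json config.yaml; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      missing=1
    fi
  done
  # Exit status 0 when all three files are present, 1 otherwise.
  return "$missing"
}
```

For example, `check_ckpt_layout checkpoints/imagewam_release/libero/flux2_klein_4b` would print nothing and return 0 when the download is complete.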
Usage
Install and prepare the ImageWAM repository following the project README. Then download this model repository:
mkdir -p checkpoints/imagewam_release/libero/flux2_klein_4b
huggingface-cli download yuyangalin/ImageWAM-FLUX.2-4B-LIBERO \
--repo-type model \
--local-dir checkpoints/imagewam_release/libero/flux2_klein_4b
Prepare FLUX.2 4B weights and set:
export FLUX2_VARIANT=4b
export FLUX2_MODEL_PATH=/path/to/flux-2-klein-base-4b.safetensors
export FLUX2_AE_MODEL_PATH=/path/to/ae.safetensors
export FLUX2_QWEN3_MODEL_SPEC=Qwen/Qwen3-4B
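A typo in one of these variables may only surface after the evaluation has started loading weights, so a quick pre-flight check can save time. The sketch below is an illustration under stated assumptions: `check_flux2_env` is a hypothetical helper, it assumes the two `*_PATH` variables point at local files, and it treats `FLUX2_QWEN3_MODEL_SPEC` as an opaque identifier that only needs to be non-empty.

```shell
# Hypothetical pre-flight check (not from the ImageWAM repository).
check_flux2_env() {
  status=0
  # All four variables from the instructions above must be non-empty.
  for name in FLUX2_VARIANT FLUX2_MODEL_PATH FLUX2_AE_MODEL_PATH FLUX2_QWEN3_MODEL_SPEC; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "unset: $name" >&2
      status=1
    fi
  done
  # Assumption: only the two weight paths are expected to be local files.
  for name in FLUX2_MODEL_PATH FLUX2_AE_MODEL_PATH; do
    eval "value=\${$name:-}"
    if [ -n "$value" ] && [ ! -f "$value" ]; then
      echo "not a file: $name=$value" >&2
      status=1
    fi
  done
  return "$status"
}
```

Calling `check_flux2_env` after the exports returns 0 when everything is in place and lists each problem on stderr otherwise.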
Evaluate on LIBERO:
export CKPT_PATH="$(pwd)/checkpoints/imagewam_release/libero/flux2_klein_4b/model.pt"
export DATASET_STATS_PATH="$(pwd)/checkpoints/imagewam_release/libero/flux2_klein_4b/dataset_stats.json"
NUM_GPUS=8 FLUX2_VARIANT=4b bash scripts/flux2/run_eval_flux2_libero.sh
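The commands above build `CKPT_PATH` and `DATASET_STATS_PATH` from `$(pwd)`, which suggests absolute paths are expected; that is an inference from the example, not something the README states. Under that assumption, a small guard like the following could be run before launching the evaluation. `check_eval_paths` is a hypothetical name.

```shell
# Hypothetical guard (not from the ImageWAM repository);
# the variable names are taken from the evaluation commands above.
check_eval_paths() {
  status=0
  for name in CKPT_PATH DATASET_STATS_PATH; do
    eval "value=\${$name:-}"
    case "$value" in
      /*) ;;  # absolute path, as produced by "$(pwd)/..."
      *)
        echo "$name is unset or not absolute: '$value'" >&2
        status=1
        continue
        ;;
    esac
    # -s: the file exists and is not empty.
    if [ ! -s "$value" ]; then
      echo "$name is missing or empty: $value" >&2
      status=1
    fi
  done
  return "$status"
}
```

One way to use it is `check_eval_paths && NUM_GPUS=8 FLUX2_VARIANT=4b bash scripts/flux2/run_eval_flux2_libero.sh`, so the evaluation only starts when both files are found.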
Intended Use
This checkpoint is intended for:
- Reproducing ImageWAM LIBERO evaluations.
- Research on robot policy learning, world action models, and image-editing-based action generation.
- Comparison against other LIBERO policy models under the same evaluation setup.
This checkpoint is not intended for safety-critical or real-world robot deployment without additional validation.
Limitations
- Evaluation requires the ImageWAM codebase and the LIBERO benchmark environment.
- The checkpoint assumes the same model variant and configuration used during training. See train_config.yaml.
- Users must separately prepare the matching FLUX.2 4B base model and autoencoder weights.
- Performance may differ if the simulator version, dataset preprocessing, action normalization statistics, or evaluation settings differ from the release setup.
Citation
If you use this checkpoint, please cite the ImageWAM paper:
@misc{zhang2026imagewam,
title={ImageWAM: Do World Action Models Really Need Video Generation, or Just Image Editing?},
author={Yuyang Zhang and Wenyao Zhang and Zekun Qi and He Zhang and Haitao Lin and Jingbo Zhang and Yao Mu and Xiaokang Yang and Wenjun Zeng and Xin Jin},
year={2026},
eprint={2606.19531},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2606.19531},
}
Acknowledgements
ImageWAM builds on several open-source projects and model families, including FLUX.2, FastWAM, LIBERO, LIBERO-plus, and RoboTwin. Please also follow the licenses and citation requirements of the corresponding upstream projects.