ActQuant: Pi 0.5 LIBERO, 3 BPW
Quantized Pi 0.5 LIBERO-finetuned checkpoint produced with the two-stage ActQuant quantization recipe.
- Stage 1: HSIC inter-tensor bit allocation. Per-tensor sensitivity is scored with the Hilbert-Schmidt Independence Criterion against ground-truth actions; a greedy per-layer L² allocator then assigns a quant type to each tensor under a bit budget.
- Stage 2: Action-Mixed Fisher (AMF) imatrix. A per-element Fisher diagonal is computed under the flow-matching action loss and consumed by llama-quantize for block-level scale optimization.
This variant uses Q2_K as the base type for the LLM allocator (scored with the HSIC sens key),
Q3_K for the vision tower, and keeps the action expert at fp16.
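To make the Stage 1 idea concrete, here is a minimal sketch of a greedy budgeted allocator. It is not the ActQuant implementation: the `Tensor` class, the `allocate` function, the toy tensor names and the sensitivity values are all hypothetical, and the real allocator's L² criterion and HSIC scoring are not reproduced here. The only external facts used are the nominal llama.cpp block sizes for Q2_K (2.625 bits per weight) and Q4_K (4.5 bits per weight).

```python
from dataclasses import dataclass

# Nominal bits per weight of the llama.cpp k-quant block formats.
BPW = {"Q2_K": 2.625, "Q4_K": 4.5}


@dataclass
class Tensor:
    name: str
    n_params: int
    sensitivity: float  # hypothetical HSIC-style score; higher = more sensitive


def allocate(tensors, budget_bpw, base="Q2_K", upgrade="Q4_K"):
    """Start every tensor at `base`, then upgrade tensors in order of
    decreasing sensitivity whenever the upgrade still fits the bit budget."""
    total_params = sum(t.n_params for t in tensors)
    budget_bits = budget_bpw * total_params
    used_bits = BPW[base] * total_params
    plan = {t.name: base for t in tensors}
    for t in sorted(tensors, key=lambda t: t.sensitivity, reverse=True):
        extra = (BPW[upgrade] - BPW[base]) * t.n_params
        if used_bits + extra <= budget_bits:
            plan[t.name] = upgrade
            used_bits += extra
    return plan, used_bits / total_params


# Toy example with made-up tensors and scores.
toy = [
    Tensor("blk.0.attn_q", 4_000_000, 0.9),
    Tensor("blk.0.ffn_up", 16_000_000, 0.2),
    Tensor("blk.0.attn_v", 1_000_000, 0.7),
    Tensor("blk.0.ffn_down", 16_000_000, 0.5),
]
plan, achieved_bpw = allocate(toy, budget_bpw=2.98)
print(plan, round(achieved_bpw, 3))
```

In this toy run the two small, high-sensitivity attention tensors are upgraded to Q4_K, while the large FFN tensors stay at Q2_K because upgrading either one would exceed the budget.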
Bits-per-weight breakdown
| Section | Params | BPW |
|---|---|---|
| Vision (SigLIP, Q3_K) | 449 M | 3.56 |
| LLM blocks (HSIC-allocated Q2_K base, selective Q4_K upgrades) | 1.98 B | 2.98 |
| Vision + LLM blocks (headline) | 2.43 B | 3.09 |
| Token embedding (Q8_0) | 527 M | 8.50 |
| Action expert / flow head (fp16) | 430 M | 16.00 |
| Full checkpoint footprint | | 2.2 GB |
The 3 BPW headline number excludes the embedding and action-expert tensors, following standard quantization-paper convention: those tensors are not the target of ActQuant's block-wise allocation.
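The headline figure can be checked as a parameter-weighted average of the table rows. The sketch below uses only the rounded parameter counts and BPW values from the table; the footprint estimate ignores GGUF metadata and assumes the reported 2.2 GB is in binary units (GiB), which is an assumption rather than something the card states.

```python
# (section, params in millions, bits per weight), copied from the table.
SECTIONS = [
    ("vision", 449, 3.56),
    ("llm_blocks", 1980, 2.98),
    ("token_embedding", 527, 8.50),
    ("action_expert", 430, 16.00),
]


def weighted_bpw(rows):
    """Parameter-weighted average bits per weight over the given rows."""
    total_bits = sum(params * bpw for _, params, bpw in rows)
    total_params = sum(params for _, params, _ in rows)
    return total_bits / total_params


# Headline: vision + LLM blocks only (embedding and action expert excluded).
headline = weighted_bpw(SECTIONS[:2])

# Rough footprint: total megabits -> bits -> bytes -> GiB.
total_mbits = sum(params * bpw for _, params, bpw in SECTIONS)
footprint_gib = total_mbits * 1e6 / 8 / 2**30

print(f"headline BPW: {headline:.2f}")
print(f"estimated footprint: {footprint_gib:.2f} GiB")
```

With these inputs the headline comes out at about 3.09 BPW and the estimated footprint at about 2.2 GiB, consistent with the table.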
LIBERO closed-loop results
Aggregate success rate across all four LIBERO suites, 500 trials per suite
(2 000 total), evaluated through the C++/GGML runtime via the pi05.so
pybind11 binding (same code path that runs at deployment):
| Suite | Success rate |
|---|---|
| libero_spatial | 98.2 % |
| libero_object | 98.8 % |
| libero_goal | 95.0 % |
| libero_10 (long horizon) | 87.2 % |
| Aggregate | 94.8 % |
For reference, llama.cpp's blind round-to-nearest quantization at the same Q2_K base scores 94.3 % aggregate, so the two-stage ActQuant pipeline improves aggregate success by 0.5 pp while also providing sensitivity-driven per-tensor bit allocation. The gains over the blind heuristic grow at sub-3-bit budgets.
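The aggregate figure follows directly from the per-suite rates, since every suite has the same number of trials. The sketch below recovers the implied success counts, which are derived from the reported percentages and not reported separately, and the gap to the 94.3 % round-to-nearest baseline.

```python
TRIALS_PER_SUITE = 500

# Success rates in tenths of a percent, so the arithmetic stays in integers.
RATES_PERMILLE = {
    "libero_spatial": 982,
    "libero_object": 988,
    "libero_goal": 950,
    "libero_10": 872,
}

# Implied number of successful trials per suite.
successes = {
    suite: rate * TRIALS_PER_SUITE // 1000
    for suite, rate in RATES_PERMILLE.items()
}
total_successes = sum(successes.values())
total_trials = TRIALS_PER_SUITE * len(RATES_PERMILLE)

aggregate_pct = 100 * total_successes / total_trials
gap_pp = aggregate_pct - 94.3  # percentage points over the baseline

print(successes)
print(f"aggregate: {aggregate_pct:.1f} %, gap: {gap_pp:.1f} pp")
```

This gives 1 896 successes out of 2 000 trials, matching the 94.8 % aggregate and the 0.5 pp gap.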
Files
| File | Size | Purpose |
|---|---|---|
| pi05.gguf | 2.2 GB | Merged vision + LLM + action-expert GGUF (the deployable artifact) |
| tokenizer.model | 4.1 MB | PaliGemma SentencePiece tokenizer |
| norm_stats.json | 1.7 KB | LIBERO action-quantile normalization stats |
Run LIBERO evaluation
huggingface-cli download NU-World-Model-Embodied-AI/ActQuant-Pi05-LIBERO-3bpw \
  --local-dir /path/to/eval_dir
# From the ActQuant build tree (build_openpi/); see the paper repo:
cd /path/to/ActQuant
for suite in libero_spatial libero_object libero_goal libero_10; do
  bash tools/pi0.5/run_libero_eval.sh "$suite" 50 5 8 8000 /path/to/eval_dir
done
The launcher spawns one serve_policy.py per GPU (loading pi05.gguf
through the pi05.so binding) and the LIBERO client talks to it over
WebSocket.
Reproduce this exact checkpoint from scratch
The complete pipeline is in github.com/arashakb/ActQuant.
After building (build_openpi/) and reconstructing the public LIBERO
calibration set:
# Stage 0: Export the LIBERO-finetuned bf16 Pi 0.5 to GGUF (vision Q3_K,
# everything else preserved). This produces the merge target.
python tools/pi0.5/export_pi05.py \
  -d /path/to/pi05_libero_finetuned_v044 \
  -o /path/to/pi05_libero_base_gguf \
  --quant_vision q3k
# Stage 0b: Export the standalone bf16 PaliGemma LLM (this is what
# llama-quantize will quantize in Stage 1+2).
python tools/pi0.5/export_pi05_llm.py \
  -d /path/to/pi05_libero_finetuned_v044 \
  -o /path/to/pi05_libero_base_gguf/pali_llm_bf16.gguf
# Stage 2: Compute the AMF Fisher imatrix once
# (60-episode LIBERO calibration: 10 spatial / 10 object / 10 goal / 30 long;
# LIBERO is public; reconstruct via get_pi05_calib_data.py).
python tools/fisher-diag/get_pi05_calib_data.py \
  --output-dir /path/to/calib_data_raw
python tools/fisher-diag/compute_fisher_pi05.py \
  --checkpoint /path/to/pi05_libero_finetuned_v044 \
  --calib-dir /path/to/calib_data_raw \
  --output /path/to/pi05_libero_base_gguf/fisher_flow_perweight.gguf \
  --num-gpus 8 --batch-size 6
# Stage 1 + 2 + merge: chained by run_hsic_quant_pi05.sh
bash tools/hsic/run_hsic_quant_pi05.sh \
  --base-type Q2_K \
  --max-type Q4_K \
  --score-key sens \
  --num-gpus 8
Output: pi05_q2k_hsic_sens_v3k.gguf (rename to pi05.gguf for serving).
This is the exact recipe (--base-type Q2_K --score-key sens against a
Q3_K-vision base, AMF imatrix, default merge) that produced the
checkpoint in this repository.
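After renaming the output, a quick sanity check is to confirm the file really is a GGUF container before pointing the server at it. The helper below is a hypothetical convenience, not part of the ActQuant tooling; it reads only the fixed-size GGUF header (magic bytes, version, tensor count, metadata key-value count) and assumes GGUF version 2 or later, where both counts are 64-bit little-endian integers.

```python
import struct


def read_gguf_header(path):
    """Return the fixed-size GGUF header fields, or raise ValueError.

    Layout assumed (GGUF v2+): 4-byte magic b"GGUF", uint32 version,
    uint64 tensor count, uint64 metadata key-value count, little-endian.
    """
    with open(path, "rb") as f:
        raw = f.read(24)
    if len(raw) < 24 or raw[:4] != b"GGUF":
        raise ValueError(f"{path} does not look like a GGUF file")
    version, n_tensors, n_kv = struct.unpack("<IQQ", raw[4:24])
    return {
        "version": version,
        "tensor_count": n_tensors,
        "metadata_kv_count": n_kv,
    }
```

Calling `read_gguf_header("/path/to/eval_dir/pi05.gguf")` on the renamed file should return a small dictionary; a `ValueError` indicates a truncated or mislabeled file.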
Citation
@article{actquant2026,
  title   = {ActQuant: Sub-4-bit Action-Guided Quantization for Vision-Language-Action Models},
  author  = {Akbari, Arash and others},
  journal = {arXiv preprint arXiv:2605.24011},
  year    = {2026}
}
License
MIT (inherited from the llama.cpp upstream build infrastructure). Pi 0.5 model weights are bound by the upstream OpenPI license and the underlying PaliGemma terms.
Model tree for NU-World-Model-Embodied-AI/ActQuant-Pi05-LIBERO-3bpw
Base model
lerobot/pi05_libero_finetuned_v044