ActQuant — Pi 0.5 LIBERO — 3 BPW

A Pi 0.5 checkpoint finetuned on LIBERO and quantized with the two-stage ActQuant recipe.

Stage 1 — HSIC inter-tensor bit allocation. Per-tensor sensitivity scored with the Hilbert-Schmidt Independence Criterion against ground-truth actions; greedy per-layer L² allocator assigns a quant type per tensor under a budget.
Stage 2 — Action-Mixed Fisher (AMF) imatrix. Per-element Fisher diagonal under the flow-matching action loss, consumed by llama-quantize for block-level scale optimization.
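The two scoring quantities named above can be illustrated on toy data. The sketch below is not the ActQuant implementation: it assumes Gaussian kernels and the standard biased HSIC estimator tr(KHLH)/(n-1)², and it replaces the flow-matching action loss with a squared-error loss on a toy linear velocity model. The function names, kernel widths, and data are all hypothetical.

```python
import math

def rbf_gram(rows, sigma=1.0):
    """Gaussian-kernel Gram matrix for a list of equal-length vectors."""
    def kernel(a, b):
        sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
        return math.exp(-sq_dist / (2.0 * sigma ** 2))
    return [[kernel(a, b) for b in rows] for a in rows]

def double_center(gram):
    """Return H K H with H = I - (1/n) 11^T, written out element-wise."""
    n = len(gram)
    row_mean = [sum(row) / n for row in gram]
    col_mean = [sum(gram[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_mean) / n
    return [[gram[i][j] - row_mean[i] - col_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]

def hsic(xs, ys, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC: tr(K H L H) / (n - 1)^2."""
    n = len(xs)
    k_centered = double_center(rbf_gram(xs, sigma_x))
    l_gram = rbf_gram(ys, sigma_y)
    # tr(H K H L) equals tr(K H L H) by cyclicity of the trace.
    trace = sum(k_centered[i][j] * l_gram[j][i]
                for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2

def fisher_diagonal(weights, samples):
    """Empirical Fisher diagonal for a toy model v = w . x under the
    squared-error loss (v - u)^2, where u is the target velocity.
    (Assumption: stands in for the flow-matching action loss.)"""
    fisher = [0.0] * len(weights)
    for x, u in samples:
        residual = sum(w * xi for w, xi in zip(weights, x)) - u
        grad = [2.0 * residual * xi for xi in x]  # d loss / d w
        fisher = [f + g * g for f, g in zip(fisher, grad)]
    return [f / len(samples) for f in fisher]

# Toy check: features that determine the "action" score higher than
# features paired with a constant action.
features = [[i / 10.0] for i in range(20)]
actions_dependent = [[x[0] ** 2] for x in features]
actions_constant = [[0.0] for _ in features]
score_dependent = hsic(features, actions_dependent)
score_constant = hsic(features, actions_constant)

# The second weight never sees a non-zero input, so its Fisher entry is zero.
fisher = fisher_diagonal([0.5, 0.0], [([1.0, 0.0], 1.0), ([2.0, 0.0], 0.0)])
```

A higher HSIC score marks a tensor whose signal is more statistically dependent on the actions, and a larger Fisher entry marks a weight whose perturbation moves the loss more. How ActQuant pairs tensors with actions and feeds the Fisher values to llama-quantize is not shown here.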

This variant uses Q2_K as the LLM allocator's base type (tensors ranked by the HSIC `sens` score), quantizes the vision tower to Q3_K, and keeps the action expert at fp16.
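As a rough illustration of budgeted allocation from a Q2_K base with Q4_K upgrades, the sketch below runs a plain greedy pass: every tensor starts at the base type, and the most sensitive tensors are upgraded while the bit budget allows. It is a simplification under stated assumptions, not the ActQuant allocator: it ignores the per-layer L² criterion, the tensor names and sensitivity scores are hypothetical, and only the bits-per-weight figures (2.625 for Q2_K, 4.5 for Q4_K, from llama.cpp's 256-weight k-quant blocks) are taken as given.

```python
# Bits per weight of llama.cpp k-quant blocks (84 and 144 bytes per 256 weights).
BPW = {"Q2_K": 2.625, "Q4_K": 4.5}

def greedy_allocate(tensors, budget_bpw, base="Q2_K", upgrade="Q4_K"):
    """Assign a quant type per tensor under an average-BPW budget.

    tensors: list of (name, n_params, sensitivity) tuples (hypothetical layout).
    Returns (allocation dict, achieved average BPW).
    """
    total_params = sum(n for _, n, _ in tensors)
    budget_bits = budget_bpw * total_params
    used_bits = BPW[base] * total_params
    allocation = {name: base for name, _, _ in tensors}

    # Visit tensors from most to least sensitive; upgrade when it still fits.
    for name, n_params, _ in sorted(tensors, key=lambda t: t[2], reverse=True):
        extra_bits = (BPW[upgrade] - BPW[base]) * n_params
        if used_bits + extra_bits <= budget_bits:
            allocation[name] = upgrade
            used_bits += extra_bits
    return allocation, used_bits / total_params

# Hypothetical tensors: (name, parameter count, sensitivity score).
toy_tensors = [("blk.0.attn_q", 100, 0.9),
               ("blk.0.ffn_up", 200, 0.5),
               ("blk.0.attn_k", 100, 0.1)]
allocation, achieved_bpw = greedy_allocate(toy_tensors, budget_bpw=3.1)
```

With this toy input only the most sensitive tensor fits an upgrade, so the average lands just under the 3.1 BPW budget.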

Bits-per-weight breakdown

Section	Params	BPW
Vision (SigLIP, Q3_K)	449 M	3.56
LLM blocks (HSIC-allocated Q2_K base, selective Q4_K upgrades)	1.98 B	2.98
Vision + LLM blocks (headline)	2.43 B	3.09
Token embedding (Q8_0)	527 M	8.50
Action expert / flow head (fp16)	430 M	16.00
Full checkpoint footprint		2.2 GB

The 3 BPW headline (3.09 measured) covers only the vision and LLM blocks. It excludes the token embedding and action-expert tensors, following the usual convention in quantization papers, because those tensors are not targets of ActQuant's block-wise allocation.
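The headline figure and the file size can be checked from the table with a parameter-weighted average. The sketch below uses the rounded parameter counts from the table, so the results are approximate. Reading the reported 2.2 GB as binary gigabytes (GiB) is an assumption, and GGUF metadata overhead is ignored.

```python
# (parameter count, bits per weight) per section, as listed in the table.
SECTIONS = {
    "vision": (449e6, 3.56),
    "llm_blocks": (1.98e9, 2.98),
    "token_embedding": (527e6, 8.50),
    "action_expert": (430e6, 16.00),
}

def total_bits(names):
    """Sum of params * BPW over the named sections."""
    return sum(SECTIONS[n][0] * SECTIONS[n][1] for n in names)

def average_bpw(names):
    """Parameter-weighted average bits per weight over the named sections."""
    params = sum(SECTIONS[n][0] for n in names)
    return total_bits(names) / params

headline_bpw = average_bpw(["vision", "llm_blocks"])
footprint_gib = total_bits(SECTIONS) / 8 / 2**30

print(f"headline: {headline_bpw:.2f} BPW")
print(f"footprint: {footprint_gib:.1f} GiB")
```

The weighted average over the vision and LLM blocks rounds to 3.09, and the four sections together come to roughly 2.2 GiB, in line with the table.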

LIBERO closed-loop results

Per-suite and aggregate success rates across all four LIBERO suites, 500 trials per suite (2 000 total). All runs go through the C++/GGML runtime via the pi05.so pybind11 binding, the same code path that runs at deployment:

Suite	Success rate
`libero_spatial`	98.2 %
`libero_object`	98.8 %
`libero_goal`	95.0 %
`libero_10` (long horizon)	87.2 %
Aggregate	94.8 %

For reference, llama.cpp's blind round-to-nearest quantization at the same Q2_K base scores 94.3 % aggregate, so the two-stage ActQuant pipeline is 0.5 pp higher in aggregate success (94.8 % vs 94.3 %) while adding sensitivity-driven per-tensor bit allocation. Gains over the blind heuristic grow at sub-3-bit budgets.
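The aggregate row and the gap to the baseline follow directly from the per-suite numbers. The short check below converts each rate into a success count, assuming exactly 500 trials per suite as stated, and recomputes both figures. The variable names are illustrative only.

```python
TRIALS_PER_SUITE = 500
BASELINE_AGGREGATE = 94.3  # llama.cpp round-to-nearest at the Q2_K base, in %

success_rate = {
    "libero_spatial": 98.2,
    "libero_object": 98.8,
    "libero_goal": 95.0,
    "libero_10": 87.2,
}

# Convert percentages back to whole success counts per suite.
successes = {suite: round(rate / 100 * TRIALS_PER_SUITE)
             for suite, rate in success_rate.items()}

total_trials = TRIALS_PER_SUITE * len(success_rate)
aggregate = 100 * sum(successes.values()) / total_trials
gap_pp = aggregate - BASELINE_AGGREGATE

print(f"{sum(successes.values())}/{total_trials} successes")
print(f"aggregate {aggregate:.1f} %, gap {gap_pp:.1f} pp")
```

Because every suite has the same number of trials, the aggregate equals the plain mean of the four suite rates.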

Files

File	Size	Purpose
`pi05.gguf`	2.2 GB	Merged vision + LLM + action-expert GGUF (the deployable artifact)
`tokenizer.model`	4.1 MB	PaliGemma SentencePiece tokenizer
`norm_stats.json`	1.7 KB	LIBERO action-quantile normalization stats
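For readers unfamiliar with quantile normalization, the sketch below shows one common convention: map the 1st to 99th percentile range of each action dimension to [-1, 1], and invert that mapping on the policy output. Whether `norm_stats.json` uses exactly this formula, and the key names `actions`, `q01` and `q99`, are assumptions. The stats in the example are made up, not read from the file.

```python
import json

EPS = 1e-6  # guards against a zero-width quantile range

def normalize(action, q01, q99):
    """Map each dimension so that q01 -> -1 and q99 -> +1 (approximately)."""
    return [(a - lo) / (hi - lo + EPS) * 2.0 - 1.0
            for a, lo, hi in zip(action, q01, q99)]

def unnormalize(norm_action, q01, q99):
    """Inverse of normalize: map model outputs back to raw action units."""
    return [(n + 1.0) / 2.0 * (hi - lo + EPS) + lo
            for n, lo, hi in zip(norm_action, q01, q99)]

# Hypothetical stats for a 2-D action; the real file's layout may differ.
stats = json.loads('{"actions": {"q01": [-0.5, 0.0], "q99": [0.5, 2.0]}}')["actions"]

raw_action = [0.25, 1.0]
scaled = normalize(raw_action, stats["q01"], stats["q99"])
restored = unnormalize(scaled, stats["q01"], stats["q99"])
```

The round trip returns the original action, which is the property a serving path relies on when it converts normalized policy outputs back into robot commands.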

Run LIBERO evaluation

huggingface-cli download NU-World-Model-Embodied-AI/ActQuant-Pi05-LIBERO-3bpw \
    --local-dir /path/to/eval_dir

# From the ActQuant build tree (build_openpi/) — see paper repo:
cd /path/to/ActQuant
for suite in libero_spatial libero_object libero_goal libero_10; do
    bash tools/pi0.5/run_libero_eval.sh "$suite" 50 5 8 8000 /path/to/eval_dir
done

The launcher spawns one serve_policy.py process per GPU, each loading pi05.gguf through the pi05.so binding, and the LIBERO client talks to these servers over WebSocket.

Reproduce this exact checkpoint from scratch

The complete pipeline is in github.com/arashakb/ActQuant. After building (build_openpi/) and reconstructing the public LIBERO calibration set:

# Stage 0 — Export the LIBERO-finetuned bf16 Pi 0.5 to GGUF (vision Q3_K,
# everything else preserved). This produces the merge target.
python tools/pi0.5/export_pi05.py \
    -d /path/to/pi05_libero_finetuned_v044 \
    -o /path/to/pi05_libero_base_gguf \
    --quant_vision q3k

# Stage 0b — Export the standalone bf16 PaliGemma LLM (this is what
# llama-quantize will quantize in Stage 1+2).
python tools/pi0.5/export_pi05_llm.py \
    -d /path/to/pi05_libero_finetuned_v044 \
    -o /path/to/pi05_libero_base_gguf/pali_llm_bf16.gguf

# Stage 2 — Compute the AMF Fisher imatrix once
#   (60-episode LIBERO calibration: 10 spatial / 10 object / 10 goal / 30 long;
#    LIBERO is public — reconstruct via get_pi05_calib_data.py).
python tools/fisher-diag/get_pi05_calib_data.py \
    --output-dir /path/to/calib_data_raw
python tools/fisher-diag/compute_fisher_pi05.py \
    --checkpoint /path/to/pi05_libero_finetuned_v044 \
    --calib-dir  /path/to/calib_data_raw \
    --output     /path/to/pi05_libero_base_gguf/fisher_flow_perweight.gguf \
    --num-gpus 8 --batch-size 6

# Stage 1 + 2 + merge — chained by run_hsic_quant_pi05.sh
bash tools/hsic/run_hsic_quant_pi05.sh \
    --base-type Q2_K \
    --max-type  Q4_K \
    --score-key sens \
    --num-gpus  8

Output: pi05_q2k_hsic_sens_v3k.gguf (rename to pi05.gguf for serving).

This is the exact recipe (--base-type Q2_K --score-key sens against a Q3_K-vision base, AMF imatrix, default merge) that produced the checkpoint in this repository.

Citation

@article{actquant2026,
  title  = {ActQuant: Sub-4-bit Action-Guided Quantization for Vision-Language-Action Models},
  author = {Akbari, Arash and others},
  journal= {arXiv preprint arXiv:2605.24011},
  year   = {2026}
}

License

MIT (inherited from the llama.cpp upstream build infrastructure). Pi 0.5 model weights are bound by the upstream OpenPI license and the underlying PaliGemma terms.