BehaPi · π0.5 LIBERO Ablation Checkpoints

Five 30k-step LoRA-free fine-tunes of pi05_base on LIBERO (4 task suites, 10 tasks each), trained on 8×A800 with FSDP. These are the companion checkpoints for the ablation study in github.com/Xuewei-Huang/BehaPi.

Results (each config: 4 suites × 10 tasks × 50 trials = 2000 evaluation episodes)

| Config | Path | Spatial | Object | Goal | LIB10 | Mean | Δ vs. baseline (pp) |
|---|---|---|---|---|---|---|---|
| Baseline (vanilla pi0.5) | libero_baseline_a800_train/30000/ | 91.8 | 94.6 | 90.8 | 86.2 | 90.85 | - |
| T1: Per-timestamp Normalization | t1_per_ts_norm_a800_train/30000/ | 94.0 | 96.0 | 92.4 | 88.2 | 92.65 | +1.80 |
| T2: Correlated FM + K=8 Multi-sample | t2_corr_multi_a800_train/30000/ | 91.4 | 97.0 | 92.0 | 87.8 | 92.05 | +1.20 |
| T3: KV Transform | t3_kv_transform_a800_train/30000/ | 90.4 | 97.8 | 92.6 | 85.0 | 91.45 | +0.60 |
| Combined (T1+T2+T3) | combined_all_a800_train/30000/ | 93.0 | 96.0 | 92.6 | 87.8 | 92.35 | +1.50 |

The full per-suite analysis and debugging notes are in RESULTS.md in the code repo.
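The Mean and Δ columns can be re-derived from the four per-suite scores. The sketch below is only a sanity check on the table's arithmetic, not part of the BehaPi code: the dictionary keys and function names are made up for illustration, and the scores are read here as percentages. An unweighted mean is used because every suite contributes the same number of episodes (10 tasks × 50 trials).

```python
# Per-suite scores copied from the results table, in the order
# (Spatial, Object, Goal, LIB10). The short keys are illustrative labels.
RESULTS = {
    "baseline": (91.8, 94.6, 90.8, 86.2),
    "t1": (94.0, 96.0, 92.4, 88.2),
    "t2": (91.4, 97.0, 92.0, 87.8),
    "t3": (90.4, 97.8, 92.6, 85.0),
    "combined": (93.0, 96.0, 92.6, 87.8),
}


def mean_score(scores):
    """Unweighted mean over suites, rounded to two decimals as in the table."""
    return round(sum(scores) / len(scores), 2)


def summarize(results, reference="baseline"):
    """Map each config to (mean, delta vs. the reference config in pp)."""
    ref = mean_score(results[reference])
    return {
        name: (mean_score(scores), round(mean_score(scores) - ref, 2))
        for name, scores in results.items()
    }


SUMMARY = summarize(RESULTS)
for name, (mean, delta) in SUMMARY.items():
    print(f"{name:<9} mean={mean:.2f} delta={delta:+.2f}")
```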

Key findings

  1. T1 (per-timestamp normalization) is the single biggest contributor (+1.80pp). Smallest code change, biggest impact.
  2. Combined is sub-additive: +1.50pp, against +3.60pp if the three individual gains simply added up, and less than T1 alone.
  3. T3's KV mixing matrix is still ≈ identity at 30k steps (mean ≈ 1/18, Frobenius-from-identity ≈ 0). The trick is under-trained, not ineffective.
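The "≈ identity" observation in finding 3 rests on two numbers: the mean entry of the mixing matrix and its Frobenius distance from the identity. A minimal sketch of both is given below, assuming the matrix is available as a square list of lists. The size 18 is inferred from "mean ≈ 1/18" (an N×N identity has mean entry 1/N); extracting the matrix from a checkpoint is not shown, and the function name is hypothetical.

```python
import math


def identity_diagnostics(matrix):
    """Return (mean entry, Frobenius distance from the identity) of a square matrix."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("matrix must be square")
    mean_entry = sum(sum(row) for row in matrix) / (n * n)
    # Sum of squared differences between the matrix and the identity.
    squared_error = sum(
        (matrix[i][j] - (1.0 if i == j else 0.0)) ** 2
        for i in range(n)
        for j in range(n)
    )
    return mean_entry, math.sqrt(squared_error)


# Assumed size, inferred from the reported mean of about 1/18.
N = 18
identity = [[1.0 if i == j else 0.0 for j in range(N)] for i in range(N)]
mean_entry, frobenius = identity_diagnostics(identity)
print(f"mean={mean_entry:.4f} frobenius={frobenius:.4f}")
```

A matrix that has actually learned to mix entries would show a Frobenius distance clearly above zero, so tracking this number over training steps is one way to tell when T3 starts to have an effect.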

Download a single checkpoint

from huggingface_hub import snapshot_download
snapshot_download(
    repo_id="Xuewei-Huang/BehaPi-ckpts",
    allow_patterns="t1_per_ts_norm_a800_train/*",
    local_dir="./behapi_ckpts",
)
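To compare configs side by side, several checkpoints can be fetched in one call, since `allow_patterns` also accepts a list of patterns. The sketch below assumes the repo layout matches the Path column of the results table (`<config>/30000/`); the two helper functions are illustrative and not part of the BehaPi code.

```python
def checkpoint_patterns(config_names, step=30000):
    """Build one allow_patterns entry per config, limited to a single step directory."""
    return [f"{name}/{step}/*" for name in config_names]


def download_checkpoints(config_names, local_dir="./behapi_ckpts", step=30000):
    """Download the selected checkpoints and return the local snapshot path."""
    # Imported lazily so the pattern helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="Xuewei-Huang/BehaPi-ckpts",
        allow_patterns=checkpoint_patterns(config_names, step),
        local_dir=local_dir,
    )


# Example call (performs a network download):
# download_checkpoints(["libero_baseline_a800_train", "t1_per_ts_norm_a800_train"])
```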

Load with openpi

from openpi.policies import policy_config as pc
from openpi.training import config as cfg

# The config name must match both the checkpoint directory and the checked-out branch.
train_cfg = cfg.get_config("t1_per_ts_norm_a800_train")
policy = pc.create_trained_policy(train_cfg, "./behapi_ckpts/t1_per_ts_norm_a800_train/30000")
# policy.infer({"observation/image": ..., "observation/wrist_image": ..., "observation/state": ..., "prompt": ...})

You'll need the matching code from the corresponding branch:

  • t1_per_ts_norm_a800_train → branch dev/trick/per-ts-norm
  • t2_corr_multi_a800_train → branch dev/trick/correlated-and-multi
  • t3_kv_transform_a800_train → branch dev/trick/kv-transform
  • combined_all_a800_train → branch dev/trick/combined-all
  • libero_baseline_a800_train → branch dev/port-b1k-tricks
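Since each checkpoint only loads against its own branch, it can help to keep the mapping above in one place. The following is a small sketch with hypothetical helper names; the config and branch names are copied from the list.

```python
# Config name -> branch, copied from the list above.
BRANCH_FOR_CONFIG = {
    "t1_per_ts_norm_a800_train": "dev/trick/per-ts-norm",
    "t2_corr_multi_a800_train": "dev/trick/correlated-and-multi",
    "t3_kv_transform_a800_train": "dev/trick/kv-transform",
    "combined_all_a800_train": "dev/trick/combined-all",
    "libero_baseline_a800_train": "dev/port-b1k-tricks",
}


def branch_for(config_name):
    """Return the branch whose code matches the given checkpoint config."""
    try:
        return BRANCH_FOR_CONFIG[config_name]
    except KeyError:
        known = ", ".join(sorted(BRANCH_FOR_CONFIG))
        raise ValueError(f"unknown config {config_name!r}; expected one of: {known}") from None


def checkout_command(config_name):
    """Build the git command, as an argument list, that switches to the matching branch."""
    return ["git", "checkout", branch_for(config_name)]


print(" ".join(checkout_command("t1_per_ts_norm_a800_train")))
```

The returned list can be passed to `subprocess.run` from inside a clone of the code repo.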

Provenance

  • Base: gs://openpi-assets/checkpoints/pi05_base/params (Physical Intelligence)
  • Fine-tuned on: LIBERO 4 suites (xuewei-huang/libero HF dataset)
  • Training: 8×A800 (HKUST-GZ HPC), FSDP, 30k steps, batch 256 / 128 (×K=8 for T2)
  • Code: github.com/Xuewei-Huang/BehaPi

Citations

  • Larchenko et al. 2025, BEHAVIOR-1K Challenge 2025 1st place solution (arXiv:2512.06951) — source of the 3 tricks
  • Physical Intelligence, π0.5 — base model