PaliGemma-OFT · LIBERO (all 4 suites, joint training)

A Vision-Language-Action (VLA) checkpoint released with the AlphaBrain framework. It was trained jointly on all four LIBERO suites (Goal, Spatial, Object, and Long), so it can be evaluated across the full LIBERO benchmark without retraining.

PaliGemma-OFT-v2 couples a PaliGemma 3B VLM with a DiT-B regression action head (action_dim=7, horizon=8). The v2 recipe uses the learning-rate schedule scaled for a batch size of 128 and was trained in a single supervised run on a mixed stream of all 4 LIBERO suites via the libero_all data mix. This release is the 20 000-step checkpoint from a run with a 30 000-step budget, and it is the best-performing LIBERO all-4-suite checkpoint in the AlphaBrain PaliGemmaOFT family.

Overview


Architecture	PaliGemmaOFT_v2 (PaliGemma 3B + DiT-B regression head)
Base VLM	`google/paligemma-3b-pt-224`
Action head	DiT-B, `hidden_size=2048`, `action_dim=7`, `state_dim=7`, horizon 8
Training data	LIBERO · all 4 suites (Goal + Spatial + Object + Long) · `dataset_mix=libero_all`
Training type	Supervised fine-tuning (single run; not continual learning)
Attention	flash_attention_2
Optimiser	AdamW · `lr_base = 2.5e-5` · cosine-with-min-lr · 500 warmup steps
Training steps	20 000 (this release) · of a 30 000-step budget
Hardware / batch	4 × A800 80 GB · `per_device_batch_size = 32` · `effective_batch = 128`
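
The batch and schedule figures in the table can be related with a short sketch. The values for GPU count, per-device batch size, base learning rate, warmup, and step counts come from the table; the absence of gradient accumulation, the minimum learning-rate ratio, and the exact shape of the warmup and decay are assumptions made only for illustration, so check framework_config.yaml for the real settings.

```python
import math

# Values taken from the overview table.
NUM_GPUS = 4
PER_DEVICE_BATCH_SIZE = 32
LR_BASE = 2.5e-5
WARMUP_STEPS = 500
TOTAL_STEPS = 30_000    # full step budget
RELEASE_STEP = 20_000   # step of the released checkpoint

# Assumptions, not stated in the model card.
GRAD_ACCUM_STEPS = 1    # assume no gradient accumulation
MIN_LR_RATIO = 0.1      # hypothetical floor, as a fraction of LR_BASE

effective_batch = NUM_GPUS * PER_DEVICE_BATCH_SIZE * GRAD_ACCUM_STEPS


def lr_at(step: int) -> float:
    """Linear warmup to LR_BASE, then cosine decay to MIN_LR_RATIO * LR_BASE."""
    if step < WARMUP_STEPS:
        return LR_BASE * step / WARMUP_STEPS
    progress = min(1.0, (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    min_lr = LR_BASE * MIN_LR_RATIO
    return min_lr + (LR_BASE - min_lr) * cosine


print("effective batch:", effective_batch)
print("samples seen at release:", RELEASE_STEP * effective_batch)
for step in (0, WARMUP_STEPS, RELEASE_STEP, TOTAL_STEPS):
    print(step, f"{lr_at(step):.3e}")
```

Under these assumptions the released checkpoint sits partway down the cosine decay, since the schedule is laid out over the full 30 000-step budget rather than the 20 000 steps actually released.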

Results

This checkpoint was evaluated on all 4 LIBERO suites with 50 rollouts per task × 10 tasks per suite = 500 episodes per suite (2 000 episodes in total).

Suite	Success Rate
LIBERO-Goal	95.8 %
LIBERO-Spatial	95.0 %
LIBERO-Object	97.8 %
LIBERO-10 (Long)	83.2 %
Avg (4-suite)	92.95 %
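
The 4-suite average can be re-derived from the table as a quick check. The sketch below uses only the reported rates and the stated evaluation protocol; the per-suite success counts are inferred from the percentages, not taken from evaluation logs.

```python
# Per-suite success rates (%) copied from the results table.
success_rate = {
    "LIBERO-Goal": 95.8,
    "LIBERO-Spatial": 95.0,
    "LIBERO-Object": 97.8,
    "LIBERO-10 (Long)": 83.2,
}

TASKS_PER_SUITE = 10
ROLLOUTS_PER_TASK = 50
episodes_per_suite = TASKS_PER_SUITE * ROLLOUTS_PER_TASK

# Successful episodes implied by each reported rate (inferred, not logged).
successes = {
    name: round(rate / 100 * episodes_per_suite)
    for name, rate in success_rate.items()
}

# Unweighted mean over suites, and the rate pooled over all episodes.
macro_avg = sum(success_rate.values()) / len(success_rate)
pooled_avg = 100 * sum(successes.values()) / (episodes_per_suite * len(success_rate))

print(successes)
print(f"macro average:  {macro_avg:.2f} %")
print(f"pooled average: {pooled_avg:.2f} %")
```

Because every suite has the same number of episodes, the unweighted mean over suites and the pooled rate over all episodes coincide.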

Files

├── README.md                   model card
├── framework_config.yaml       AlphaBrain framework configuration
├── dataset_statistics.json     action normalisation statistics (required for inference)
├── model.safetensors           full VLA weights (~6.5 GB)
├── resume_meta.json            training metadata (step count, GPU count)
└── paligemma_pretrained/       PaliGemma tokenizer + preprocessor configs
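
dataset_statistics.json is required because the action head predicts normalised actions that must be mapped back to robot units. The sketch below illustrates the idea under loud assumptions: the model card does not specify the file layout or the normalisation scheme, so the `action` / `min` / `max` keys, the [-1, 1] min-max scheme, and the `unnormalize_chunk` helper are all hypothetical. Only `action_dim=7` and the horizon of 8 come from the card.

```python
import json

ACTION_DIM = 7  # from the model card
HORIZON = 8     # from the model card


def unnormalize_chunk(chunk, low, high):
    """Map each action dimension from [-1, 1] to [low[d], high[d]].

    Hypothetical helper: min-max normalisation is an assumption.
    """
    if len(low) != ACTION_DIM or len(high) != ACTION_DIM:
        raise ValueError("expected one bound per action dimension")
    return [
        [0.5 * (a + 1.0) * (hi - lo) + lo for a, lo, hi in zip(step, low, high)]
        for step in chunk
    ]


# Made-up statistics for illustration; the real file layout may differ.
stats = json.loads(
    '{"action": {"min": [-1, -1, -1, -0.5, -0.5, -0.5, 0],'
    ' "max": [1, 1, 1, 0.5, 0.5, 0.5, 1]}}'
)
low, high = stats["action"]["min"], stats["action"]["max"]

# Stand-in for one predicted action chunk of shape (HORIZON, ACTION_DIM).
normalized = [[0.0] * ACTION_DIM for _ in range(HORIZON)]
actions = unnormalize_chunk(normalized, low, high)
print(len(actions), len(actions[0]), actions[0])
```

In practice the AlphaBrain model server presumably performs this step itself; the sketch only shows why inference cannot run without the statistics file.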

Usage

git clone https://github.com/AlphaBrainGroup/AlphaBrain.git
cd AlphaBrain
pip install -e .

export PRETRAINED_MODELS_DIR=/path/to/models   # must contain paligemma-3b-pt-224/

huggingface-cli download AlphaBrainGroup/paligemma-oft-libero-all4suite \
    --local-dir ./paligemma_oft_libero_all

python deployment/model_server/server_policy.py \
    --ckpt_path ./paligemma_oft_libero_all --port 10093 --use_bf16

For evaluation on any of the 4 LIBERO suites, see the LIBERO eval pipeline.
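
Before starting the server, it can help to confirm that the downloaded checkpoint and the base VLM directory are where the commands above expect them. The following is a minimal sketch, not part of AlphaBrain: the `missing_entries` helper is hypothetical, and the choice of which listed files to check is a guess at what inference needs.

```python
import os
from pathlib import Path

# Names from the Files section; which ones inference needs is an assumption.
EXPECTED_CKPT_ENTRIES = [
    "framework_config.yaml",
    "dataset_statistics.json",
    "model.safetensors",
    "paligemma_pretrained",
]
BASE_VLM_DIRNAME = "paligemma-3b-pt-224"


def missing_entries(ckpt_dir, models_dir):
    """Return the expected paths that do not exist on disk."""
    ckpt_dir = Path(ckpt_dir)
    missing = [
        str(ckpt_dir / name)
        for name in EXPECTED_CKPT_ENTRIES
        if not (ckpt_dir / name).exists()
    ]
    base_vlm = Path(models_dir) / BASE_VLM_DIRNAME
    if not base_vlm.is_dir():
        missing.append(str(base_vlm))
    return missing


if __name__ == "__main__":
    ckpt_dir = "./paligemma_oft_libero_all"
    models_dir = os.environ.get("PRETRAINED_MODELS_DIR", "")
    problems = missing_entries(ckpt_dir, models_dir)
    for path in problems:
        print("missing:", path)
    if not problems:
        print("checkpoint and base VLM directories look complete")
```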

Reproduction

# Framework's base VLA training entry
bash scripts/run_base_vla/train.sh paligemma_oft_v2_all_30k

Expect multi-day training on 4 × A800 80 GB for the full 30 000-step schedule. The shipped framework_config.yaml is the exact training configuration used for this checkpoint.

Notes

This is a joint-training baseline, not a continual-learning (CL) model. For the CL releases, see AlphaBrainGroup/qwengr00t-cl-libero-goal and qwengr00t-cl-lora-libero-goal.
v2 denotes the recipe with the learning rate scaled for a batch size of 128 (versus the earlier batch-size-64 baseline).
LIBERO-10 is the long-horizon suite; its per-task success rate (SR) is lower, as expected, because of longer demonstrations and multi-stage tasks.

License

MIT — see the parent repository.

Citation

@misc{alphabrain2026,
  title  = {AlphaBrain: A Modular Open-Source Framework for Embodied Intelligence Research},
  author = {AlphaBrain Team},
  year   = {2026},
  url    = {https://github.com/AlphaBrainGroup/AlphaBrain}
}
