MemoryVLA Multi-Task LoRA Adapters: libero-100 base (Plan B)
LoRA-only fine-tune of shihao1895/memvla-libero-100 (a fully-trained MemoryVLA with DiT-L action expert on LIBERO-100 data) for 7 robosuite_pomdp manipulation tasks.
This is the Plan-B recipe: it starts from a domain-matched MemoryVLA ckpt. LIBERO is built on robosuite and MuJoCo and uses the same Panda arm, 7-DoF EEF delta actions, gripper, and 20 Hz control rate, so LoRA only has to do small-domain adaptation rather than learn action generation from scratch. It replaces the prior Wr3ck1Am/memoryvla-multitask-lora recipe, which started from bare OpenVLA-7B and never got the DiT action heads to converge.
Recipe
- Base: shihao1895/memvla-libero-100 (fully-trained MemoryVLA, 33.5 GB)
- LoRA:
  - DiT-L attn: [q, v, out] r=16 α=32
  - LLaMA: [q_proj, v_proj] r=8 α=16
  - SigLIP vision: off
  - CogMem cross-attn: off
  - CogMem GateFusion: off
- modules_to_save: [per_compr] only (robosuite_pomdp BottleneckSE)
- Trainable: 8.62 M (0.103 % of 8.4 B)
- Optimizer: AdamW lr 2e-4, constant scheduler, no warmup (matches official memvla recipe)
- Hardware: 4× A100 80GB, FSDP full-shard, BF16 mixed precision + grad checkpointing
- Global batch: 384 (per-device 96 × 4 GPU × accum 1)
- Dataset: 330 k samples × 7 tasks, balanced sampling, image_aug=True
- Save interval: every 500 steps
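As a quick sanity check on the numbers in this list, the sketch below recomputes the global batch size, the trainable-parameter percentage, and the LoRA scaling factor. It assumes the conventional LoRA scaling of alpha / r, which the recipe does not state explicitly, and the variable names are illustrative only.

```python
# Recompute derived quantities from the recipe (values copied from the list above).
per_device_batch = 96
num_gpus = 4
grad_accum = 1
global_batch = per_device_batch * num_gpus * grad_accum

trainable_params = 8.62e6
total_params = 8.4e9
trainable_pct = 100.0 * trainable_params / total_params

# Assumption: standard LoRA scales the low-rank update by alpha / r.
dit_scale = 32 / 16
llama_scale = 16 / 8

print(f"global batch: {global_batch}")
print(f"trainable: {trainable_pct:.3f} %")
print(f"LoRA scale (DiT-L, LLaMA): {dit_scale}, {llama_scale}")
```

Under that assumption both adapter groups use the same effective scale, so the DiT-L and LLaMA adapters differ only in rank.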
Tasks
| Task | Instruction | Image key |
|---|---|---|
| fruit_swap | swap magenta/blue blocks via empty colored region | peg_focus_view_image |
| button_lightbulb | press buttons L→R; turn off non-target; only target lit | agentview_image |
| find_soda | open drawer; if no soda close; else place on orange target | drawerview_image |
| insert_peg | try holes in random order without repetition until peg inserted | peg_focus_view_image |
| lego_stacking | try-stack to find stackable on top, non-stackable on bottom | peg_focus_view_image |
| uncover_block | lift covers without repetition until hidden red block found | agentview_image |
| open_doors | pull doors in random order to find openable one | doorview_image |
Dataset: harrywang01/image-tasks-all + standalone image-findsoda-fixed + image-legostacking-pegfocus.
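Because each task was trained on a single camera, a rollout loop has to feed the matching view. The sketch below copies the task-to-image-key mapping from the table and adds a small lookup helper. The helper name `select_image` is hypothetical, and it assumes the environment returns observations as a dict keyed by camera name.

```python
# Camera key per task, copied from the task table above.
TASK_IMAGE_KEYS = {
    "fruit_swap": "peg_focus_view_image",
    "button_lightbulb": "agentview_image",
    "find_soda": "drawerview_image",
    "insert_peg": "peg_focus_view_image",
    "lego_stacking": "peg_focus_view_image",
    "uncover_block": "agentview_image",
    "open_doors": "doorview_image",
}


def select_image(task: str, obs: dict):
    """Return the camera frame that the given task was trained on.

    Hypothetical helper: assumes `obs` maps camera keys to image arrays.
    """
    key = TASK_IMAGE_KEYS[task]
    if key not in obs:
        raise KeyError(f"observation for {task!r} lacks {key!r}; got {sorted(obs)}")
    return obs[key]
```

Failing loudly on a missing key is deliberate: silently falling back to another camera would feed the policy a view it never saw during fine-tuning.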
Files
- step-*.adapter: LoRA-only ckpts (~150 MB each), saved every 500 steps
- config.yaml: full training config
- dataset_statistics.json: per-task action mean/std + min/max
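With a checkpoint saved every 500 steps, it can be handy to pick the most recent adapter programmatically. The following is a minimal sketch: `latest_adapter` is a hypothetical helper, and it assumes the integer right after `step-` in each filename is the training step.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: filenames begin with "step-<integer>", e.g. step-001500-....adapter
_STEP_RE = re.compile(r"^step-(\d+)")


def latest_adapter(ckpt_dir: str) -> Optional[Path]:
    """Return the step-*.adapter file with the highest step number, or None."""
    best, best_step = None, -1
    for path in Path(ckpt_dir).glob("step-*.adapter"):
        match = _STEP_RE.match(path.name)
        if match and int(match.group(1)) > best_step:
            best, best_step = path, int(match.group(1))
    return best
```

Comparing parsed integers rather than sorting filenames keeps the result correct even if step numbers are not zero-padded.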
Quick Use (rollout)
import torch, sys
sys.path.insert(0, "third_party/MemoryVLA")
from vla import load_vla
from memory_diffusion_policy.policy.memoryvla_lora import (
    MemoryVLALoRAConfig, apply_memoryvla_lora,
)

# 1. Load base
vla = load_vla(
    model_id_or_path="<path to memvla-libero-100.pt>",
    load_for_training=False,
    future_action_window_size=15, action_model_type="DiT-L",
    per_token_size=256, mem_length=16, retrieval_layers=2,
    fusion_type="gate", consolidate_type="tome",
)

# 2. Apply LoRA wrap (matches this checkpoint's recipe)
cfg = MemoryVLALoRAConfig(
    enabled=True, r=16, alpha=32.0, dropout=0.05,
    dit_attn_targets=["q", "v", "out"],
    lora_llama=True, llama_r=8, llama_alpha=16, llama_targets=["q_proj", "v_proj"],
    lora_cog_cross=False, lora_cog_gate=False, lora_vision=False,
    modules_to_save_list=["per_compr"],
)
apply_memoryvla_lora(vla, cfg, log=...)

# 3. Load adapter
adapter = torch.load("step-XXXXX-...adapter", map_location="cpu")
vla.load_state_dict(adapter["adapter"], strict=False)

# 4. Inference: see MemoryVLA repo eval scripts
Get the base ckpt: huggingface-cli download shihao1895/memvla-libero-100.
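Step 3 above loads the adapter with `strict=False`, which means a mismatch between the LoRA config and the checkpoint is ignored rather than raised. A minimal sketch of a guard, assuming all you have are the two sets of state-dict keys; `unexpected_adapter_keys` is a hypothetical helper and the key names in the example are made up for illustration.

```python
def unexpected_adapter_keys(model_keys, adapter_keys):
    """Return adapter keys that have no counterpart in the wrapped model.

    A non-empty result usually means the LoRA wrap was applied with a
    different config than the one used to train the checkpoint.
    """
    model_keys = set(model_keys)
    return sorted(k for k in adapter_keys if k not in model_keys)


# Illustrative key names only; real keys depend on the wrapped model.
model_keys = ["dit.attn.q.lora_A", "dit.attn.q.lora_B", "backbone.weight"]
adapter_keys = ["dit.attn.q.lora_A", "dit.attn.q.lora_B", "dit.attn.k.lora_A"]
print(unexpected_adapter_keys(model_keys, adapter_keys))
```

With the objects from the snippet above, the call would be `unexpected_adapter_keys(vla.state_dict().keys(), adapter["adapter"].keys())`, and an empty list is the expected outcome. PyTorch's `load_state_dict` also returns a result whose `unexpected_keys` field carries the same information, so checking that return value is an equivalent option.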