MemoryVLA Single-Task LoRA (Bridge-Init) — Real-World Tasks
Per-task LoRA fine-tunes of memvla-bridge on real-robot data from
harrywang01/real_world_task.
Tasks
| Task | Instruction |
|---|---|
| can | "Swap the positions of the pink and white soda cans by using another empty location as a buffer." |
| wipe_new | "Brush the two plates with the brushes, and each brush can only be used once." |
| uncover_new | "Lift the cups to find the small cube hidden underneath, and each cup may only be lifted once." |
Recipe
- Base: memvla-bridge.pt (shihao1895/memvla-bridge).
- Action dim 7 (bridge format: Δxyz(3) + Δrpy(3) + gripper_bin(1)), produced by scripts/convert_realworld_hf_to_robomimic.py --bridge_format.
- LoRA r=32 / α=64 / dropout=0.05 + LLaMA r=16 (q_proj, v_proj) + cog gate.
- Single H100, per_device_batch_size=64, lr=1e-4, max_steps=4000, save_interval=200, lr_scheduler=linear-warmup+cosine-decay, warmup_ratio=0.05.
- repeated_diffusion_steps=1 (down from the upstream default of 4: a 4× speedup with no measured loss-curve regression).
- DataLoader num_workers=4, persistent_workers=True, prefetch_factor=2 (upstream MemoryVLA hardcoded num_workers=0; patched in KuanchengWang/diffusion_policy@jinglin).
- 46.09 M trainable params (8.39 M LLaMA LoRA + 1.57 M CogMemBank cross + 0.39 M cog gate + 6.29 M DiT-L attention + 20.98 M modules_to_save (timestep encoder) + others).
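The schedule settings in the recipe imply 200 warmup steps (5% of 4000). The following is a minimal sketch of a linear-warmup plus cosine-decay curve using those values. It is not the training code: the function name lr_at is hypothetical, and decaying all the way to zero at max_steps is an assumption, since the recipe does not state a learning-rate floor.

```python
import math


def lr_at(step, max_steps=4000, base_lr=1e-4, warmup_ratio=0.05):
    """Learning rate at a given optimizer step.

    Linear warmup from 0 to base_lr, then cosine decay.
    Assumption: the decay ends at 0 (no minimum learning rate).
    """
    warmup_steps = round(max_steps * warmup_ratio)  # 200 with the recipe values
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for s in (0, 100, 200, 2100, 4000):
        print(s, lr_at(s))
```

With save_interval=200, the first saved checkpoint coincides with the end of warmup under this sketch.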
FSDP save fix (critical)
Each .adapter is a flat state_dict containing only the trainable LoRA +
modules_to_save tensors. The save code (fsdp.py:save_checkpoint):
- strips the "_fsdp_wrapped_module." and "_checkpoint_wrapped_module." wrapper prefixes (otherwise LLaMA LoRA keys were silently dropped; the earlier libero-100 ckpts had this bug);
- keeps the "vlm." prefix so eval-time vla.load_state_dict(adapter) finds vlm.llm_backbone.* matches (otherwise 128 LLaMA-LoRA keys showed up as "unexpected" and LoRA was silently inactive);
- hard-asserts every trainable param ends up in the saved dict.
Verified contents per ckpt: 341 tensors, 184 MB,
{llama: 128, cog: 18, action: 155, other: 40}.
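The three save-time rules above can be illustrated with a small sketch. This is not the code in fsdp.py:save_checkpoint. The helper names (clean_key, build_adapter_state) and the example parameter names are hypothetical, and plain strings stand in for tensors so the snippet runs without PyTorch.

```python
WRAPPER_PREFIXES = ("_fsdp_wrapped_module.", "_checkpoint_wrapped_module.")


def clean_key(name):
    """Remove FSDP / activation-checkpoint wrapper segments from a parameter name.

    Other prefixes, such as a leading "vlm.", are left untouched.
    """
    for prefix in WRAPPER_PREFIXES:
        name = name.replace(prefix, "")
    return name


def build_adapter_state(named_params):
    """Collect trainable entries under cleaned names.

    named_params: iterable of (name, value, requires_grad) triples.
    Raises AssertionError if any trainable entry is missing from the result,
    for example because two names collapsed to the same cleaned key.
    """
    trainable = [(n, v) for n, v, req in named_params if req]
    state = {clean_key(n): v for n, v in trainable}
    assert len(state) == len(trainable), "trainable parameter dropped or duplicated"
    return state


if __name__ == "__main__":
    # Hypothetical names, chosen only to show the wrapper segments.
    params = [
        ("vlm._fsdp_wrapped_module.llm_backbone._checkpoint_wrapped_module.q_proj.lora_A.weight", "A", True),
        ("vlm._fsdp_wrapped_module.llm_backbone.q_proj.weight", "W", False),
    ]
    print(build_adapter_state(params))
```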
Layout
<task>/
  checkpoints/
    step-XXXXXX-epoch-YY-loss=Z.ZZZZ.adapter   # every 200 steps (20 total)
  config.yaml                                  # run config
  dataset_statistics.json                      # task-specific action norm stats
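Because the step, epoch, and loss are encoded in each checkpoint filename, picking a checkpoint can be scripted. The sketch below parses the naming pattern shown above and returns the highest-step file in a directory. The helper names are hypothetical, and the regular expression assumes the loss is written as a plain decimal number.

```python
import re
from pathlib import Path

# Matches e.g. "step-004000-epoch-09-loss=0.07.adapter".
CKPT_RE = re.compile(r"step-(\d+)-epoch-(\d+)-loss=(\d+(?:\.\d+)?)\.adapter$")


def parse_ckpt_name(name):
    """Return (step, epoch, loss) parsed from a checkpoint filename, or None."""
    match = CKPT_RE.search(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


def latest_checkpoint(ckpt_dir):
    """Return the Path of the highest-step .adapter file in ckpt_dir, or None."""
    best = None
    for path in Path(ckpt_dir).glob("*.adapter"):
        parsed = parse_ckpt_name(path.name)
        if parsed is not None and (best is None or parsed[0] > best[0]):
            best = (parsed[0], path)
    return None if best is None else best[1]


if __name__ == "__main__":
    print(parse_ckpt_name("step-004000-epoch-09-loss=0.07.adapter"))
    print(latest_checkpoint("can/checkpoints"))
```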
Loading
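import torch
import yaml  # PyYAML, used to read the run config

from vla import load_vla
from memory_diffusion_policy.policy.memoryvla_lora import (
    MemoryVLALoRAConfig, apply_memoryvla_lora,
)

# weights_only=False unpickles arbitrary Python objects; only load files you trust.
adapter = torch.load("can/checkpoints/step-004000-epoch-09-loss=0.07.adapter",
                     map_location="cpu", weights_only=False)
# adapter = {"adapter": OrderedDict[str, Tensor],
#            "global_step": int, "epoch": int}

# Assumption: run_cfg is the run's config.yaml parsed into a dict with a "lora"
# section; adjust the path to wherever config.yaml lives for the task.
with open("can/config.yaml") as f:
    run_cfg = yaml.safe_load(f)

# At eval time (see eval_memoryvla_multitask_rollout.py):
vla = load_vla("memvla-bridge.pt", load_for_training=False)
lora_cfg = MemoryVLALoRAConfig(**run_cfg["lora"])
apply_memoryvla_lora(vla, lora_cfg)
missing, unexpected = vla.load_state_dict(adapter["adapter"], strict=False)
# missing should contain only frozen base keys that the adapter does not carry
# (benign); unexpected should be empty.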
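Since a silently inactive LoRA is the failure mode described in the FSDP section, it may be worth turning the comment above into an explicit check. The sketch below takes the missing and unexpected lists returned by load_state_dict and flags a bad load. The function name is hypothetical, and it assumes LoRA parameter names contain the substring "lora_" (the PEFT naming convention), which may not match this codebase.

```python
def check_adapter_load(missing, unexpected, lora_marker="lora_"):
    """Summarize a strict=False adapter load.

    missing / unexpected: the key lists returned by load_state_dict.
    lora_marker: substring assumed to identify LoRA parameters (an assumption).
    """
    missing_lora = [k for k in missing if lora_marker in k]
    return {
        # A clean load has no unexpected keys and no LoRA keys left unloaded.
        "ok": not unexpected and not missing_lora,
        "missing_lora": missing_lora,
        "unexpected": list(unexpected),
    }


if __name__ == "__main__":
    # Hypothetical key names for illustration only.
    report = check_adapter_load(
        missing=["vlm.llm_backbone.layers.0.q_proj.weight"],
        unexpected=[],
    )
    print(report)
```

A result with ok set to False and a non-empty unexpected list would correspond to the dropped "vlm." prefix case described earlier.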