
IsaacLab-smolVLA-SO101-Multitask-8epoch

A SmolVLA policy fine-tuned from lerobot/smolvla_base for 8 epochs on CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi_10fps, an 11-task SO101 dataset collected in IsaacLab simulation.

This checkpoint is a LoRA adapter (adapter_model.safetensors). It is loaded together with the base model lerobot/smolvla_base.

Model details

Base model: lerobot/smolvla_base (SmolVLM2-500M-Video-Instruct VLM + action expert)
Robot: SO101 (6-DOF, including the gripper), simulated in IsaacLab
Cameras: top, left_wrist (480×640), renamed to the policy keys camera1 (left_wrist) / camera2 (top)
Inputs: observation.state[6] + 2 cameras + language instruction (task)
Output: action[6] (joint position)
Action chunking: chunk_size=50, n_action_steps=50
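
The chunking settings above can be illustrated with a minimal sketch, assuming the usual action-queue pattern: the policy is queried once for a chunk of 50 actions, and all 50 are executed before the next query. `fake_predict_chunk` and `ChunkedController` are hypothetical stand-ins for illustration, not LeRobot APIs.

```python
from collections import deque

CHUNK_SIZE = 50       # actions predicted per policy call (from this card)
N_ACTION_STEPS = 50   # actions executed before the policy is queried again
ACTION_DIM = 6        # joint-position targets for the SO101


def fake_predict_chunk(call_index):
    """Hypothetical stand-in for the policy: returns CHUNK_SIZE dummy actions."""
    return [[float(call_index)] * ACTION_DIM for _ in range(CHUNK_SIZE)]


class ChunkedController:
    """Serves one action per control step from a queue of predicted actions."""

    def __init__(self, predict_chunk):
        self.predict_chunk = predict_chunk
        self.queue = deque()
        self.num_policy_calls = 0

    def select_action(self):
        # Only call the (expensive) policy when the queue has run dry.
        if not self.queue:
            chunk = self.predict_chunk(self.num_policy_calls)
            self.queue.extend(chunk[:N_ACTION_STEPS])
            self.num_policy_calls += 1
        return self.queue.popleft()


controller = ChunkedController(fake_predict_chunk)
actions = [controller.select_action() for _ in range(120)]
```

With `n_action_steps` equal to `chunk_size`, 120 control steps at 10 fps need only three policy calls in this sketch, at the cost of running open-loop for 5 seconds per chunk.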

Fine-tuning strategy (PEFT / LoRA)

Key idea: the action expert and the projection layers are fully fine-tuned, the VLM backbone gets LoRA only on q_proj/v_proj, and the rest of the VLM is completely frozen.

Trainable / Frozen breakdown

Module	Status	Notes
VLM `q_proj`, `v_proj` (attention query/value projection)	🔵 LoRA-trained	Base weights are frozen; only the low-rank adapter matrices (A, B) are trained
Everything else in the VLM: `k_proj`, `o_proj`, MLP (`gate/up/down_proj`), token/position embeddings, vision encoder (SigLIP), LayerNorm	❄️ Fully frozen	No LoRA attached and no full training
Entire action expert (`lm_expert`): attention (q/k/v/o_proj), MLP (gate/up/down_proj), LayerNorm	🔥 Full fine-tune	All weights trained directly
`state_proj` (state → token embedding)	🔥 Full fine-tune
`action_in_proj`, `action_out_proj` (action ↔ expert hidden)	🔥 Full fine-tune
`action_time_mlp_in`, `action_time_mlp_out` (flow-matching time embedding)	🔥 Full fine-tune

In short, everything in the VLM backbone other than the q_proj/v_proj adapters is frozen: the vision encoder, k_proj, o_proj, MLP, embeddings, and LayerNorm. What is trained is the LoRA adapters on the VLM q_proj/v_proj, the entire action expert, and all projection layers.
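
The breakdown above can be restated as a small classifier over parameter names. This is a sketch that follows the table in this card, not the regex stored in adapter_config.json. The example parameter names and the `vision_model` substring are assumptions made for illustration and may not match the real module paths.

```python
FULL_FINETUNE_MODULES = (
    "lm_expert",
    "state_proj",
    "action_in_proj",
    "action_out_proj",
    "action_time_mlp_in",
    "action_time_mlp_out",
)
LORA_PROJECTIONS = ("q_proj", "v_proj")


def training_status(param_name):
    """Return 'full', 'lora', or 'frozen' for a dotted parameter name."""
    parts = param_name.split(".")
    # Action expert and projection layers: all weights are trained directly.
    if any(module in parts for module in FULL_FINETUNE_MODULES):
        return "full"
    # Assumption: vision-encoder parameters contain 'vision_model' in their path.
    if "vision_model" in parts:
        return "frozen"
    # VLM attention query/value projections: frozen base weight plus LoRA adapter.
    if "vlm" in parts and any(proj in parts for proj in LORA_PROJECTIONS):
        return "lora"
    return "frozen"


# Hypothetical parameter names, for illustration only.
examples = {
    "model.vlm_with_expert.vlm.text_model.layers.0.self_attn.q_proj.weight": None,
    "model.vlm_with_expert.vlm.text_model.layers.0.self_attn.k_proj.weight": None,
    "model.vlm_with_expert.lm_expert.layers.0.self_attn.q_proj.weight": None,
    "model.state_proj.weight": None,
}
statuses = {name: training_status(name) for name in examples}
```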

LoRA / PEFT config

Item	Value
PEFT method	`LORA`
rank `r`	32
`lora_alpha`	8
`lora_dropout`	0.0
`bias`	none
`use_rslora` / `use_dora`	false / false
`target_modules` (LoRA applied)	`.vlm_with_expert\.vlm\..(q_proj\|v_proj)`
`modules_to_save` (full fine-tune)	`lm_expert`, `state_proj`, `action_in_proj`, `action_out_proj`, `action_time_mlp_in`, `action_time_mlp_out`
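
For reference, with `use_rslora` set to false the standard LoRA update scales the low-rank product by `lora_alpha / r`. Plugging in the values from the table gives the effective scale below. The symbols W_0, A, B, and x are the usual LoRA notation, not names taken from this repository.

```latex
h = W_0 x + \frac{\alpha}{r}\, B A x,
\qquad A \in \mathbb{R}^{r \times d_{\text{in}}},\;
B \in \mathbb{R}^{d_{\text{out}} \times r},
\qquad \frac{\alpha}{r} = \frac{8}{32} = 0.25
```

So the adapter contribution on each VLM `q_proj` / `v_proj` is damped to a quarter of the raw product BAx.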

Saved adapter tensors: 267 (112 LoRA A/B tensors for the VLM q_proj and v_proj; 155 fully trained tensors for the expert and projection layers).

Training hyperparameters

Item	Value
Dataset	Isaaclab-so101_11task_baseCaP_3300epi_10fps — 3,300 episodes / 1,175,352 frames / 11 tasks / 10 fps
Epochs	8
Steps	36,800
Global batch size	256 (micro batch 64 × 4 GPU × grad_accum 1)
Optimizer	AdamW — lr `1e-4`, weight_decay `1e-10`, grad_clip_norm `10.0`
LR scheduler	cosine_decay_with_warmup — warmup 1,000 / decay 30,000 / peak_lr `1e-4` / decay_lr `2.5e-6`
Seed	1000
Dataloader workers	24
Mixed precision	no (bf16 inference)
Image augmentation	ColorJitter (brightness/contrast/saturation/hue) + SharpnessJitter, up to 3 transforms applied at random; no geometric transforms (rotation/translation/flip), to preserve left/right semantics for the VLA
Hardware	4 × NVIDIA H100 80GB
Training time	About 11 hours 12 minutes
Final loss	0.016 (grad_norm 0.21)
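
The numbers in the table can be cross-checked with a short sketch. The step arithmetic uses only values from this card. The schedule function is a generic linear-warmup plus cosine-decay curve written for illustration; the exact formula behind LeRobot's `cosine_decay_with_warmup` may differ in details such as the warmup start value, so treat `lr_at` as an assumption.

```python
import math

NUM_FRAMES = 1_175_352
EPOCHS = 8
GLOBAL_BATCH = 64 * 4 * 1          # micro batch x GPUs x grad_accum
REPORTED_STEPS = 36_800

# Steps needed to see every frame EPOCHS times, and epochs actually covered.
steps_for_epochs = NUM_FRAMES * EPOCHS / GLOBAL_BATCH
epochs_covered = REPORTED_STEPS * GLOBAL_BATCH / NUM_FRAMES

PEAK_LR = 1e-4
DECAY_LR = 2.5e-6
WARMUP_STEPS = 1_000
DECAY_STEPS = 30_000


def lr_at(step):
    """Generic sketch: linear warmup, cosine decay, then constant at DECAY_LR."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    clamped = min(step, DECAY_STEPS)
    progress = (clamped - WARMUP_STEPS) / (DECAY_STEPS - WARMUP_STEPS)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return DECAY_LR + (PEAK_LR - DECAY_LR) * cosine


print(f"steps for {EPOCHS} epochs: {steps_for_epochs:.2f}")
print(f"epochs covered by {REPORTED_STEPS} steps: {epochs_covered:.3f}")
```

Eight passes over the data need roughly 36,730 steps at batch size 256, so the reported 36,800 steps correspond to slightly more than 8 epochs. Because the decay ends at step 30,000, the final 6,800 steps run at the floor learning rate under this sketch.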

Camera rename

Dataset key	Policy key
`observation.images.left_wrist`	`observation.images.camera1`
`observation.images.top`	`observation.images.camera2`
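
When feeding observations to the policy outside the training pipeline, the same mapping has to be applied to the observation keys. A minimal sketch, assuming observations arrive as a flat dictionary; `rename_observation_keys` is a hypothetical helper, not a LeRobot function.

```python
# Dataset key -> policy key, as listed in the table above.
RENAME_MAP = {
    "observation.images.left_wrist": "observation.images.camera1",
    "observation.images.top": "observation.images.camera2",
}


def rename_observation_keys(observation):
    """Return a copy of the observation with camera keys renamed for the policy."""
    return {RENAME_MAP.get(key, key): value for key, value in observation.items()}


# Placeholder values stand in for image tensors and the 6-D state vector.
example = {
    "observation.images.left_wrist": "wrist_image",
    "observation.images.top": "top_image",
    "observation.state": [0.0] * 6,
}
renamed = rename_observation_keys(example)
```

Keys that are not in the map, such as `observation.state`, pass through unchanged.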

Usage

from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy

policy = SmolVLAPolicy.from_pretrained("CoRL2026-CSI/IsaacLab-smolVLA-SO101-Multitask-8epoch")

Citation / Acknowledgement

Built on top of LeRobot and the SmolVLA base checkpoint. Project: CoRL 2026 CSI submission.

Framework versions

PEFT 0.19.1
LeRobot 0.5.2
