How to use arrow-hf/smolvla-robotwin-place-object-basket-50ep-multi with LeRobot:
# Install LeRobot with the SmolVLA extras.
# See https://github.com/huggingface/lerobot?tab=readme-ov-file#installation for more details.
git clone https://github.com/huggingface/lerobot.git
cd lerobot
pip install -e ".[smolvla]"

# Launch finetuning on your dataset
python lerobot/scripts/train.py \
  --policy.path=arrow-hf/smolvla-robotwin-place-object-basket-50ep-multi \
  --dataset.repo_id=lerobot/svla_so101_pickplace \
  --batch_size=64 \
  --steps=20000 \
  --output_dir=outputs/train/my_smolvla \
  --job_name=my_smolvla_training \
  --policy.device=cuda \
  --wandb.enable=true

# Run the policy using the record function.
# Replace the port, robot id, and cameras with your own.
# --dataset.single_task: use the same task description you used in your dataset recording.
# --dataset.repo_id: this will be the dataset name on the HF Hub.
python -m lerobot.record \
  --robot.type=so101_follower \
  --robot.port=/dev/ttyACM0 \
  --robot.id=my_blue_follower_arm \
  --robot.cameras="{ front: {type: opencv, index_or_path: 8, width: 640, height: 480, fps: 30}}" \
  --dataset.single_task="Grasp a lego block and put it in the bin." \
  --dataset.repo_id=HF_USER/dataset_name \
  --dataset.episode_time_s=50 \
  --dataset.num_episodes=10 \
  --policy.path=arrow-hf/smolvla-robotwin-place-object-basket-50ep-multi
SmolVLA RoboTwin place_object_basket (50 ep, MULTI instruction)
SmolVLA policy fine-tuned on 50 demonstration episodes of the place_object_basket task from RoboTwin 2.0 (demo_clean config), built on the SmolVLA-RoboTwin pretrained base (lerobot/smolvla_robotwin).
See also the single-instruction counterpart: arrow-hf/smolvla-robotwin-place-object-basket-50ep
Task & Training
- Robot: Agilex dual-arm, end-effector control (16D state, 16D action)
- Cameras: 3 RGB streams (240×320, D435)
- Instruction mode: per-episode random instruction from RoboTwin's 100 variations (seed=42)
- Training: batch size 32, 6000 steps (~10-25 epochs), AdamW with lr=1e-4, cosine schedule with 300 warmup steps and 6000 decay steps
- Chunk size: 50
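The learning-rate schedule listed above can be sketched as follows. This is a minimal illustration, assuming linear warmup to the peak rate followed by cosine decay; the function name `lr_at` and the `min_lr` floor are hypothetical and not taken from the training code, so the actual LeRobot scheduler may differ in detail (for example, in the final learning rate).

```python
import math


def lr_at(step, peak_lr=1e-4, warmup_steps=300, decay_steps=6000, min_lr=0.0):
    """Learning rate at a given step: linear warmup, then cosine decay.

    peak_lr, warmup_steps and decay_steps come from the model card;
    min_lr is an assumed floor (hypothetical, defaults to 0).
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr over the warmup phase.
        return peak_lr * step / warmup_steps
    # Fraction of the cosine phase completed, clamped to [0, 1].
    progress = min(1.0, (step - warmup_steps) / (decay_steps - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (peak_lr - min_lr) * cosine


if __name__ == "__main__":
    for s in (0, 150, 300, 3150, 6000):
        print(s, lr_at(s))
```

Under these assumptions the rate peaks at step 300 and reaches the floor exactly at the last training step.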
Evaluation
RoboTwin 2.0 sim (demo_clean), 10 episodes, max_steps=400, action_chunk_exec=50, eval instruction "place the object in the basket".
Success rate: 5/10 (50%)
Surprising finding: this is the only task in our 8-task benchmark where multi-instruction training outperforms single-instruction training (50% vs. 30%, +20 pp). A likely explanation is that the scene contains multiple distractor objects: the multi-instruction model is forced to learn instruction-grounded visual recognition, whereas the single-instruction model, trained only on the generic "place the object in the basket", cannot learn that grounding.
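To put the 10-episode numbers in context, the sketch below computes a 95% Wilson score interval for each success rate. It assumes the single-instruction result of 30% corresponds to 3/10 under the same 10-episode protocol; that count is inferred, not stated in this card. The helper name `wilson_interval` is illustrative.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 ~ 95%)."""
    p = successes / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half, center + half


if __name__ == "__main__":
    # 5/10 is reported above; 3/10 is an assumption derived from "30%".
    for name, k in (("multi", 5), ("single (assumed 3/10)", 3)):
        lo, hi = wilson_interval(k, 10)
        print(f"{name}: {k}/10, 95% interval [{lo:.2f}, {hi:.2f}]")
```

With only 10 episodes per model the two intervals overlap substantially, so the +20 pp gap is best read as suggestive.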
Usage
from lerobot.policies.smolvla import SmolVLAPolicy
policy = SmolVLAPolicy.from_pretrained("arrow-hf/smolvla-robotwin-place-object-basket-50ep-multi")
At inference, use action_chunk_exec=50 (full chunk).
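What "full chunk" execution means can be sketched with a stand-in policy and environment. The names `rollout`, `dummy_predict_chunk`, and `dummy_step_env` are hypothetical placeholders rather than LeRobot or RoboTwin APIs; only the constants (chunk size 50, action_chunk_exec=50, max_steps=400, 16D actions) come from this card.

```python
CHUNK_SIZE = 50         # actions predicted per policy query
ACTION_CHUNK_EXEC = 50  # actions executed before re-querying (full chunk)
MAX_STEPS = 400         # evaluation horizon
ACTION_DIM = 16         # 16D action


def dummy_predict_chunk(obs):
    """Placeholder for the policy: returns CHUNK_SIZE zero actions."""
    return [[0.0] * ACTION_DIM for _ in range(CHUNK_SIZE)]


def dummy_step_env(action):
    """Placeholder for the simulator step: returns the next observation."""
    return None


def rollout(predict_chunk, step_env, max_steps=MAX_STEPS, n_exec=ACTION_CHUNK_EXEC):
    """Run one episode open-loop within each chunk; return (queries, steps)."""
    obs, queries, steps = None, 0, 0
    while steps < max_steps:
        chunk = predict_chunk(obs)
        queries += 1
        # Execute the first n_exec actions before asking the policy again.
        for action in chunk[:n_exec]:
            if steps >= max_steps:
                break
            obs = step_env(action)
            steps += 1
    return queries, steps


if __name__ == "__main__":
    print(rollout(dummy_predict_chunk, dummy_step_env))
```

With these settings the policy is queried once every 50 environment steps, so a 400-step episode needs 8 forward passes; a smaller `n_exec` would re-plan more often at higher compute cost.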