ACT Cutlery Arrangement: chunk_size=50 (act_local_v2)

Training context

  • Trained: Mon May 25 01:10:00 PM PST 2026
  • Dataset: AI-Final/AI-aiCapstoneData-lerobot-cutlery-v2 (v2)
    • 200 scenes with equitable spatial distribution (vs 76 in v1)
    • 127 successful demos at datagen time (~64% success rate)
  • State machine: corrected (fork → +x of plate, knife → -x). The previous v1 dataset (SchindleriaPraematurus/aiCapstoneData-lerobot-cutlery) was generated with a sign-inverted success criterion, so policies trained on it learned the opposite of what the leaderboard tests.
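
The corrected placement rule can be sketched as a predicate on x coordinates. This is a minimal illustration, not the dataset's actual success check: the function names are hypothetical, the v1 variant assumes the inversion was a simple mirror, and any tolerance, y-axis, or orientation conditions are left out.

```python
def cutlery_arranged(fork_x: float, knife_x: float, plate_x: float) -> bool:
    """Corrected rule: fork on the +x side of the plate, knife on the -x side."""
    return fork_x > plate_x and knife_x < plate_x


def cutlery_arranged_v1(fork_x: float, knife_x: float, plate_x: float) -> bool:
    """Assumed form of the sign-inverted v1 rule, shown for contrast only."""
    return fork_x < plate_x and knife_x > plate_x


# A scene laid out the way the leaderboard expects (illustrative numbers):
scene = dict(fork_x=0.10, knife_x=-0.10, plate_x=0.0)
passes_corrected = cutlery_arranged(**scene)
passes_v1 = cutlery_arranged_v1(**scene)
```

Under these assumptions the same scene passes the corrected rule and fails the v1 rule, which is why demonstrations filtered by the v1 rule teach the mirrored layout.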

Architecture & hyperparameters

Policy          ACT (Action Chunking Transformer)
Steps           80,000
Batch size      8
chunk_size      50
Save frequency  every 20k steps
Hardware        NVIDIA GeForce RTX 3060 (12 GB)
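
As a quick sanity check on the numbers above, the following sketch derives the number of training samples drawn and the checkpoint steps. The variable names are illustrative and are not lerobot configuration keys.

```python
steps = 80_000        # total optimizer steps
batch_size = 8        # samples per step
save_every = 20_000   # checkpoint interval in steps

# Samples drawn over the whole run (with repetition across epochs).
samples_seen = steps * batch_size

# Steps at which a checkpoint is written, assuming one at each multiple
# of save_every up to and including the final step.
checkpoints = list(range(save_every, steps + 1, save_every))
```

Under that assumption the run draws 640,000 samples and produces four checkpoints, at 20k, 40k, 60k, and 80k steps.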

Experiment design: chunk_size A/B

Two policies were trained as a clean A/B test. The leaderboard eval calls the policy with --policy_action_horizon=1 (re-plan every step), so we want to know whether shorter or longer training chunks produce better single-step actions.
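
To make "re-plan every step" concrete, here is a minimal sketch of a rollout loop that executes only the first `action_horizon` actions of each predicted chunk. It is not the leaderboard's eval loop or lerobot's internal action queue; `rollout`, `predict_chunk`, `step_env`, and the dummy policy and environment are hypothetical stand-ins.

```python
from typing import Callable, List, Sequence

Action = Sequence[float]
Obs = Sequence[float]


def rollout(predict_chunk: Callable[[Obs], List[Action]],
            step_env: Callable[[Action], Obs],
            obs: Obs,
            num_steps: int,
            action_horizon: int = 1) -> int:
    """Run num_steps environment steps and return the number of policy calls."""
    policy_calls = 0
    queue: List[Action] = []
    for _ in range(num_steps):
        if not queue:
            # The policy always predicts a full chunk (e.g. 50 actions) ...
            chunk = predict_chunk(obs)
            policy_calls += 1
            # ... but only the first action_horizon actions are executed.
            queue = list(chunk[:action_horizon])
        obs = step_env(queue.pop(0))
    return policy_calls


def dummy_policy(obs: Obs) -> List[Action]:
    return [[0.0] for _ in range(50)]   # stands in for a chunk_size=50 policy


def dummy_env(action: Action) -> Obs:
    return [0.0]


calls_replan_every_step = rollout(dummy_policy, dummy_env, [0.0], 100, action_horizon=1)
calls_full_chunk = rollout(dummy_policy, dummy_env, [0.0], 100, action_horizon=50)
```

With a horizon of 1, 100 steps cost 100 policy calls and actions 2 through 50 of every chunk are discarded; with a horizon of 50 the same 100 steps cost 2 calls. Only the first predicted action is ever executed at eval time, so the A/B test asks which chunk_size makes that first action best.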

Sibling repo                       chunk_size
AI-Final/cutlery-act-v2-chunk100   100 (lerobot default)
AI-Final/cutlery-act-v2-chunk50    50 (shorter chunks)

Eval

7-episode rollout on eval/cutlery_arrangement_eval.py (the task file that mirrors the leaderboard).

  • Result line: [Evaluation] Final success rate: 0.000 [0/7]
  • Video: eval_rollout_7_episodes.mp4 (side-by-side wrist + front cameras)
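
When comparing the two sibling runs, it can help to pull the numbers out of the result line programmatically. The sketch below assumes every result line has exactly the format of the single line shown above; the regex and the `parse_result` helper are illustrative and not part of the eval script.

```python
import re

# Matches e.g. "[Evaluation] Final success rate: 0.000 [0/7]"
RESULT_RE = re.compile(
    r"\[Evaluation\] Final success rate: (\d+\.\d+) \[(\d+)/(\d+)\]"
)


def parse_result(line: str) -> tuple:
    """Return (rate, successes, episodes) parsed from a result line."""
    match = RESULT_RE.search(line)
    if match is None:
        raise ValueError(f"not a result line: {line!r}")
    rate = float(match.group(1))
    successes = int(match.group(2))
    episodes = int(match.group(3))
    return rate, successes, episodes


rate, successes, episodes = parse_result(
    "[Evaluation] Final success rate: 0.000 [0/7]"
)
```

For the line reported here this yields a rate of 0.0 with 0 successes out of 7 episodes.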

Known ceiling

Even at perfect imitation, the underlying state machine has:

  • ~33% baseline failure on non-tight-pair scenes (fixed-phase timing / IK)
  • 25 s worst-case completion vs 20 s leaderboard cap
  • 9 cm release height, so the cutlery bounces

A successful run here therefore demonstrates faithful imitation of the state machine (FSM); it does not break past the FSM's intrinsic limitations.
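
The figures in the list above translate into a rough upper bound. This sketch treats the approximate 33% failure rate as exact and looks at the timing gap in isolation; it is back-of-the-envelope arithmetic, not a measured result, and the variable names are illustrative.

```python
baseline_failure = 0.33   # approximate FSM failure rate on non-tight-pair scenes
worst_case_s = 25.0       # worst-case FSM completion time in seconds
cap_s = 20.0              # leaderboard time cap in seconds

# Best success rate a perfect imitator could reach on non-tight-pair scenes,
# before accounting for timeouts or bounce on release.
best_case_success = 1.0 - baseline_failure

# How far the slowest FSM trajectories overshoot the leaderboard cap.
overrun_s = worst_case_s - cap_s
```

That puts the ceiling at roughly 67% on non-tight-pair scenes, with the slowest trajectories running 5 s past the cap; timeouts and bounce on release can only lower it further.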

Model size (safetensors): 51.6M params, tensor type F32