ACT Dual-Arm Vanilla (Baseline)

This model is a baseline Action Chunking with Transformers (ACT) policy trained for cooperative dual-arm manipulation in PyBullet. It was trained without domain randomization to establish a performance baseline on a single object (cracker_box).

Model Details

  • Architecture: Action Chunking with Transformers (ACT) with a ResNet18 vision backbone.
  • Task: Dual-arm cooperative lifting and placing of an object.
  • Action Space: 12-D absolute joint angles in radians (6 degrees of freedom per arm, gripper state fixed).
  • Vision: 3 camera streams (overhead, left wrist, right wrist) at 224x224 resolution.
  • Training Data: 149 expert demonstrations collected at 20 FPS.
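The input and output layout listed above can be sketched as a small shape contract. This is an illustrative sketch only: the camera key names, the channel-first image layout, and the left-arm-first ordering of the 12-D action vector are assumptions, not details confirmed by this model card.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical camera keys; the card names the three views but not their identifiers.
CAMERA_NAMES = ("overhead", "left_wrist", "right_wrist")
IMAGE_HEIGHT = 224
IMAGE_WIDTH = 224
DOF_PER_ARM = 6
ACTION_DIM = 2 * DOF_PER_ARM  # 12-D absolute joint angles in radians


def expected_image_shapes() -> Dict[str, Tuple[int, int, int]]:
    """Return the assumed (channels, height, width) shape for each camera stream."""
    return {name: (3, IMAGE_HEIGHT, IMAGE_WIDTH) for name in CAMERA_NAMES}


def split_action(action: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Split a 12-D action into two 6-D joint targets.

    Assumes the first six values belong to the left arm and the last six
    to the right arm; the real ordering may differ.
    """
    if len(action) != ACTION_DIM:
        raise ValueError(f"expected {ACTION_DIM} joint angles, got {len(action)}")
    values = [float(a) for a in action]
    return values[:DOF_PER_ARM], values[DOF_PER_ARM:]
```

A caller would validate each policy output with `split_action` before sending the two joint targets to the arm controllers. Because the gripper state is fixed, no gripper command appears in the action vector.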

Performance

The model achieved a 95% success rate and a 100% grip rate on the in-distribution evaluation suite. Its chunk execution is highly deterministic, and it shows lower positional jitter than diffusion-based alternatives.
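Chunk execution can be illustrated with a minimal open-loop executor: the policy is queried once, its predicted actions are played back one per control step, and the policy is queried again only when the queue is empty. This is a sketch under stated assumptions, not this model's actual inference loop. The `ChunkExecutor` class, the `dummy_policy` stand-in, the chunk length of 4, and a 20 Hz control rate matching the 20 FPS data rate are all hypothetical. ACT implementations may instead use temporal ensembling across overlapping chunks.

```python
from collections import deque
from typing import Callable, Deque, List, Sequence

CONTROL_HZ = 20                   # assumed to match the 20 FPS demonstration rate
STEP_SECONDS = 1.0 / CONTROL_HZ   # nominal time between executed actions

Action = List[float]


class ChunkExecutor:
    """Plays back predicted action chunks open-loop, one action per control step."""

    def __init__(self, predict_chunk: Callable[[object], Sequence[Sequence[float]]]):
        self._predict_chunk = predict_chunk
        self._queue: Deque[Action] = deque()
        self.num_queries = 0  # how many times the policy has been called

    def next_action(self, observation: object) -> Action:
        # Query the policy only when the previous chunk is exhausted.
        if not self._queue:
            chunk = self._predict_chunk(observation)
            if len(chunk) == 0:
                raise ValueError("policy returned an empty chunk")
            self._queue.extend([float(v) for v in action] for action in chunk)
            self.num_queries += 1
        return self._queue.popleft()


def dummy_policy(observation: object) -> List[Action]:
    """Hypothetical stand-in for the policy: a chunk of 4 constant 12-D actions."""
    return [[0.0] * 12 for _ in range(4)]
```

With a chunk length of 4, ten control steps trigger three policy queries. Between queries the executed actions depend only on the most recent prediction, which is one reason chunked playback tends to look smooth.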

Checkpoint: Safetensors format, 51.6M parameters, F32 tensors.