ACT Dual-Arm Vanilla (Baseline)

This model is a baseline Action Chunking with Transformers (ACT) policy trained for cooperative dual-arm manipulation in PyBullet. It was trained without domain randomization to establish a performance baseline on a single object (cracker_box).

Model Details

Architecture: Action Chunking with Transformers (ACT) with a ResNet18 vision backbone.
Task: Dual-arm cooperative lifting and placing of an object.
Action Space: 12-D absolute joint angles in radians (6 degrees of freedom per arm, gripper state fixed).
Vision: 3 camera streams (overhead, left wrist, right wrist) at 224x224 resolution.
Training Data: 149 expert demonstrations collected at 20 FPS.
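
The interface described above can be sketched in a few lines of Python. This is only an illustration of the shapes involved, not the model's actual loading or inference code. The camera key names, the left-arm-first ordering of the 12-D vector, and the helper names `split_action` and `control_period_s` are assumptions made for the example; the card does not specify them. It also assumes the policy is stepped at the same 20 Hz rate at which the demonstrations were recorded.

```python
from dataclasses import dataclass
from typing import List, Sequence

DOF_PER_ARM = 6                 # from the card: 6 degrees of freedom per arm
ACTION_DIM = 2 * DOF_PER_ARM    # 12-D absolute joint angles, in radians
CONTROL_HZ = 20                 # demonstrations were collected at 20 FPS
IMAGE_SIZE = (224, 224)         # per-camera resolution from the card
# Hypothetical key names for the three camera streams.
CAMERAS = ("overhead", "left_wrist", "right_wrist")


@dataclass
class DualArmCommand:
    """Joint-angle targets (radians) for each arm; the gripper state is fixed."""
    left: List[float]
    right: List[float]


def split_action(action: Sequence[float]) -> DualArmCommand:
    """Split one 12-D action into two 6-D per-arm joint targets."""
    if len(action) != ACTION_DIM:
        raise ValueError(f"expected {ACTION_DIM} joint angles, got {len(action)}")
    # Assumption: the left arm occupies the first six entries.
    return DualArmCommand(
        left=list(action[:DOF_PER_ARM]),
        right=list(action[DOF_PER_ARM:]),
    )


def control_period_s(hz: int = CONTROL_HZ) -> float:
    """Time between consecutive actions if the policy runs at the recording rate."""
    return 1.0 / hz
```

For example, `split_action([0.0] * 12)` yields two six-element joint targets, and `control_period_s()` gives the 0.05 s step implied by 20 FPS.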

Performance

The model achieved a 95% success rate and a 100% grip rate on the in-distribution evaluation suite. Its chunk execution is highly deterministic, and it shows lower positional jitter than diffusion-based alternatives.
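
To make "chunk execution" concrete, here is a minimal sketch of open-loop action chunking: the policy is queried once, the predicted chunk is played back one action per control step, and the policy is queried again only when the chunk is exhausted. The class name `ChunkExecutor` and the `predict_chunk` callable are hypothetical stand-ins for the real policy. The card does not state the chunk size or whether temporal ensembling is used, so the sketch takes whatever chunk length the callable returns.

```python
from collections import deque
from typing import Callable, Deque, List, Sequence

Action = List[float]  # one 12-D joint-angle target


class ChunkExecutor:
    """Plays back predicted action chunks one step at a time (open loop)."""

    def __init__(self, predict_chunk: Callable[[object], Sequence[Action]]):
        # predict_chunk is a hypothetical wrapper around the policy:
        # it maps an observation to a sequence of future actions.
        self._predict_chunk = predict_chunk
        self._queue: Deque[Action] = deque()
        self.num_queries = 0  # how many times the policy was called

    def step(self, observation: object) -> Action:
        """Return the next action, querying the policy only when the queue is empty."""
        if not self._queue:
            chunk = self._predict_chunk(observation)
            if len(chunk) == 0:
                raise ValueError("policy returned an empty action chunk")
            self._queue.extend(list(a) for a in chunk)
            self.num_queries += 1
        return self._queue.popleft()
```

Because the actions inside a chunk are fixed once predicted, a deterministic policy yields the same trajectory for the same observations, which is consistent with the behaviour described above.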

Safetensors

Model size: 51.6M params
Tensor type: F32
