ACT - ALOHA Single-Arm (Left) - Mask REMOVAL via Reversed Data - 40k steps
Action Chunking Transformer (ACT) policy for mask removal, trained on a synthetic dataset derived by time-reversing the placement dataset. Each placement episode, played in reverse, becomes a removal episode (gripper opens → closes, mask on face → in arm).
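The reversal idea can be sketched in a few lines. This card does not spell out how the "1-step action shift" listed in the training config is aligned, so the pairing below (reversed frame k paired with the forward action recorded one step earlier) is only one possible reading, and `reverse_episode` is a hypothetical helper, not code from the project.

```python
def reverse_episode(states, actions, shift=1):
    """Time-reverse one episode and re-align actions by `shift` steps.

    Assumption (not confirmed by the model card): reversed frame k is
    paired with the forward action at index n - 1 - k - shift, and the
    last `shift` reversed frames are dropped because no action remains
    for them.
    """
    if len(states) != len(actions):
        raise ValueError("states and actions must have the same length")
    n = len(states)
    if not 0 <= shift < n:
        raise ValueError("shift must satisfy 0 <= shift < episode length")

    rev_states = list(reversed(states))
    rev_actions = list(reversed(actions))
    # Drop the trailing frames that have no shifted action partner.
    return rev_states[: n - shift], rev_actions[shift:]


# Symbolic episode: labels stand in for 9-dim state/action vectors.
states = ["s0", "s1", "s2", "s3"]
actions = ["a0", "a1", "a2", "a3"]
rev_states, rev_actions = reverse_episode(states, actions)
print(list(zip(rev_states, rev_actions)))
```

With `shift=0` the function reduces to a plain reversal of both sequences, which makes it easy to compare alignments on a single episode before regenerating a whole dataset.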
This is the 40k-step retrain (S006), matching S003's step count for direct architectural comparison vs the shipped placement baseline. The 13.4k baseline lives at JHeisler/aloha_solo_left_act_removal_reversed_13k.
Training Config
| Field | Value |
|---|---|
| Architecture | ACT (ResNet18 backbone + 4-layer Transformer encoder + VAE chunking head) |
| Dataset | JHeisler/aloha_solo_left_4_6_26_reversed: 50 episodes, 29,735 samples, 30 fps, time-reversed with 1-step action shift |
| State / action dim | 9 / 9 |
| Cameras | cam_high, cam_left_wrist (3×480×640 each) |
| Steps | 40,000 |
| Batch size | 48 |
| Learning rate | 6e-5 (linear warmup 500 → cosine) |
| Total samples seen | 1,920,000 (40,000 steps × batch size 48) |
| AMP | enabled |
| torch.compile | enabled |
| Save freq | every 10,000 steps (10k / 20k / 30k / 40k checkpoints) |
| Final loss | 0.016–0.020 |
| Final grad norm | 0.23–0.32 |
| Wall clock | ~6h 10min on RTX A4500 (matches placement S003's ~6h 7min) |
| LeRobot pin | 96c7052777aca85d4e55dfba8f81586103ba8f61 |
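The training budget and learning-rate schedule in the table can be checked with a short sketch. It assumes no gradient accumulation, a warmup that ramps linearly to the peak rate over the first 500 steps, and a cosine decay that reaches zero at step 40,000; the card confirms only "linear warmup 500 → cosine", so the end value and the exact warmup formula are assumptions, and `lr_at` is a hypothetical helper rather than the LeRobot scheduler.

```python
import math

STEPS = 40_000
BATCH_SIZE = 48
DATASET_SAMPLES = 29_735
PEAK_LR = 6e-5
WARMUP_STEPS = 500


def lr_at(step, peak_lr=PEAK_LR, warmup_steps=WARMUP_STEPS,
          total_steps=STEPS, min_lr=0.0):
    """Learning rate at `step` under linear warmup then cosine decay.

    Assumption: decay ends at `min_lr` (0.0 here) exactly at `total_steps`.
    """
    if step < warmup_steps:
        # Linear ramp that hits peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Samples drawn during training, assuming one batch per optimizer step.
total_samples = STEPS * BATCH_SIZE
# Approximate number of passes over the 29,735-sample dataset.
epochs = total_samples / DATASET_SAMPLES

print(f"total samples seen: {total_samples:,}")
print(f"approx. epochs:     {epochs:.2f}")
for step in (0, 499, 20_250, 40_000):
    print(f"lr at step {step:>6}: {lr_at(step):.3e}")
```

Under these assumptions the run passes over the dataset roughly 65 times, which is useful context when reading the final loss values.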
Project Lineage
| Workstream | Task | Steps | Final loss | HF |
|---|---|---|---|---|
| S001 | placement | 13,400 | 0.029 | act_left |
| S005 | removal (reversed) | 13,400 | 0.035 | act_removal_reversed_13k |
| S003 | placement (shipped) | 40,000 | 0.015 | act_left_40k |
| S006 | removal (reversed) | 40,000 | 0.018 | this repo |
S003 vs S006 is the direct architectural comparison: same architecture, same step count, placement dataset vs reversed-placement dataset. Final losses differ by only 0.003 (0.015 vs 0.018), suggesting the reversed-data policy converges to a quality similar to the forward-data policy on the per-timestep imitation objective. A real verdict requires an offline action-L1 evaluation on held-out data or a robot rollout.
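The offline action-L1 evaluation mentioned above amounts to a mean absolute error between predicted and recorded actions on held-out frames. The sketch below shows only the metric; `action_l1` is a hypothetical helper, the toy numbers are illustrative rather than measured, and obtaining the predictions from the policy is left out.

```python
def action_l1(predicted, target):
    """Return (overall mean L1, per-dimension mean L1).

    `predicted` and `target` are equal-length sequences of action
    vectors (9 values each for this policy; 2 in the toy example).
    """
    if not predicted or len(predicted) != len(target):
        raise ValueError("need two non-empty sequences of equal length")
    dim = len(target[0])
    per_dim = [0.0] * dim
    for pred_vec, tgt_vec in zip(predicted, target):
        if len(pred_vec) != dim or len(tgt_vec) != dim:
            raise ValueError("inconsistent action dimension")
        for j in range(dim):
            per_dim[j] += abs(pred_vec[j] - tgt_vec[j])
    n = len(predicted)
    per_dim = [total / n for total in per_dim]
    return sum(per_dim) / dim, per_dim


# Toy data, not real results: two frames, two action dimensions.
predicted = [[0.0, 1.0], [2.0, 3.0]]
target = [[0.5, 1.0], [2.0, 2.0]]
overall, per_dim = action_l1(predicted, target)
print(overall, per_dim)
```

Reporting the per-dimension values alongside the overall mean helps show whether an error is concentrated in the gripper or spread across the arm joints.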
Caveats
- Synthetic data. Trained on time-reversed placement, not native removal. A policy trained on real removal data will likely outperform.
- Visual transitions are physically backwards (mask materializes on face). This doesn't affect ACT's per-timestep predictions, since n_obs_steps=1 and the policy receives no temporal context as input.
- Use as a lower-bound baseline until native removal data is available.
Usage
from lerobot.common.policies.act.modeling_act import ACTPolicy
policy = ACTPolicy.from_pretrained("JHeisler/aloha_solo_left_act_removal_reversed_40k")
Citation / Course
EN.525.681 school project, JHU Whiting School of Engineering. Team: Jake Heisler, Laura Kroening, Purushottam Shukla.
Code reference: HuggingFace LeRobot at commit 96c7052.