---
task_categories:
- robotics
datasets:
- USC-GVL/Humanoid-X
pipeline_tag: robotics
---

<div align="center">
<h1> <img src="assets/icon.png" width="50" /> UH-1 </h1>
</div>
<h5 align="center">
    <a href="https://usc-gvl.github.io/UH-1/">🌐 Homepage</a> | <a href="https://huggingface.co/datasets/USC-GVL/Humanoid-X">⛁ Dataset</a> | <a href="https://huggingface.co/USC-GVL/UH-1">πŸ€— Models</a> | <a href="https://arxiv.org/abs/2412.14172">πŸ“‘ Paper</a> | <a href="https://github.com/sihengz02/UH-1">πŸ’» Code</a>
</h5>


This repo contains the official model checkpoints for the paper "[Learning from Massive Human Videos for Universal Humanoid Pose Control](https://arxiv.org/abs/2412.14172)".
If you like our project, please give us a star ⭐ on GitHub for the latest updates.

Our model checkpoints consist of a transformer model `UH1_Transformer.pth` and an action tokenizer `UH1_Action_Tokenizer.pth`. For usage and inference instructions, please refer to the [code](https://github.com/sihengz02/UH-1).
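For orientation, below is a minimal sketch of downloading and loading the two checkpoints with `huggingface_hub` and PyTorch. This is not the official inference pipeline: it assumes the `.pth` files are standard PyTorch checkpoints, and the class name `UH1Transformer` in the comments is a hypothetical placeholder for the classes defined in the UH-1 codebase linked above.

```python
import torch
from huggingface_hub import hf_hub_download

# Download the checkpoint files from this model repo.
transformer_path = hf_hub_download(
    repo_id="USC-GVL/UH-1", filename="UH1_Transformer.pth"
)
tokenizer_path = hf_hub_download(
    repo_id="USC-GVL/UH-1", filename="UH1_Action_Tokenizer.pth"
)

device = "cuda" if torch.cuda.is_available() else "cpu"

# torch.load returns whatever object was serialized (typically a state dict).
transformer_ckpt = torch.load(transformer_path, map_location=device)
tokenizer_ckpt = torch.load(tokenizer_path, map_location=device)

# With the real classes from the UH-1 codebase, loading would look roughly like:
#   model = UH1Transformer(...)             # hypothetical class name
#   model.load_state_dict(transformer_ckpt)
#   model.eval()
```

For the actual model definitions and the end-to-end inference script, use the linked GitHub repository.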

![UH-1 teaser](assets/teaser.png)

# UH-1 Model Architecture

![UH-1 model architecture](assets/model.png)

# UH-1 Real Robot Demo Results

![UH-1 real robot demo results](assets/realbot.png)

# Citation

If you find our work helpful, please cite us:

```bibtex
@article{mao2024learning,
  title={Learning from Massive Human Videos for Universal Humanoid Pose Control},
  author={Mao, Jiageng and Zhao, Siheng and Song, Siqi and Shi, Tianheng and Ye, Junjie and Zhang, Mingtong and Geng, Haoran and Malik, Jitendra and Guizilini, Vitor and Wang, Yue},
  journal={arXiv preprint arXiv:2412.14172},
  year={2024}
}
```