Improve model card metadata: add pipeline tag, license, and paper link

#1
by nielsr (HF staff) - opened
Files changed (1)
  1. README.md +4 -3
README.md CHANGED
```diff
@@ -1,6 +1,8 @@
 ---
 library_name: transformers
 tags: []
+pipeline_tag: robotics
+license: apache-2.0
 ---
 
 # Poseless-3B
@@ -9,7 +11,7 @@ tags: []
 
 ## Introduction
 
-**"PoseLess: Depth-Free Vision-to-Joint Control via Direct Image Mapping with VLM"** introduces a novel framework for robot hand control that eliminates the need for explicit pose estimation by directly mapping 2D images to joint angles using projected representations. Our approach leverages synthetic training data generated through randomized joint configurations, enabling zero-shot generalization to real-world scenarios and cross-morphology transfer from robotic to human hands. By projecting visual inputs and employing a transformer-based decoder, PoseLess achieves robust, low-latency control while addressing challenges such as depth ambiguity and data scarcity. Experimental results demonstrate competitive performance in joint angle prediction accuracy without relying on any human-labelled dataset
+**"PoseLess: Depth-Free Vision-to-Joint Control via Direct Image Mapping with VLM"** ([Paper](https://huggingface.co/papers/2503.07111)) introduces a novel framework for robot hand control that eliminates the need for explicit pose estimation by directly mapping 2D images to joint angles using projected representations. Our approach leverages synthetic training data generated through randomized joint configurations, enabling zero-shot generalization to real-world scenarios and cross-morphology transfer from robotic to human hands. By projecting visual inputs and employing a transformer-based decoder, PoseLess achieves robust, low-latency control while addressing challenges such as depth ambiguity and data scarcity. Experimental results demonstrate competitive performance in joint angle prediction accuracy without relying on any human-labelled dataset.
 
 Our key contributions are as follows:
 
@@ -100,5 +102,4 @@ The output will be joint angles in radians in XML format:
 - arxiv.org/abs/2503.07111
 
 ## More Information
-* Contact the authors at alan@menlo.ai, bach@menlo.ai, charles@menlo.ai, yuuki@menlo.ai for further details.
-
+* Contact the authors at alan@menlo.ai, bach@menlo.ai, charles@menlo.ai, yuuki@menlo.ai for further details.
```
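
For reviewers: once merged, the new `pipeline_tag` and `license` front-matter fields are exposed through the `huggingface_hub` API. Below is a minimal sketch of how to verify them; the repo id `Menlo/Poseless-3B` is an assumption, so substitute the actual one.

```python
# Sketch: verify the metadata added in this PR via the Hub API.
# NOTE: the repo id below is an assumption, not confirmed by this PR.
from huggingface_hub import HfApi

api = HfApi()
info = api.model_info("Menlo/Poseless-3B")  # assumed repo id

# pipeline_tag is read from the YAML front matter added in this PR
print(info.pipeline_tag)       # expected: "robotics"

# license is exposed via the parsed model card data
print(info.card_data.license)  # expected: "apache-2.0"
```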
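
The card's usage section (context line in the last hunk) says the model emits joint angles in radians as XML. As a rough illustration of consuming that output, here is a hypothetical parsing sketch; the `<joint name=... angle=...>` element and attribute names are assumptions, since the diff does not show the actual schema.

```python
# Hypothetical sketch: parse XML joint-angle output ("joint angles in radians
# in XML format"). The tag/attribute names below are assumed, not documented.
import xml.etree.ElementTree as ET

sample_output = """
<joints>
  <joint name="index_mcp" angle="0.42"/>
  <joint name="index_pip" angle="1.05"/>
</joints>
"""

root = ET.fromstring(sample_output)
# Map each joint name to its angle in radians
angles = {j.get("name"): float(j.get("angle")) for j in root.findall("joint")}
print(angles)  # {'index_mcp': 0.42, 'index_pip': 1.05}
```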