Update README.md
README.md
CHANGED
@@ -5,11 +5,15 @@ tags:
 - vision
 ---
 
-# PTA-1
+# PTA-1: Controlling Computers with Small Models
 
-
+PTA (Prompt-to-Action) is a vision language model for computer use applications based on Florence-2.
+With less than 300M parameters it beats larger models in GUI text and element localization.
+This allows low latency computer automations with local execution.
 
+**Model Input:** Screenshot + description_of_target_element
 
+**Model Output:** BoundingBox for Target Element
 
 ## Model Details
 
@@ -201,15 +205,3 @@ print(parsed_answer)
 <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
 
 [More Information Needed]
-
-## More Information [optional]
-
-[More Information Needed]
-
-## Model Card Authors [optional]
-
-[More Information Needed]
-
-## Model Card Contact
-
-[More Information Needed]
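
The added README lines describe the model's contract as screenshot plus target description in, bounding box out. As a rough illustration of how a computer-use automation might consume that output, here is a minimal sketch that turns a bounding box into a click point. It assumes the box arrives as pixel coordinates in `(x1, y1, x2, y2)` order, which the diff does not state; `BoundingBox` and `click_point` are hypothetical names, not part of PTA-1 or Florence-2.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical container; assumes pixel coordinates (x1, y1, x2, y2)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def center(self) -> Tuple[int, int]:
        # Midpoint of the box, rounded to whole pixels.
        return (round((self.x1 + self.x2) / 2), round((self.y1 + self.y2) / 2))


def click_point(box: BoundingBox, screen_width: int, screen_height: int) -> Tuple[int, int]:
    """Return the box center, clamped so it always lies on the screen."""
    cx, cy = box.center()
    clamped_x = min(max(cx, 0), screen_width - 1)
    clamped_y = min(max(cy, 0), screen_height - 1)
    return (clamped_x, clamped_y)


if __name__ == "__main__":
    # Example box for a target element on a 1920x1080 screenshot.
    print(click_point(BoundingBox(100, 200, 300, 260), 1920, 1080))  # (200, 230)
```

Clicking the center is only one possible policy; clamping guards against boxes that extend slightly past the screenshot edge.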