---
license: apache-2.0
---

This repo deploys a pix2struct model to [Inference Endpoints](https://ui.endpoints.huggingface.co/) on a single GPU. It is not meant for multi-GPU or CPU deployments (single-GPU replicas are fine; just don't use the 4xT4 option).

Options:

- set [model_name](https://huggingface.co/nbroad/p2s-infographic-lg-endpt/blob/main/handler.py#L19) to whichever model you want to use
- if it is a private model, just add handler.py to that model repo and change `model_name` to `"./"`
- modify the [dtype](https://huggingface.co/nbroad/p2s-infographic-lg-endpt/blob/main/handler.py#L27) to whichever one you want
- [see notes here](https://huggingface.co/nbroad/p2s-infographic-lg-endpt/blob/main/handler.py#L21-L26) for dtype tradeoffs
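
As a purely illustrative sketch of the dtype tradeoff (the helper name and threshold below are assumptions, not code from handler.py, where the dtype is simply edited in place): bfloat16 needs an Ampere-or-newer GPU (compute capability 8.0+) and has float32's range at half the memory, while float16 runs on older cards but can overflow.

```python
# Hypothetical helper illustrating the dtype tradeoff; in this repo you
# instead edit the dtype directly in handler.py.
def pick_dtype(compute_capability: tuple) -> str:
    """Return a dtype name for a CUDA compute capability like (8, 0)."""
    if compute_capability >= (8, 0):
        return "bfloat16"  # Ampere or newer: fp32 range, half the memory
    return "float16"       # older GPUs: smaller range, watch for overflow

# e.g. dtype = getattr(torch, pick_dtype(torch.cuda.get_device_capability()))
```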

After deploying the model, inference can be done as follows:

```python
import base64

import requests

API_URL = "url_to_endpoint"
headers = {
    "Authorization": "Bearer HF_TOKEN",
    "Content-Type": "application/json",
}

with open("path/to/image", "rb") as f:
    b64 = base64.b64encode(f.read())

question = "question to model"

payload = {
    "inputs": {
        "image": [b64.decode("utf-8")],  # for batched inference, send lists of images/questions
        "question": [question],
    },
    "parameters": {
        "max_new_tokens": 10,  # can use any generation parameters
    },
}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query(payload)
# {'output': ['55%']}
```
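
Since `image` and `question` are parallel lists, batched inference just means sending one entry per example. A sketch, with placeholder bytes standing in for real image files (the questions and filenames here are made up):

```python
import base64

def to_b64(image_bytes: bytes) -> str:
    """Base64-encode raw image bytes for the endpoint payload."""
    return base64.b64encode(image_bytes).decode("utf-8")

# Placeholder bytes; in practice read each image with open(path, "rb").
images = [b"image-1-bytes", b"image-2-bytes"]
questions = ["first question", "second question"]

batched_payload = {
    "inputs": {
        "image": [to_b64(b) for b in images],  # one entry per example
        "question": questions,                 # same length as "image"
    },
    "parameters": {"max_new_tokens": 10},
}
# POST batched_payload to the endpoint exactly as in the query above.
```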