---
license: apache-2.0
---

This repo deploys a pix2struct model to [Inference Endpoints](https://ui.endpoints.huggingface.co/) on a single GPU. It is not meant for multiple GPUs or CPU; single-GPU replicas are fine, but avoid the 4xT4 option.

Options:
- modify the [model_name](https://huggingface.co/nbroad/p2s-infographic-lg-endpt/blob/main/handler.py#L19) to whichever model you want to use
  - if it is a private model, add handler.py to that model repo and change `model_name` to `"./"`
- modify the [dtype](https://huggingface.co/nbroad/p2s-infographic-lg-endpt/blob/main/handler.py#L27) to whichever one you want
  - [see notes here](https://huggingface.co/nbroad/p2s-infographic-lg-endpt/blob/main/handler.py#L21-L26) for the dtype tradeoffs; a sketch of the relevant handler pieces follows this list
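For orientation, here is a minimal sketch of how such a handler can look. It follows the Inference Endpoints `EndpointHandler` convention, but it is not a copy of this repo's handler.py: the default `model_name`, the fp16 choice, and the decoding details are illustrative assumptions, so treat the linked handler.py as the source of truth.

```python
import base64
from io import BytesIO

import torch
from PIL import Image
from transformers import Pix2StructForConditionalGeneration, Pix2StructProcessor

# illustrative default; the real handler.py pins its own model_name at the linked line
model_name = "google/pix2struct-infographics-vqa-large"


class EndpointHandler:
    def __init__(self, path=""):
        # fp16 halves GPU memory versus fp32; see the notes linked above for the tradeoffs
        self.processor = Pix2StructProcessor.from_pretrained(model_name)
        self.model = Pix2StructForConditionalGeneration.from_pretrained(
            model_name, torch_dtype=torch.float16
        ).to("cuda")
        self.model.eval()

    def __call__(self, data):
        inputs = data["inputs"]
        params = data.get("parameters", {})
        # images arrive base64-encoded, paired index-wise with questions
        images = [Image.open(BytesIO(base64.b64decode(b))) for b in inputs["image"]]
        enc = self.processor(images=images, text=inputs["question"], return_tensors="pt")
        enc = {k: v.to("cuda") for k, v in enc.items()}
        enc["flattened_patches"] = enc["flattened_patches"].to(self.model.dtype)
        with torch.inference_mode():
            out = self.model.generate(**enc, **params)
        return {"output": self.processor.batch_decode(out, skip_special_tokens=True)}
```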

After deploying the model, inference can be done as follows:

```python
import base64

import requests

# read the image and base64-encode it so it can be sent as JSON
with open("path/to/image", "rb") as f:
    b64 = base64.b64encode(f.read())

question = "question to model"

payload = {
    "inputs": {
        "image": [b64.decode("utf-8")],  # for batched inference, send a list of images/questions
        "question": [question],
    },
    "parameters": {
        "max_new_tokens": 10,  # any generation parameters can be used
    },
}

API_URL = "url_to_endpoint"
headers = {
    "Authorization": "Bearer HF_TOKEN",  # replace HF_TOKEN with your token
    "Content-Type": "application/json",
}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query(payload)
# {'output': ['55%']}
```
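
Because the handler takes lists, batched inference is just a longer payload. A quick sketch reusing `query` and the `base64` import from the snippet above (the file paths and questions are placeholders):

```python
# one base64-encoded image per question, paired by index
images = []
for path in ["chart1.png", "chart2.png"]:  # placeholder paths
    with open(path, "rb") as f:
        images.append(base64.b64encode(f.read()).decode("utf-8"))

batched_payload = {
    "inputs": {
        "image": images,
        "question": ["What percentage is shown?", "Which year is highlighted?"],
    },
    "parameters": {"max_new_tokens": 10},
}

output = query(batched_payload)
# returns one answer per question: {'output': [..., ...]}
```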