yushihu committed
Commit 5eb3f77
Parent: 7cef707

Update README.md

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
````diff
@@ -26,13 +26,13 @@ pip install promptcap
 
 ## Captioning Pipeline
 
-
-Generate a prompt-guided caption by following:
+Please follow the prompt format, which will give the best performance.
+Generate a prompt-guided caption by following
 ```python
 import torch
 from promptcap import PromptCap
 
-model = PromptCap("vqascore/promptcap-coco-vqa") # also support OFA checkpoints. e.g. "OFA-Sys/ofa-base"
+model = PromptCap("vqascore/promptcap-coco-vqa") # also support OFA checkpoints. e.g. "OFA-Sys/ofa-large"
 
 if torch.cuda.is_available():
   model.cuda()
````
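Note: the hunk cuts off before the caption call itself. A minimal sketch of the remainder, assembled only from lines visible elsewhere in this diff (the prompt format from the OCR hunk, the `glove_boy.jpeg` example from the multiple-choice hunk, and `model.caption(...)` from the `@@ -62` hunk header):

```python
# Continues the snippet above: `model` is the PromptCap instance loaded there.
# The prompt format and model.caption(...) are taken from later hunks in this
# same diff; pairing this question with glove_boy.jpeg follows the README's
# multiple-choice example.
prompt = "please describe this image according to the given question: what piece of clothing is this boy putting on?"
image = "glove_boy.jpeg"

print(model.caption(prompt, image))
```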
````diff
@@ -47,7 +47,7 @@ To try generic captioning, just use "please describe this image according to the
 
 PromptCap also support taking OCR inputs:
 
-```
+```python
 prompt = "please describe this image according to the given question: what year was this taken?"
 image = "dvds.jpg"
 ocr = "yip AE Mht juor 02/14/2012"
````
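The OCR snippet is truncated by the hunk; its closing call is visible verbatim in the next hunk's header:

```python
# Completes the OCR captioning example above; this line appears verbatim in
# the @@ -62 hunk header below.
print(model.caption(prompt, image, ocr))
```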
````diff
@@ -62,7 +62,7 @@ print(model.caption(prompt, image, ocr))
 Different from typical VQA models, which are doing classification on VQAv2, PromptCap is open-domain and can be paired with arbitrary text-QA models.
 Here we provide a pipeline for combining PromptCap with UnifiedQA.
 
-```
+```python
 import torch
 from promptcap import PromptCap_VQA
 
````
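This hunk stops right after the imports. A sketch of how the rest of the pipeline plausibly looks: the `PromptCap_VQA` constructor arguments are assumptions (they are not shown anywhere in this diff), while the final call is visible in the `@@ -80` hunk header:

```python
# ASSUMPTION: the constructor keyword arguments are illustrative guesses; only
# the PromptCap_VQA import and the vqa(...) call are visible in this diff.
vqa_model = PromptCap_VQA(
    promptcap_model="vqascore/promptcap-coco-vqa",  # checkpoint named in the first hunk
    qa_model="allenai/unifiedqa-t5-base",           # assumed UnifiedQA checkpoint
)

if torch.cuda.is_available():
    vqa_model.cuda()

question = "what piece of clothing is this boy putting on?"
image = "glove_boy.jpeg"

# This call is visible in the @@ -80 hunk header below.
print(vqa_model.vqa(question, image))
```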
````diff
@@ -80,7 +80,7 @@ print(vqa_model.vqa(question, image))
 
 Similarly, PromptCap supports OCR inputs
 
-```
+```python
 question = "what year was this taken?"
 image = "dvds.jpg"
 ocr = "yip AE Mht juor 02/14/2012"
````
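The hunk again ends before the call. The `@@ -90` header shows it as `print(vqa_model.vqa(prompt, image, ocr=ocr))`, but this snippet defines `question`, not `prompt`, so the variable name is corrected in the completion below:

```python
# Completes the OCR VQA example above. The README's own call (visible in the
# @@ -90 hunk header) passes `prompt`, which this snippet never defines;
# `question` is used instead.
print(vqa_model.vqa(question, image, ocr=ocr))
```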
````diff
@@ -90,7 +90,7 @@ print(vqa_model.vqa(prompt, image, ocr=ocr))
 
 Because of the flexibility of Unifiedqa, PromptCap also supports multiple-choice VQA
 
-```
+```python
 question = "what piece of clothing is this boy putting on?"
 image = "glove_boy.jpeg"
 choices = ["gloves", "socks", "shoes", "coats"]
````
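The multiple-choice hunk also ends before showing how `choices` is consumed. A hypothetical completion; the method name below is an assumption, not confirmed by this diff:

```python
# ASSUMPTION: vqa_multiple_choice is a guessed method name; the diff only
# shows question, image, and choices being defined.
print(vqa_model.vqa_multiple_choice(question, image, choices))
```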
 