yushihu committed on
Commit
44aa161
1 Parent(s): a5c6185

Update README.md

Files changed (1)
  1. README.md +8 -1
README.md CHANGED
@@ -16,6 +16,13 @@ language:
 - en
 
 ---
+This is the repo for the paper [PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3](https://arxiv.org/abs/2211.09699).
+
+We introduce PromptCap, a captioning model that can be controlled by a natural language instruction. The instruction may contain a question that the user is interested in,
+for example, "what is the boy putting on?". PromptCap also supports generic captions, using the question "what does the image describe?"
+
+PromptCap can serve as a lightweight visual plug-in for LLMs such as GPT-3 and ChatGPT. It achieves SOTA performance on COCO captioning (150 CIDEr).
+When paired with GPT-3 and conditioned on the user question, PromptCap achieves SOTA performance on knowledge-based VQA tasks (60.4% on OK-VQA and 59.6% on A-OKVQA).
 
 # QuickStart
 
@@ -112,7 +119,7 @@ print(vqa_model.vqa_multiple_choice(question, image, choices))
 ## Bibtex
 ```
 @article{hu2022promptcap,
-  title={PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3},
+  title={PromptCap: Prompt-Guided Task-Aware Image Captioning},
   author={Hu, Yushi and Hua, Hang and Yang, Zhengyuan and Shi, Weijia and Smith, Noah A and Luo, Jiebo},
   journal={arXiv preprint arXiv:2211.09699},
   year={2022}
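
As a concrete illustration of the instruction format the new README text describes, here is a minimal usage sketch. The `promptcap` package name, the `PromptCap` class, its `caption(prompt, image)` method, and the checkpoint id are assumptions meant to mirror the QuickStart section this diff touches; only `vqa_model.vqa_multiple_choice(question, image, choices)` appears verbatim in the hunk context above.

```python
# Minimal sketch, not the repo's official QuickStart: package, class, method,
# and checkpoint names below are assumptions and may differ from the README.
import torch
from promptcap import PromptCap  # assumed package/class name

model = PromptCap("tifa-benchmark/promptcap-coco-vqa")  # assumed checkpoint id
if torch.cuda.is_available():
    model.cuda()

image = "example.jpg"  # placeholder path to a local image

# Question-guided instruction: the caption focuses on what the question asks about.
prompt = "please describe this image according to the given question: what is the boy putting on?"
print(model.caption(prompt, image))

# Generic caption, using the question "what does the image describe?"
generic_prompt = "please describe this image according to the given question: what does the image describe?"
print(model.caption(generic_prompt, image))
```

The hunk header above also shows a multiple-choice VQA helper, `vqa_model.vqa_multiple_choice(question, image, choices)`, which presumably pairs the same captioner with a downstream QA model.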