---
license: apache-2.0
---
# PolyFormer: Referring Image Segmentation as Sequential Polygon Generation

[Project](https://polyformer.github.io/) | [GitHub](https://github.com/amazon-science/polygon-transformer) | [Demo](https://huggingface.co/spaces/koajoel/PolyFormer)

## Model description

PolyFormer is a unified framework for referring image segmentation (RIS) and referring expression comprehension (REC) that formulates both tasks as a sequence-to-sequence (seq2seq) prediction problem. For more details, please refer to our paper:

[PolyFormer: Referring Image Segmentation as Sequential Polygon Generation](https://arxiv.org/abs/2302.07387)
Jiang Liu\*, Hui Ding\*, Zhaowei Cai, Yuting Zhang, Ravi Kumar Satzoda, Vijay Mahadevan, R. Manmatha, [CVPR 2023](https://cvpr2023.thecvf.com/Conferences/2023/AcceptedPapers)

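The core idea behind the seq2seq formulation is that a segmentation mask can be represented as an ordered sequence of polygon vertices, which a decoder can then emit one coordinate at a time. The sketch below is a toy illustration of that target format only (the function names and the separator convention are our own, not PolyFormer's actual code, which regresses continuous coordinates with a transformer decoder):

```python
# Toy sketch: flattening polygons into an ordered coordinate sequence,
# the kind of target a seq2seq decoder predicts autoregressively.
# Names and the separator convention are illustrative, not PolyFormer's API.

def polygons_to_sequence(polygons, sep=(-1.0, -1.0)):
    """Flatten a list of polygons (each a list of (x, y) vertices,
    normalized to [0, 1]) into one sequence. A separator token between
    polygons lets a single sequence encode multi-part masks."""
    seq = []
    for i, poly in enumerate(polygons):
        if i > 0:
            seq.append(sep)  # marks the boundary between disconnected regions
        seq.extend(poly)
    return seq

def sequence_to_polygons(seq, sep=(-1.0, -1.0)):
    """Inverse: split the flat coordinate sequence back into polygons."""
    polygons, current = [], []
    for point in seq:
        if point == sep:
            polygons.append(current)
            current = []
        else:
            current.append(point)
    polygons.append(current)
    return polygons

# Round trip on a two-part mask:
mask = [[(0.1, 0.2), (0.4, 0.2), (0.4, 0.5)],
        [(0.6, 0.6), (0.9, 0.6), (0.9, 0.9)]]
assert sequence_to_polygons(polygons_to_sequence(mask)) == mask
```

Because both RIS (polygon vertices) and REC (bounding-box corners) reduce to predicting coordinate sequences like this, one model can serve both tasks.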
## Training data

We pre-train PolyFormer on the REC task using Visual Genome, RefCOCO, RefCOCO+, RefCOCOg, and Flickr30k-entities, and then finetune it on the joint REC + RIS task using RefCOCO, RefCOCO+, and RefCOCOg.

* PolyFormer-B: Swin-B as the visual encoder, BERT-base as the text encoder, 6 transformer encoder layers, and 6 decoder layers.
* PolyFormer-L: Swin-L as the visual encoder, BERT-base as the text encoder, 12 transformer encoder layers, and 12 decoder layers.

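The two variants above can be summarized side by side as a small lookup table (the key names here are illustrative, not the repo's actual config keys):

```python
# Illustrative summary of the two released PolyFormer variants.
# Key names are hypothetical; see the GitHub repo for the real configs.
POLYFORMER_VARIANTS = {
    "polyformer_b": {
        "visual_encoder": "Swin-B",
        "text_encoder": "BERT-base",
        "encoder_layers": 6,
        "decoder_layers": 6,
    },
    "polyformer_l": {
        "visual_encoder": "Swin-L",
        "text_encoder": "BERT-base",
        "encoder_layers": 12,
        "decoder_layers": 12,
    },
}
```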
## Citation

If you find PolyFormer useful in your research, please cite the following paper:

```bibtex
@article{liu2023polyformer,
  title={PolyFormer: Referring Image Segmentation as Sequential Polygon Generation},
  author={Liu, Jiang and Ding, Hui and Cai, Zhaowei and Zhang, Yuting and Satzoda, Ravi Kumar and Mahadevan, Vijay and Manmatha, R},
  journal={arXiv preprint arXiv:2302.07387},
  year={2023}
}
```