xinyu1205 committed on
Commit
a4f945f
1 Parent(s): c457cc4

Update README.md

Files changed (1)
  1. README.md +2 -7
README.md CHANGED
@@ -14,8 +14,7 @@ Model card for <a href="https://recognize-anything.github.io/">Recognize Anythin
  **Recognition and localization are two foundation computer vision tasks.**
  - **The Segment Anything Model (SAM)** excels in **localization capabilities**, while it falls short when it comes to **recognition tasks**.
  - **The Recognize Anything Model (RAM) and Tag2Text** exhibits **exceptional recognition abilities**, in terms of **both accuracy and scope**.
-
-
+ -
  | ![RAM.jpg](https://github.com/xinyu1205/Tag2Text/raw/main/images/localization_and_recognition.jpg) |
  |:--:|
  | <b> Pull figure from recognize-anything official repo | Image source: https://recognize-anything.github.io/ </b>|
@@ -38,14 +37,10 @@ Authors from the [paper](https://arxiv.org/abs/2306.03514) write in the abstract
  }
 
  @article{huang2023tag2text,
+
  title={Tag2Text: Guiding Vision-Language Model via Image Tagging},
  author={Huang, Xinyu and Zhang, Youcai and Ma, Jinyu and Tian, Weiwei and Feng, Rui and Zhang, Yuejie and Li, Yaqian and Guo, Yandong and Zhang, Lei},
  journal={arXiv preprint arXiv:2303.05657},
  year={2023}
  }
  ```
-
-
-
-
-