VIM-Bench/v-mllm-7b

FunCube committed on Jun 12

Commit 4150ee9 • 1 Parent(s): 579585f

Update README.md

Files changed (1)

README.md +8 -0

@@ -34,6 +34,14 @@ The primary intended users of the model are researchers in computer vision, natu
 Please kindly cite our paper if you find our resources useful:
 ```
 @misc{lu2023vim,
       title={VIM: Probing Multimodal Large Language Models for Visual Embedded Instruction Following},
       author={Yujie Lu and Xiujun Li and William Yang Wang and Yejin Choi},

 Please kindly cite our paper if you find our resources useful:
 ```
+@misc{li2024text,
+      title={Text as Images: Can Multimodal Large Language Models Follow Printed Instructions in Pixels?},
+      author={Xiujun Li and Yujie Lu and Zhe Gan and Jianfeng Gao and William Yang Wang and Yejin Choi},
+      year={2024},
+      eprint={2311.17647},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV}
+}
 @misc{lu2023vim,
       title={VIM: Probing Multimodal Large Language Models for Visual Embedded Instruction Following},
       author={Yujie Lu and Xiujun Li and William Yang Wang and Yejin Choi},