xiaotianhan committed · Commit 743b1c4 · 1 Parent(s): 6dbdea3
Update README.md

README.md CHANGED
@@ -33,7 +33,39 @@ In particular, InfiMM integrates the latest LLM models into VLM domain the revea
 Please note that InfiMM is currently in beta stage and we are continuously working on improving it.
 
 ## News
+- 🎉 **[2024.08.15]** Our paper [InfiMM](https://aclanthology.org/2024.findings-acl.27/) was accepted to the Findings of ACL 2024.
 - 🎉 **[2024.03.02]** We release [InfiMM-HD](https://huggingface.co/Infi-MM/infimm-hd).
 - 🎉 **[2024.01.11]** We release the first set of MLLMs we trained: [InfiMM-Zephyr](https://huggingface.co/Infi-MM/infimm-zephyr), [InfiMM-LLaMA13B](https://huggingface.co/Infi-MM/infimm-llama13b) and [InfiMM-Vicuna13B](https://huggingface.co/Infi-MM/infimm-vicuna13b).
 - 🎉 **[2024.01.10]** We release a survey on the reasoning capabilities of Multimodal Large Language Models (MLLMs) [here](https://huggingface.co/papers/2401.06805).
-- 🎉 **[2023.11.18]** We release [InfiMM-Eval](https://arxiv.org/abs/2311.11567), an open-ended VQA benchmark dataset designed for MLLMs with a focus on complex reasoning tasks. The leaderboard is available on [Papers with Code](https://paperswithcode.com/sota/visual-question-answering-vqa-on-core-mm) and the [project page](https://infimm.github.io/InfiMM-Eval/).
+- 🎉 **[2023.11.18]** We release [InfiMM-Eval](https://arxiv.org/abs/2311.11567), an open-ended VQA benchmark dataset designed for MLLMs with a focus on complex reasoning tasks. The leaderboard is available on [Papers with Code](https://paperswithcode.com/sota/visual-question-answering-vqa-on-core-mm) and the [project page](https://infimm.github.io/InfiMM-Eval/).
+
+## Citation
+
+```
+@inproceedings{liu-etal-2024-infimm,
+    title = "{I}nfi{MM}: Advancing Multimodal Understanding with an Open-Sourced Visual Language Model",
+    author = "Liu, Haogeng and
+      You, Quanzeng and
+      Wang, Yiqi and
+      Han, Xiaotian and
+      Zhai, Bohan and
+      Liu, Yongfei and
+      Chen, Wentao and
+      Jian, Yiren and
+      Tao, Yunzhe and
+      Yuan, Jianbo and
+      He, Ran and
+      Yang, Hongxia",
+    editor = "Ku, Lun-Wei and
+      Martins, Andre and
+      Srikumar, Vivek",
+    booktitle = "Findings of the Association for Computational Linguistics ACL 2024",
+    month = aug,
+    year = "2024",
+    address = "Bangkok, Thailand and virtual meeting",
+    publisher = "Association for Computational Linguistics",
+    url = "https://aclanthology.org/2024.findings-acl.27",
+    pages = "485--492",
+    abstract = "In this work, we present InfiMM, an advanced Multimodal Large Language Model that adapts to intricate vision-language tasks. InfiMM, inspired by the Flamingo architecture, distinguishes itself through the utilization of large-scale training data, comprehensive training strategies, and diverse large language models. This approach ensures the preservation of Flamingo{'}s foundational strengths while simultaneously introducing augmented capabilities. Empirical evaluations across a variety of benchmarks underscore InfiMM{'}s remarkable capability in multimodal understanding. The code can be found at: https://anonymous.4open.science/r/infimm-zephyr-F60C/.",
+}
+
+```
|