FunCube committed
Commit 579585f
1 Parent(s): 9379c3d

Update README.md

Files changed (1)
  1. README.md +45 -30
README.md CHANGED
@@ -1,30 +1,45 @@
- ---
- license: llama2
- ---
-
- # v-MLLM Model Card
-
- ## Model details
-
- **Model type:**
- v-MLLM is an open-source MLLM trained on Visual-Modality Instruction (VIM) corpus, it can robustly follow the text-modality instructions and visual-modality instructions.
-
- **Model date:**
- v-MLLM-7B was trained on January 2024.
-
- **Github for more information:**
- https://github.com/VIM-Bench/VIM_TOOL
-
- ## License
- v-MLLM is licensed under the LLAMA 2 Community License,
- Copyright (c) Meta Platforms, Inc. All Rights Reserved.
-
- ## Intended use
- **Primary intended uses:**
- The primary use of v-MLLM is research on multimodal large language models.
-
- **Primary intended users:**
- The primary intended users of the model are researchers in computer vision, natural language processing, machine learning, and artificial intelligence.
-
- ## Training dataset
- - 846k VIM corpus based on LVIS-Instruct4V corpus.
+ ---
+ license: llama2
+ ---
+
+ # v-MLLM Model Card
+
+ ## Model details
+
+ **Model type:**
+ v-MLLM is an open-source MLLM trained on the Visual-Modality Instruction (VIM) corpus; it robustly follows both text-modality and visual-modality instructions.
+
+ **Model date:**
+ v-MLLM-7B was trained in January 2024.
+
+ **GitHub for more information:**
+ https://github.com/VIM-Bench/VIM_TOOL
+
+ ## License
+ v-MLLM is licensed under the LLAMA 2 Community License,
+ Copyright (c) Meta Platforms, Inc. All Rights Reserved.
+
+ ## Intended use
+ **Primary intended uses:**
+ The primary use of v-MLLM is research on multimodal large language models.
+
+ **Primary intended users:**
+ The primary intended users of the model are researchers in computer vision, natural language processing, machine learning, and artificial intelligence.
+
+ ## Training dataset
+ - 846k VIM corpus based on the LVIS-Instruct4V corpus.
+
+ # Citation
+
+ Please kindly cite our paper if you find our resources useful:
+
+ ```
+ @misc{lu2023vim,
+   title={VIM: Probing Multimodal Large Language Models for Visual Embedded Instruction Following},
+   author={Yujie Lu and Xiujun Li and William Yang Wang and Yejin Choi},
+   year={2023},
+   eprint={2311.17647},
+   archivePrefix={arXiv},
+   primaryClass={cs.CV}
+ }
+ ```