myownskyW7 committed on
Commit 9ca19f4
1 Parent(s): 4e46c2c

Create README.md

Files changed (1)
  1. README.md +36 -0
README.md ADDED
@@ -0,0 +1,36 @@

---
license: apache-2.0
pipeline_tag: text-generation
---

<p align="center">
    <img src="logo.png" width="400"/>
</p>

<p align="center">
    <b><font size="6">InternLM-XComposer</font></b>
</p>

<div align="center">

[💻Github Repo](https://github.com/InternLM/InternLM-XComposer)

</div>

**InternLM-XComposer** is a vision-language large model (VLLM) based on [InternLM](https://github.com/InternLM/InternLM/tree/main) for advanced text-image comprehension and composition. InternLM-XComposer has several appealing properties:

- **Interleaved Text-Image Composition**: InternLM-XComposer can effortlessly generate coherent and contextual articles that seamlessly integrate images, providing a more engaging and immersive reading experience. The interleaved text-image composition is implemented in the following steps (a conceptual sketch follows this list):

  1. **Text Generation**: It crafts long-form text based on human-provided instructions.
  2. **Image Spotting and Captioning**: It pinpoints optimal locations for image placement and furnishes image descriptions.
  3. **Image Retrieval and Selection**: It selects image candidates and identifies the image that optimally complements the content.

- **Comprehension with Rich Multilingual Knowledge**: The text-image comprehension is empowered by training on extensive multi-modal multilingual concepts with carefully crafted strategies, resulting in a deep understanding of visual content.
- **Strong performance**: It consistently achieves state-of-the-art results across various benchmarks for vision-language large models, including [MME Benchmark](https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models/tree/Evaluation) (English), [MMBench](https://opencompass.org.cn/leaderboard-multimodal) (English), [Seed-Bench](https://huggingface.co/spaces/AILab-CVC/SEED-Bench_Leaderboard) (English), [CCBench](https://opencompass.org.cn/leaderboard-multimodal) (Chinese), and [MMBench-CN](https://opencompass.org.cn/leaderboard-multimodal) (Chinese).
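
The three steps above can be pictured as a simple loop over generated paragraphs. The sketch below is purely conceptual: `compose_illustrated_article`, `propose_image_caption`, `select_image`, and the `image_pool` retriever are hypothetical names used for illustration, not part of the released API.

```python
# Conceptual sketch of the interleaved text-image composition pipeline.
# All function and object names below are hypothetical placeholders;
# they do not correspond to the released InternLM-XComposer interface.

def compose_illustrated_article(instruction, image_pool, model):
    # 1. Text generation: draft the long-form article from the instruction.
    article = model.generate_text(instruction)

    illustrated = []
    for paragraph in article.split("\n\n"):
        illustrated.append(paragraph)

        # 2. Image spotting and captioning: decide whether this position
        #    benefits from an image and, if so, produce a description for it.
        caption = model.propose_image_caption(paragraph)
        if caption is None:
            continue

        # 3. Image retrieval and selection: retrieve candidates matching the
        #    description, then pick the one that best complements the text.
        candidates = image_pool.retrieve(caption, top_k=4)
        best_image = model.select_image(paragraph, caption, candidates)
        illustrated.append(f"![{caption}]({best_image})")

    return "\n\n".join(illustrated)
```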

We release the InternLM-XComposer series in two versions (a usage sketch follows the list):

- InternLM-XComposer-VL: The pretrained VLLM model with InternLM as the initialization of the LLM, achieving strong performance on various multimodal benchmarks, e.g., MME Benchmark, MMBench, Seed-Bench, CCBench, and MMBench-CN.
- InternLM-XComposer: The finetuned VLLM for *Interleaved Text-Image Composition* and *LLM-based AI assistant*.

<br>
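
For reference, the snippet below sketches how checkpoints in this series can typically be loaded with Hugging Face Transformers. It is a minimal sketch under assumptions: the repo id, the example image path, and the `generate(text, image)` call are placeholders for illustration; please consult the [Github Repo](https://github.com/InternLM/InternLM-XComposer) for the exact interface.

```python
import torch
from transformers import AutoModel, AutoTokenizer

torch.set_grad_enabled(False)

# Assumed repo id; replace with the actual Hugging Face model id of this checkpoint.
repo_id = 'internlm/internlm-xcomposer-7b'

# The checkpoints ship custom modeling code, so trust_remote_code is required.
model = AutoModel.from_pretrained(repo_id, trust_remote_code=True).cuda().eval()
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model.tokenizer = tokenizer

# Single-turn text-image dialogue; the generate(text, image) signature is an
# assumption about the project's custom code -- see the GitHub repo for details.
text = 'Please describe this image in detail.'
image = 'examples/images/example.jpg'  # hypothetical local image path
response = model.generate(text, image)
print(response)
```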