czczup committed
Commit 61e81bd
• 1 Parent(s): 3afd9f2

Update README.md

Files changed (1):
  1. README.md +9 -6
README.md CHANGED
@@ -18,7 +18,10 @@ pipeline_tag: visual-question-answering
 
 > _Two interns holding hands, symbolizing the integration of InternViT and InternLM._
 
-\[[InternVL 1.5 Technical Report](https://arxiv.org/abs/2404.16821)\] \[[CVPR Paper](https://arxiv.org/abs/2312.14238)\] \[[GitHub](https://github.com/OpenGVLab/InternVL)\] \[[Chat Demo](https://internvl.opengvlab.com/)\] \[[中文解读](https://zhuanlan.zhihu.com/p/675877376)\]
+[\[🆕 Blog\]](https://internvl.github.io/blog/) [\[📜 InternVL 1.0 Paper\]](https://arxiv.org/abs/2312.14238) [\[📜 InternVL 1.5 Report\]](https://arxiv.org/abs/2404.16821) [\[🗨️ Chat Demo\]](https://internvl.opengvlab.com/)
+
+[\[🤗 HF Demo\]](https://huggingface.co/spaces/OpenGVLab/InternVL) [\[🚀 Quick Start\]](#model-usage) [\[🌐 Community-hosted API\]](https://rapidapi.com/adushar1320/api/internvl-chat) [\[📖 中文解读\]](https://zhuanlan.zhihu.com/p/675877376)
+
 
 You can run multimodal large models using a 1080Ti now.
 
@@ -48,10 +51,10 @@ As shown in the figure below, we adopted the same model architecture as InternVL
 
 | Model | Vision Foundation Model | Release Date | Note |
 | :----------------------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------------: | :----------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| InternVL-Chat-V1.5(🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5)) | InternViT-6B-448px-V1-5(🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-5)) | 2024.04.18 | support 4K image; super strong OCR; Approaching the performance of GPT-4V and Gemini Pro on various benchmarks like MMMU, DocVQA, ChartQA, MathVista, etc. (🔥new) |
-| InternVL-Chat-V1.2-Plus(🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-2-Plus) ) | InternViT-6B-448px-V1-2(🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-2)) | 2024.02.21 | more SFT data and stronger |
-| InternVL-Chat-V1.2(🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-2) ) | InternViT-6B-448px-V1-2(🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-2)) | 2024.02.11 | scaling up LLM to 34B |
-| InternVL-Chat-V1.1(🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-1)) | InternViT-6B-448px-V1-0(🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-0)) | 2024.01.24 | support Chinese and stronger OCR |
+| InternVL-Chat-V1-5(🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5)) | InternViT-6B-448px-V1-5(🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-5)) | 2024.04.18 | support 4K image; super strong OCR; Approaching the performance of GPT-4V and Gemini Pro on various benchmarks like MMMU, DocVQA, ChartQA, MathVista, etc. (🔥new) |
+| InternVL-Chat-V1-2-Plus(🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-2-Plus) ) | InternViT-6B-448px-V1-2(🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-2)) | 2024.02.21 | more SFT data and stronger |
+| InternVL-Chat-V1-2(🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-2) ) | InternViT-6B-448px-V1-2(🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-2)) | 2024.02.11 | scaling up LLM to 34B |
+| InternVL-Chat-V1-1(🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-1)) | InternViT-6B-448px-V1-0(🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-0)) | 2024.01.24 | support Chinese and stronger OCR |
 
 ## Performance
 
@@ -59,7 +62,7 @@ As shown in the figure below, we adopted the same model architecture as InternVL
 
 ## Model Usage
 
-We provide an example code to run Mini-InternVL-Chat-2B-V1.5 using `transformers`.
+We provide an example code to run Mini-InternVL-Chat-2B-V1-5 using `transformers`.
 
 You can also use our [online demo](https://internvl.opengvlab.com/) for a quick experience of this model.
 
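The Model Usage section above points to example code for running Mini-InternVL-Chat-2B-V1-5 with `transformers`. A minimal loading sketch follows — not the repository's own snippet — assuming the checkpoint ships custom modeling code (hence `trust_remote_code=True`) and loads through the `AutoModel`/`AutoTokenizer` entry points; consult the model card for the authoritative example.

```python
# Hypothetical sketch: load Mini-InternVL-Chat-2B-V1-5 via transformers.
# The repo id and trust_remote_code pattern are assumptions based on how
# other OpenGVLab checkpoints are published, not taken from this diff.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "OpenGVLab/Mini-InternVL-Chat-2B-V1-5"


def load_model(model_id: str = MODEL_ID):
    """Return (tokenizer, model) ready for inference."""
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModel.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # fp16: a 2B model fits on an 11 GB 1080Ti
        trust_remote_code=True,
    ).eval()  # inference mode: disables dropout etc.
    return tokenizer, model
```

Calling `load_model()` downloads the weights on first use; fp16 is chosen here because a 1080Ti (the card the README mentions) has no bfloat16 support.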