Update README.md
README.md
CHANGED
```diff
@@ -17,12 +17,12 @@ pipeline_tag: visual-question-answering
 
 \[[Paper](https://arxiv.org/abs/2312.14238)\] \[[GitHub](https://github.com/OpenGVLab/InternVL)\] \[[Chat Demo](https://internvl.opengvlab.com/)\] \[[中文解读](https://zhuanlan.zhihu.com/p/675877376)]
 
-| Model                   | Date       | Download                                                                |
-| ----------------------- | ---------- | ----------------------------------------------------------------------- |
-| InternVL-Chat-V1.5      | 2024.04.18 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5)       |
-| InternVL-Chat-V1.2-Plus | 2024.02.21 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-2-Plus)  |
-| InternVL-Chat-V1.2      | 2024.02.11 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-2)       |
-| InternVL-Chat-V1.1      | 2024.01.24 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-1)       |
+| Model                   | Date       | Download                                                                | Note                             |
+| ----------------------- | ---------- | ----------------------------------------------------------------------- | -------------------------------- |
+| InternVL-Chat-V1.5      | 2024.04.18 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5)       | supports 4K images; super-strong OCR; approaches the performance of GPT-4V and Gemini Pro on benchmarks such as MMMU, DocVQA, ChartQA, and MathVista (🔥 new) |
+| InternVL-Chat-V1.2-Plus | 2024.02.21 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-2-Plus)  | more SFT data and stronger performance |
+| InternVL-Chat-V1.2      | 2024.02.11 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-2)       | scales the LLM up to 34B         |
+| InternVL-Chat-V1.1      | 2024.01.24 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-1)       | supports Chinese; stronger OCR   |
 
 ## Model Details
 - **Model Type:** multimodal large language model (MLLM)
@@ -53,7 +53,7 @@ from PIL import Image
 from transformers import AutoModel, CLIPImageProcessor
 from transformers import AutoTokenizer
 
-path = "OpenGVLab/InternVL-Chat-
+path = "OpenGVLab/InternVL-Chat-V1-1"
 # If your GPU has more than 40G memory, you can put the entire model on a single GPU.
 model = AutoModel.from_pretrained(
     path,
```
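The second hunk's context ends mid-call, right after `path,`. For reference, below is a minimal sketch of how the updated snippet is typically completed. The dtype and device arguments, the example image path, the input resolution, and the `model.chat` helper are assumptions drawn from the InternVL project's usual usage examples, not from this diff, so verify them against the full README.

```python
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer, CLIPImageProcessor

path = "OpenGVLab/InternVL-Chat-V1-1"

# If your GPU has more than 40G memory, you can put the entire model on a single GPU.
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,   # assumed dtype; halves memory vs. float32
    low_cpu_mem_usage=True,
    trust_remote_code=True).eval().cuda()

tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
image_processor = CLIPImageProcessor.from_pretrained(path)

# Preprocess one image; 448x448 is an assumed input resolution.
image = Image.open("./examples/image1.jpg").convert("RGB").resize((448, 448))
pixel_values = image_processor(images=image, return_tensors="pt").pixel_values
pixel_values = pixel_values.to(torch.bfloat16).cuda()

# `model.chat` comes from the repo's remote code (assumed interface, not shown in this hunk).
generation_config = dict(num_beams=1, max_new_tokens=512, do_sample=False)
question = "Please describe the image in detail."
response = model.chat(tokenizer, pixel_values, question, generation_config)
print(response)
```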