Update README.md

README.md (CHANGED)
@@ -65,7 +65,7 @@ To construct this dataset, we propose an efficient data construction pipeline. S
 
 - **For samples with clear ground truths:**
 the model is prompted to first provide the reasoning process and then give the final answer in the format like `Final Answer: ***`.
-Responses matching the ground truth answer constitute the positive set \\(mathcal{Y}_p\\), while those that do not match make up the negative set \\(\mathcal{Y}_n\\). Additionally, responses that fail to provide a clear final answer are also merged into \\(\mathcal{Y}_n\\).
+Responses matching the ground truth answer constitute the positive set \\(\mathcal{Y}_p\\), while those that do not match make up the negative set \\(\mathcal{Y}_n\\). Additionally, responses that fail to provide a clear final answer are also merged into \\(\mathcal{Y}_n\\).
 Given these responses labeled as positive or negative, we build the preference pairs by selecting a chosen response \\(y_c\\) from \\(\mathcal{Y}_p\\) and a negative response \\(y_r\\) from \\(\mathcal{Y}_n\\).
 
 - **For samples without clear ground truths:**
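The selection rule in this hunk is easy to sketch in code. Below is a minimal illustration (not part of the README) of splitting sampled responses into \\(\mathcal{Y}_p\\) and \\(\mathcal{Y}_n\\) and drawing one preference pair; the exact-string match is a simplification, and `build_preference_pair` and its inputs are hypothetical names.

```python
import random
import re

def extract_final_answer(response: str):
    """Return the text after 'Final Answer:', or None if no clear answer is given."""
    match = re.search(r"Final Answer:\s*(.+)", response)
    return match.group(1).strip() if match else None

def build_preference_pair(responses, ground_truth):
    """Split sampled responses into positive/negative sets and pick one (chosen, rejected) pair."""
    positives, negatives = [], []              # Y_p and Y_n
    for resp in responses:
        answer = extract_final_answer(resp)
        if answer is not None and answer == ground_truth:
            positives.append(resp)             # matches the ground truth -> Y_p
        else:
            negatives.append(resp)             # wrong or missing final answer -> Y_n
    if not positives or not negatives:
        return None                            # no usable pair for this sample
    return {"chosen": random.choice(positives),    # y_c
            "rejected": random.choice(negatives)}  # y_r
```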
@@ -160,7 +160,7 @@ To comprehensively compare InternVL's performance before and after MPO, we emplo
 
 ## Quick Start
 
-We provide an example code to run `InternVL2_5-
+We provide an example code to run `InternVL2_5-4B-MPO` using `transformers`.
 
 > Please use transformers>=4.37.2 to ensure the model works normally.
 
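The `transformers>=4.37.2` requirement in the note above can also be checked at runtime. A small sketch (not in the README), using `packaging`, which is installed alongside `transformers`:

```python
from packaging import version
import transformers

# Fail fast if the installed transformers is older than the version the README asks for.
assert version.parse(transformers.__version__) >= version.parse("4.37.2"), \
    f"transformers {transformers.__version__} is too old; please install >=4.37.2"
```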
@@ -171,7 +171,7 @@ We provide an example code to run `InternVL2_5-1B` using `transformers`.
 ```python
 import torch
 from transformers import AutoTokenizer, AutoModel
-path = "OpenGVLab/InternVL2_5-
+path = "OpenGVLab/InternVL2_5-4B-MPO"
 model = AutoModel.from_pretrained(
     path,
     torch_dtype=torch.bfloat16,
@@ -185,7 +185,7 @@ model = AutoModel.from_pretrained(
 ```python
 import torch
 from transformers import AutoTokenizer, AutoModel
-path = "OpenGVLab/InternVL2_5-
+path = "OpenGVLab/InternVL2_5-4B-MPO"
 model = AutoModel.from_pretrained(
     path,
     torch_dtype=torch.bfloat16,
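Both loading snippets above are cut off at `torch_dtype` by the diff context. As a hedged sketch of how such a call is typically completed for InternVL checkpoints (the arguments beyond those shown in the hunks are assumptions, not taken from this diff):

```python
import torch
from transformers import AutoTokenizer, AutoModel

path = "OpenGVLab/InternVL2_5-4B-MPO"
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,   # assumed: stream weights to reduce peak CPU memory
    trust_remote_code=True,   # assumed: the repo ships custom InternVL modeling code
).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True, use_fast=False)
```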
@@ -230,8 +230,8 @@ def split_model(model_name):
 
     return device_map
 
-path = "OpenGVLab/InternVL2_5-
-device_map = split_model('InternVL2_5-
+path = "OpenGVLab/InternVL2_5-4B-MPO"
+device_map = split_model('InternVL2_5-4B')
 model = AutoModel.from_pretrained(
     path,
     torch_dtype=torch.bfloat16,
@@ -327,7 +327,7 @@ def load_image(image_file, input_size=448, max_num=12):
     return pixel_values
 
 # If you want to load a model using multiple GPUs, please refer to the `Multiple GPUs` section.
-path = 'OpenGVLab/InternVL2_5-
+path = 'OpenGVLab/InternVL2_5-4B-MPO'
 model = AutoModel.from_pretrained(
     path,
     torch_dtype=torch.bfloat16,
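Once a model and tokenizer are loaded, InternVL models are queried through their `chat` interface. A hedged usage sketch that continues the loading sketch above (the image path is a placeholder, and `load_image` is the README helper whose signature appears in this hunk's header):

```python
# Continues from the loading sketch above; 'path/to/image.jpg' is a placeholder.
pixel_values = load_image('path/to/image.jpg', max_num=12).to(torch.bfloat16).cuda()
generation_config = dict(max_new_tokens=1024, do_sample=True)

question = '<image>\nPlease describe the image shortly.'
response = model.chat(tokenizer, pixel_values, question, generation_config)
print(f'User: {question}\nAssistant: {response}')
```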
@@ -510,7 +510,7 @@ LMDeploy abstracts the complex inference process of multi-modal Vision-Language
 from lmdeploy import pipeline, TurbomindEngineConfig
 from lmdeploy.vl import load_image
 
-model = 'OpenGVLab/InternVL2_5-
+model = 'OpenGVLab/InternVL2_5-4B-MPO'
 image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
 pipe = pipeline(model, backend_config=TurbomindEngineConfig(session_len=8192))
 response = pipe(('describe this image', image))
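For reference, the `pipe((...))` call in this hunk returns a generation result whose text is read from `.text` (standard LMDeploy pipeline behavior; this line is not part of the hunk):

```python
print(response.text)
```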
@@ -528,7 +528,7 @@ from lmdeploy import pipeline, TurbomindEngineConfig
 from lmdeploy.vl import load_image
 from lmdeploy.vl.constants import IMAGE_TOKEN
 
-model = 'OpenGVLab/InternVL2_5-
+model = 'OpenGVLab/InternVL2_5-4B-MPO'
 pipe = pipeline(model, backend_config=TurbomindEngineConfig(session_len=8192))
 
 image_urls=[
@@ -550,7 +550,7 @@ Conducting inference with batch prompts is quite straightforward; just place the
 from lmdeploy import pipeline, TurbomindEngineConfig
 from lmdeploy.vl import load_image
 
-model = 'OpenGVLab/InternVL2_5-
+model = 'OpenGVLab/InternVL2_5-4B-MPO'
 pipe = pipeline(model, backend_config=TurbomindEngineConfig(session_len=8192))
 
 image_urls=[
@@ -570,7 +570,7 @@ There are two ways to do the multi-turn conversations with the pipeline. One is
 from lmdeploy import pipeline, TurbomindEngineConfig, GenerationConfig
 from lmdeploy.vl import load_image
 
-model = 'OpenGVLab/InternVL2_5-
+model = 'OpenGVLab/InternVL2_5-4B-MPO'
 pipe = pipeline(model, backend_config=TurbomindEngineConfig(session_len=8192))
 
 image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/demo/resources/human-pose.jpg')
@@ -586,7 +586,7 @@ print(sess.response.text)
 LMDeploy's `api_server` enables models to be easily packed into services with a single command. The provided RESTful APIs are compatible with OpenAI's interfaces. Below is an example of service startup:
 
 ```shell
-lmdeploy serve api_server OpenGVLab/InternVL2_5-
+lmdeploy serve api_server OpenGVLab/InternVL2_5-4B-MPO --server-port 23333
 ```
 
 To use the OpenAI-style interface, you need to install OpenAI:
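Once the server from this hunk is running, it can be queried with the standard OpenAI Python client. A hedged sketch, assuming the default base URL for port 23333 and reusing the tiger image URL from the earlier hunk:

```python
from openai import OpenAI

client = OpenAI(api_key='YOUR_API_KEY', base_url='http://0.0.0.0:23333/v1')
model_name = client.models.list().data[0].id
response = client.chat.completions.create(
    model=model_name,
    messages=[{
        'role': 'user',
        'content': [
            {'type': 'text', 'text': 'describe this image'},
            {'type': 'image_url', 'image_url': {
                'url': 'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg'}},
        ],
    }],
    temperature=0.8,
    top_p=0.8)
print(response.choices[0].message.content)
```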
@@ -625,7 +625,7 @@ print(response)
 
 ## License
 
-This project is released under the MIT License. This project uses the pre-trained Qwen2.5-
+This project is released under the MIT License. This project uses the pre-trained Qwen2.5-3B-Instruct as a component, which is licensed under the Apache License 2.0.
 
 ## Citation
 