Update README.md
README.md
CHANGED
@@ -11,9 +11,9 @@ library_name: transformers
 
 <!-- Provide a quick summary of what the model is/does. -->
 
-Llama-3.2V-11B-cot is the first version of [LLaVA-
+Llama-3.2V-11B-cot is the first version of [LLaVA-CoT](https://github.com/PKU-YuanGroup/LLaVA-CoT), which is a visual language model capable of spontaneous, systematic reasoning.
 
-The model was proposed in [LLaVA-
+The model was proposed in [LLaVA-CoT: Let Vision Language Models Reason Step-by-Step](https://huggingface.co/papers/2411.10440).
 
 ## Model Details
 
@@ -61,7 +61,7 @@ You can use the inference code for Llama-3.2-11B-Vision-Instruct.
 
 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
 
-The model is trained on the LLaVA-
+The model is trained on the [LLaVA-CoT-100k dataset](https://huggingface.co/datasets/Xkev/LLaVA-CoT-100k).
 
 ### Training Procedure
 
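The context line of the second hunk notes that the inference code for Llama-3.2-11B-Vision-Instruct can be reused. Below is a minimal sketch of what that looks like with the standard `transformers` Mllama API; the repository id, image URL, and prompt are illustrative assumptions and are not part of this diff.

```python
# Minimal inference sketch, reusing the standard Llama-3.2-11B-Vision-Instruct
# pipeline from transformers (>= 4.45, which provides the Mllama classes).
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

model_id = "Xkev/Llama-3.2V-11B-cot"  # assumed checkpoint name for illustration

model = MllamaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)

# Placeholder image; any RGB image works here.
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/0052a70beed5bf71b92610a43a52df6d286cd5f3/diffusers/rabbit.jpg"
image = Image.open(requests.get(url, stream=True).raw)

messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image. Reason step by step."},
    ]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, add_special_tokens=False, return_tensors="pt").to(model.device)

# The model generates its structured reasoning followed by the final answer.
output = model.generate(**inputs, max_new_tokens=512)
print(processor.decode(output[0], skip_special_tokens=True))
```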