RaushanTurganbay committed
Commit 1d94039
Parent: 81cbb4d

Update README.md

Files changed (1): README.md (+19, -1)
README.md CHANGED
@@ -2,6 +2,10 @@
 tags:
 - vision
 - image-text-to-text
+license: llama2
+language:
+- en
+pipeline_tag: image-text-to-text
 ---
 
 # LLaVa-Next, leveraging [liuhaotian/llava-v1.6-vicuna-13b](https://huggingface.co/liuhaotian/llava-v1.6-vicuna-13b) as LLM
@@ -29,6 +33,7 @@ Here's the prompt template for this model:
 ```
 "A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions. USER: <image>\nWhat is shown in this image? ASSISTANT:"
 ```
+
 You can load and use the model as follows:
 ```python
 from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration
@@ -44,7 +49,20 @@ model.to("cuda:0")
 # prepare image and text prompt, using the appropriate prompt template
 url = "https://github.com/haotian-liu/LLaVA/blob/1a91fc274d7c35a9b50b3cb29c4247ae5837ce39/images/llava_v1_5_radar.jpg?raw=true"
 image = Image.open(requests.get(url, stream=True).raw)
-prompt = "A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions. USER: <image>\nWhat is shown in this image? ASSISTANT:"
+
+# Define a chat history and use `apply_chat_template` to get a correctly formatted prompt
+# Each value in "content" has to be a list of dicts with types ("text", "image")
+conversation = [
+    {
+        "role": "user",
+        "content": [
+            {"type": "text", "text": "What is shown in this image?"},
+            {"type": "image"},
+        ],
+    },
+]
+prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
 
 inputs = processor(prompt, image, return_tensors="pt").to("cuda:0")
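For context, here is how the pieces of this diff fit together end to end. This is a minimal sketch rather than part of the commit: the checkpoint ID `llava-hf/llava-v1.6-vicuna-13b-hf`, the `float16` dtype, and `max_new_tokens=100` are assumptions not shown in the diff; the rest mirrors the README snippets above.

```python
# Minimal end-to-end sketch combining the snippets shown in this diff.
# Assumed (not shown in the diff): checkpoint ID, fp16 dtype, max_new_tokens.
import requests
import torch
from PIL import Image
from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration

model_id = "llava-hf/llava-v1.6-vicuna-13b-hf"  # assumed checkpoint ID
processor = LlavaNextProcessor.from_pretrained(model_id)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16
)
model.to("cuda:0")

# Example image from the README
url = "https://github.com/haotian-liu/LLaVA/blob/1a91fc274d7c35a9b50b3cb29c4247ae5837ce39/images/llava_v1_5_radar.jpg?raw=true"
image = Image.open(requests.get(url, stream=True).raw)

# Build the prompt via the chat template, as the commit recommends,
# instead of hardcoding the template string.
conversation = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What is shown in this image?"},
            {"type": "image"},
        ],
    },
]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)

inputs = processor(prompt, image, return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(output[0], skip_special_tokens=True))
```

The practical upside of `apply_chat_template` over a hardcoded string is that the prompt stays in sync with the template stored alongside the processor, so swapping checkpoints does not silently break the prompt format.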