Louisnguyen committed · Commit 1481181 · verified · Parent(s): 3585a9e

Update README.md

Files changed (1): README.md (+84, -3)
README.md CHANGED
@@ -16,7 +16,7 @@

# llava-1.6-7b-hf-final

- This model is a fine-tuned version of [llava-hf/llava-v1.6-mistral-7b-hf](https://huggingface.co/llava-hf/llava-v1.6-mistral-7b-hf) on an unknown dataset.
+ This model is a fine-tuned version of [llava-hf/llava-v1.6-mistral-7b-hf](https://huggingface.co/llava-hf/llava-v1.6-mistral-7b-hf) on the [derek-thomas/ScienceQA](https://huggingface.co/datasets/derek-thomas/ScienceQA) dataset.

## Model description
@@ -32,6 +32,87 @@

## Training procedure

## Chat Template

The card defines a custom Jinja chat template for ScienceQA-style prompting: a fixed system block, followed by the question, hint, and choices of the conversation tagged `real_question`, with `<image>` inserted wherever a message carries an image.

```python
CHAT_TEMPLATE = '''<<SYS>>
A chat between a user and an artificial intelligence assistant about Science Question Answering. The assistant gives helpful, detailed, and polite answers to the user's questions.
Based on the image, question and hint, please choose one of the given choices that answer the question.
Give yourself room to think by extracting the image, question and hint before choosing the choice.
Don't return the thinking, only return the highest accuracy choice.
Make sure your answers are as correct as possible.
<</SYS>>
{% for tag, content in messages.items() %}
{% if tag == 'real_question' %}
Now use the following image and question to choose the choice:
{% for message in content %}
{% if message['role'] == 'user' %}[INST] USER: {% else %}ASSISTANT: {% endif %}
{% for item in message['content'] %}
{% if item['type'] == 'text_question' %}
Question: {{ item['question'] }}
{% elif item['type'] == 'text_hint' %}
Hint: {{ item['hint'] }}
{% elif item['type'] == 'text_choice' %}
Choices: {{ item['choice'] }} [/INST]
{% elif item['type'] == 'text_solution' %}
Solution: {{ item['solution'] }}
{% elif item['type'] == 'text_answer' %}
Answer: {{ item['answer'] }}{% elif item['type'] == 'image' %}<image>
{% endif %}
{% endfor %}
{% if message['role'] == 'user' %}
{% else %}
{{eos_token}}
{% endif %}{% endfor %}{% endif %}
{% endfor %}'''
```
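
This template is keyed on a dict of tagged conversations (`real_question`) rather than the usual message list. If the checkpoint's tokenizer config does not already store it (an assumption worth verifying), it can be attached by hand before any formatting; a minimal sketch:

```python
from transformers import LlavaNextProcessor

processor = LlavaNextProcessor.from_pretrained("Louisnguyen/llava-1.6-7b-hf-final")

# Register the CHAT_TEMPLATE string defined above so that
# processor.tokenizer.apply_chat_template renders with it.
processor.tokenizer.chat_template = CHAT_TEMPLATE
```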

## How to use

```python
from datasets import load_dataset
from transformers import (
    BitsAndBytesConfig,
    LlavaNextForConditionalGeneration,
    LlavaNextProcessor,
)
import torch

model_id = "Louisnguyen/llava-1.6-7b-hf-final"

# Load the model in 4-bit. Device placement is left to device_map: a 4-bit
# model cannot be moved with .to("cuda:0"), which the original snippet did.
quantization_config = BitsAndBytesConfig(load_in_4bit=True)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    torch_dtype=torch.float16,
    device_map="auto",
)
processor = LlavaNextProcessor.from_pretrained(model_id)

# One record from the dataset the model was fine-tuned on (this assumes the
# record has an image; for text-only records, drop the {"type": "image"} entry).
example = load_dataset("derek-thomas/ScienceQA", split="test")[0]
image = example["image"]
question = example["question"]
choices = example["choices"]
hint = example["hint"]

# The custom template iterates over a dict keyed by tag, not a plain message list.
messages_answer = {
    "real_question": [
        {
            "role": "user",
            "content": [
                {"type": "image"},
                {"type": "text_question", "question": question},
                {"type": "text_hint", "hint": hint},
                {"type": "text_choice", "choice": " or ".join(choices)},
            ],
        }
    ]
}

# Apply the chat template to format the messages for answer generation.
text_answer = processor.tokenizer.apply_chat_template(
    messages_answer, tokenize=False, add_generation_prompt=True
)
# Prepare the inputs for the model to generate the answer.
inputs_answer = processor(
    text=[text_answer.strip()], images=image, return_tensors="pt", padding=True
).to("cuda")
# Generate the answer.
generated_ids_answer = model.generate(
    **inputs_answer,
    max_new_tokens=1024,
    pad_token_id=processor.tokenizer.eos_token_id,
)
# Decode only the newly generated tokens.
generated_texts_answer = processor.batch_decode(
    generated_ids_answer[:, inputs_answer["input_ids"].size(1):],
    skip_special_tokens=True,
)
print(generated_texts_answer)
```
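
The model returns free text, while ScienceQA labels answers by index into `choices`. A hypothetical post-processing helper (not part of the card) to map the decoded text back to a choice index and compare it with the label:

```python
def choice_index(generated: str, choices: list[str]) -> int:
    """Return the index of the first choice mentioned in the generated text."""
    text = generated.strip().lower()
    for i, choice in enumerate(choices):
        if choice.lower() in text:
            return i
    return -1  # the model named none of the choices

predicted = choice_index(generated_texts_answer[0], choices)
print(predicted, predicted == example["answer"])
```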

### Training hyperparameters

The following hyperparameters were used during training:

@@ -46,7 +127,7 @@

### Training results

Accuracy: ~80%
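
The card does not state the split or scoring protocol behind this figure. One hypothetical way to recompute it, reusing `model`, `processor`, and `choice_index` from the sections above (the test split and the image-only filter are assumptions):

```python
from datasets import load_dataset

def predict(ex) -> int:
    """Format one ScienceQA record, generate, and map the output to a choice index."""
    msgs = {
        "real_question": [{
            "role": "user",
            "content": [
                {"type": "image"},
                {"type": "text_question", "question": ex["question"]},
                {"type": "text_hint", "hint": ex["hint"]},
                {"type": "text_choice", "choice": " or ".join(ex["choices"])},
            ],
        }]
    }
    text = processor.tokenizer.apply_chat_template(msgs, tokenize=False, add_generation_prompt=True)
    inputs = processor(text=[text.strip()], images=ex["image"], return_tensors="pt", padding=True).to("cuda")
    ids = model.generate(**inputs, max_new_tokens=64, pad_token_id=processor.tokenizer.eos_token_id)
    out = processor.batch_decode(ids[:, inputs["input_ids"].size(1):], skip_special_tokens=True)[0]
    return choice_index(out, ex["choices"])

test_set = load_dataset("derek-thomas/ScienceQA", split="test")
scored = [predict(ex) == ex["answer"] for ex in test_set if ex["image"] is not None]
print(sum(scored) / len(scored))  # the card reports roughly 0.80
```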

### Framework versions

@@ -54,4 +135,4 @@

- Transformers 4.43.3
- Pytorch 2.2.1+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1