HCZhang committed
Commit
dbebea1
1 Parent(s): 32e6cae

Update README.md

Files changed (1):
1. README.md (+109 -0)
README.md CHANGED
@@ -113,6 +113,115 @@ _Few-shot is disabled for Jellyfish models._
  [\INST]]
  ```

+ ## Training Details
+
+ ### Training Method
+
+ We used LoRA to speed up the training process, targeting the q_proj, k_proj, v_proj, and o_proj modules.
+
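Below is a minimal sketch of what such a LoRA setup could look like with the Hugging Face `peft` library. It is illustrative only and not part of this commit: the base model path, rank, alpha, and dropout values are assumptions, since the README only states the target modules.

```python
# Illustrative LoRA sketch (not from the Jellyfish repository).
# The base model path, rank, alpha, and dropout below are assumed values;
# only target_modules reflects what the README states.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("path/to/base-model")  # hypothetical path

lora_config = LoraConfig(
    r=16,                    # assumed rank
    lora_alpha=32,           # assumed scaling factor
    lora_dropout=0.05,       # assumed dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # as stated in the README
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # shows how few parameters LoRA actually trains
```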
+ ## Uses
+
+ To accelerate inference, we strongly recommend running Jellyfish using [vLLM](https://github.com/vllm-project/vllm).
+
+ ### Python Script
+ We provide two simple Python examples of running inference with the Jellyfish model.
+
+ #### Using Transformers and Torch Modules
+ <div style="height: auto; max-height: 400px; overflow-y: scroll;">
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
+ import torch
+
+ if torch.cuda.is_available():
+     device = "cuda"
+ else:
+     device = "cpu"
+
+ # The model is downloaded automatically from the Hugging Face model hub if it is not cached.
+ # Model files are cached in "~/.cache/huggingface/hub/models--NECOUDBFM--Jellyfish/" by default.
+ # You can also download the model manually and replace the model name with the path to the model files.
+ model = AutoModelForCausalLM.from_pretrained(
+     "NECOUDBFM/Jellyfish",
+     torch_dtype=torch.float16,
+     device_map="auto",
+ )
+ tokenizer = AutoTokenizer.from_pretrained("NECOUDBFM/Jellyfish")
+
+ system_message = "You are an AI assistant that follows instruction extremely well. Help as much as you can."
+
+ # You need to define the user_message variable based on the task and the data you want to test on.
+ user_message = "Hello, world."
+
+ prompt = f"{system_message}\n\n[INST]:\n\n{user_message}\n\n[\INST]]"
+ inputs = tokenizer(prompt, return_tensors="pt")
+ input_ids = inputs["input_ids"].to(device)
+
+ # You can modify the sampling parameters according to your needs.
+ generation_config = GenerationConfig(
+     do_sample=True,
+     temperature=0.35,
+     top_p=0.9,
+ )
+
+ with torch.no_grad():
+     generation_output = model.generate(
+         input_ids=input_ids,
+         generation_config=generation_config,
+         return_dict_in_generate=True,
+         output_scores=True,
+         max_new_tokens=1024,
+         pad_token_id=tokenizer.eos_token_id,
+         repetition_penalty=1.15,
+     )
+
+ # Decode only the newly generated tokens, i.e. everything after the prompt.
+ output = generation_output.sequences
+ response = tokenizer.decode(
+     output[:, input_ids.shape[-1]:][0], skip_special_tokens=True
+ ).strip()
+
+ print(response)
+ ```
+ </div>
+
+ #### Using vLLM
+ <div style="height: auto; max-height: 400px; overflow-y: scroll;">
+
+ ```python
+ from vllm import LLM, SamplingParams
+
+ # To use vLLM for inference, you need to download the model files, either from the Hugging Face model hub or manually.
+ # Modify the path to the model according to your local environment.
+ path_to_model = "/workspace/models/Jellyfish"
+
+ model = LLM(model=path_to_model)
+
+ # You can modify the sampling parameters according to your needs.
+ # Caution: the stop parameter should not be changed.
+ sampling_params = SamplingParams(
+     temperature=0.35,
+     top_p=0.9,
+     max_tokens=1024,
+     stop=["[INST]"],
+ )
+
+ system_message = "You are an AI assistant that follows instruction extremely well. Help as much as you can."
+
+ # You need to define the user_message variable based on the task and the data you want to test on.
+ user_message = "Hello, world."
+
+ prompt = f"{system_message}\n\n[INST]:\n\n{user_message}\n\n[\INST]]"
+ outputs = model.generate(prompt, sampling_params)
+ response = outputs[0].outputs[0].text.strip()
+ print(response)
+ ```
+ </div>
+
  ## Prompts

  We provide the prompts used for both fine-tuning and inference.