TheBloke committed on
Commit
4e14067
1 Parent(s): 7ba4217

Update README.md

Files changed (1)
  1. README.md +2 -62
README.md CHANGED
@@ -11,9 +11,7 @@ license: apache-2.0
  model_creator: Mistral AI_
  model_name: Mixtral 8X7B Instruct v0.1
  model_type: mixtral
- prompt_template: '<s>[INST] {prompt} [/INST]
-
- '
+ prompt_template: '[INST] {prompt} [/INST] '
  quantized_by: TheBloke
  ---
  <!-- markdownlint-disable MD041 -->
@@ -68,7 +66,7 @@ Multiple GPTQ parameter permutations are provided; see Provided Files below for
  ## Prompt template: Mistral
 
  ```
- <s>[INST] {prompt} [/INST]
+ [INST] {prompt} [/INST]
  ```
  <!-- prompt-template end -->
 
@@ -201,64 +199,6 @@ It is strongly recommended to use the text-generation-webui one-click-installers
 
  <!-- README_GPTQ.md-text-generation-webui end -->
 
- <!-- README_GPTQ.md-use-from-python start -->
- ## Python code example: inference from this GPTQ model
-
- ### Install the necessary packages
-
- Requires: Transformers 4.33.0 or later, Optimum 1.12.0 or later, and AutoGPTQ 0.4.2 or later.
-
- ```shell
- pip3 install --upgrade transformers optimum
- # If using PyTorch 2.1 + CUDA 12.x:
- pip3 install --upgrade auto-gptq
- # or, if using PyTorch 2.1 + CUDA 11.x:
- pip3 install --upgrade auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/
- ```
-
- If you are using PyTorch 2.0, you will need to install AutoGPTQ from source. Likewise if you have problems with the pre-built wheels, you should try building from source:
-
- ```shell
- pip3 uninstall -y auto-gptq
- git clone https://github.com/PanQiWei/AutoGPTQ
- cd AutoGPTQ
- git checkout v0.5.1
- pip3 install .
- ```
-
- ### Example Python code
-
- ```python
- model_name_or_path = "TheBloke/Mixtral-8x7B-Instruct-v0.1-GPTQ"
- from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, GPTQConfig
- from auto_gptq import AutoGPTQForCausalLM
-
- model_name_or_path = args.model_dir
- # To use a different branch, change revision
- # For example: revision="gptq-4bit-32g-actorder_True"
- model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
-         model_basename="model",
-         use_safetensors=True,
-         trust_remote_code=False,
-         device="cuda:0",
-         use_triton=False,
-         disable_exllama=False,
-         disable_exllamav2=True,
-         quantize_config=None)
-
- tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True, trust_remote_code=False)
-
- prompt = "Tell me about AI"
- prompt_template=f'''<s>[INST] {prompt} [/INST]
- '''
-
- print("\n\n*** Generate:")
-
- input_ids = tokenizer(prompt_template, return_tensors='pt').input_ids.cuda()
- output = model.generate(inputs=input_ids, temperature=0.7, do_sample=True, top_p=0.95, top_k=40, max_new_tokens=512)
- print(tokenizer.decode(output[0]))
- ```
- <!-- README_GPTQ.md-use-from-python end -->
 
  <!-- footer start -->
  <!-- 200823 -->
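The only surviving change is the simplified prompt template, which drops the literal `<s>` token, presumably because the tokenizer already prepends the `<s>` BOS token when encoding, so keeping it in the template risked a doubled BOS. A minimal sketch of using the updated template (illustration only, not part of this commit; it assumes the repo's tokenizer config enables BOS by default):

```python
from transformers import AutoTokenizer

# Load the tokenizer shipped with the GPTQ repo.
tokenizer = AutoTokenizer.from_pretrained("TheBloke/Mixtral-8x7B-Instruct-v0.1-GPTQ")

prompt = "Tell me about AI"
# Updated template from this commit: no explicit <s>.
text = f"[INST] {prompt} [/INST] "

# add_special_tokens=True is the default, so the tokenizer inserts <s> itself.
input_ids = tokenizer(text, return_tensors="pt").input_ids
print(tokenizer.decode(input_ids[0]))  # should start with "<s>[INST] ..."
```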
 
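For readers comparing against the removed Python section: the deleted snippet overwrote the repo id with `model_name_or_path = args.model_dir` even though no `args` object was defined, so it would not run as written. A cleaned-up reading of that same AutoGPTQ call is sketched below purely for reference; this commit removed the example rather than fixing it, and it does not establish that AutoGPTQ at the time could load the Mixtral architecture.

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_name_or_path = "TheBloke/Mixtral-8x7B-Instruct-v0.1-GPTQ"

# To use a different branch, pass e.g. revision="gptq-4bit-32g-actorder_True".
model = AutoGPTQForCausalLM.from_quantized(
    model_name_or_path,
    model_basename="model",
    use_safetensors=True,
    trust_remote_code=False,
    device="cuda:0",
    use_triton=False,
    disable_exllama=False,
    disable_exllamav2=True,
    quantize_config=None,
)

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

prompt = "Tell me about AI"
prompt_template = f"[INST] {prompt} [/INST] "  # updated template from this commit

input_ids = tokenizer(prompt_template, return_tensors="pt").input_ids.cuda()
output = model.generate(
    inputs=input_ids,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    top_k=40,
    max_new_tokens=512,
)
print(tokenizer.decode(output[0]))
```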