TheBloke committed
Commit 869942f
1 Parent(s): 5189b0c

Re-upload of GPTQ model due to issue with base model

Files changed (1)
  1. README.md +18 -15
README.md CHANGED
@@ -21,7 +21,7 @@ license: other
 
 These files are GPTQ 4bit model files for [Camel AI's CAMEL 13B Combined Data](https://huggingface.co/camel-ai/CAMEL-13B-Combined-Data).
 
- It is the result of quantising to 4bit using [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa).
+ It is the result of quantising to 4bit using [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ).
 
 ## Repositories available
 
@@ -36,10 +36,13 @@ Please make sure you're using the latest version of text-generation-webui
 1. Click the **Model tab**.
 2. Under **Download custom model or LoRA**, enter `TheBloke/CAMEL-13B-Combined-Data-GPTQ`.
 3. Click **Download**.
- 4. The model will start downloading, and once finished it will be automatically loaded.
- 5. If you want any custom settings, set them and then click **Save settings for this model** followed by **Reload the Model** in the top right.
+ 4. The model will start downloading. Once it's finished it will say "Done"
+ 5. In the top left, click the refresh icon next to **Model**.
+ 6. In the **Model** dropdown, choose the model you just downloaded: `CAMEL-13B-Combined-Data-GPTQ`
+ 7. The model will automatically load, and is now ready for use!
+ 8. If you want any custom settings, set them and then click **Save settings for this model** followed by **Reload the Model** in the top right.
   * Note that you do not need to set GPTQ parameters any more. These are set automatically from the file `quantize_config.json`.
- 6. Once you're ready, click the **Text Generation tab** and enter a prompt to get started!
+ 9. Once you're ready, click the **Text Generation tab** and enter a prompt to get started!
 
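Outside the web UI, the same repository can also be pulled down and its quantisation settings inspected directly with the `huggingface_hub` library. This is an optional alternative to the steps above, sketched here purely as an illustration; the expected config values are taken from the "Provided files" section later in this README:

```python
import json
from huggingface_hub import snapshot_download, hf_hub_download

repo_id = "TheBloke/CAMEL-13B-Combined-Data-GPTQ"

# Download the full repository into the local Hugging Face cache
local_path = snapshot_download(repo_id=repo_id)
print("Model files downloaded to:", local_path)

# Inspect the GPTQ settings that are applied automatically from quantize_config.json
cfg_file = hf_hub_download(repo_id=repo_id, filename="quantize_config.json")
with open(cfg_file) as f:
    print(json.load(f))  # should report 4-bit, group_size 128, desc_act False
```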
  ## How to use this GPTQ model from Python code
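The hunks below only touch fragments of this example script. Pieced together with the unchanged context, the updated example looks roughly as follows; the generation and pipeline parameters are typical values and are assumptions here, since they are not part of the diff:

```python
from transformers import AutoTokenizer, pipeline, logging
from auto_gptq import AutoGPTQForCausalLM

model_name_or_path = "TheBloke/CAMEL-13B-Combined-Data-GPTQ"
model_basename = "camel-13b-combined-GPTQ-4bit-128g.no-act.order"

use_triton = False

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

# Load the quantised model; GPTQ parameters are read from quantize_config.json
model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
        model_basename=model_basename,
        use_safetensors=True,
        trust_remote_code=False,
        device="cuda:0",
        use_triton=use_triton,
        quantize_config=None)

prompt = "Tell me about AI"
prompt_template = f'''### Human: {prompt}
### Assistant:'''

print("\n\n*** Generate:")
input_ids = tokenizer(prompt_template, return_tensors='pt').input_ids.cuda()
output = model.generate(inputs=input_ids, temperature=0.7, max_new_tokens=512)
print(tokenizer.decode(output[0]))

# Prevent printing spurious transformers error when using pipeline with AutoGPTQ
logging.set_verbosity(logging.CRITICAL)

print("*** Pipeline:")
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.95,
    repetition_penalty=1.15
)
print(pipe(prompt_template)[0]['generated_text'])
```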
 
@@ -55,7 +58,7 @@ from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
 import argparse
 
 model_name_or_path = "TheBloke/CAMEL-13B-Combined-Data-GPTQ"
- model_basename = "camel-30b-combined-GPTQ-4bit--1g.act.order"
+ model_basename = "camel-13b-combined-GPTQ-4bit-128g.no-act.order"
 
 use_triton = False
 
@@ -64,11 +67,15 @@ tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)
 model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
         model_basename=model_basename,
         use_safetensors=True,
-        trust_remote_code=True,
+        trust_remote_code=False,
         device="cuda:0",
         use_triton=use_triton,
         quantize_config=None)
 
+ prompt = "Tell me about AI"
+ prompt_template=f'''### Human: {prompt}
+ ### Assistant:'''
+ 
 print("\n\n*** Generate:")
 
 input_ids = tokenizer(prompt_template, return_tensors='pt').input_ids.cuda()
@@ -80,10 +87,6 @@ print(tokenizer.decode(output[0]))
 # Prevent printing spurious transformers error when using pipeline with AutoGPTQ
 logging.set_verbosity(logging.CRITICAL)
 
- prompt = "Tell me about AI"
- prompt_template=f'''### Human: {prompt}
- ### Assistant:'''
- 
 print("*** Pipeline:")
 pipe = pipeline(
     "text-generation",
@@ -100,17 +103,17 @@ print(pipe(prompt_template)[0]['generated_text'])
 
 ## Provided files
 
- **camel-30b-combined-GPTQ-4bit--1g.act.order.safetensors**
+ **camel-13b-combined-GPTQ-4bit-128g.no-act.order.safetensors**
 
 This will work with AutoGPTQ and CUDA versions of GPTQ-for-LLaMa. There are reports of issues with Triton mode of recent GPTQ-for-LLaMa. If you have issues, please use AutoGPTQ instead.
 
- It was created without group_size to lower VRAM requirements, and with --act-order (desc_act) to boost inference accuracy as much as possible.
+ It was created with group_size 128 to increase inference accuracy, but without --act-order (desc_act) to increase compatibility and improve inference speed.
 
- * `camel-30b-combined-GPTQ-4bit--1g.act.order.safetensors`
+ * `camel-13b-combined-GPTQ-4bit-128g.no-act.order.safetensors`
   * Works with AutoGPTQ in CUDA or Triton modes.
   * Works with GPTQ-for-LLaMa in CUDA mode. May have issues with GPTQ-for-LLaMa Triton mode.
   * Works with text-generation-webui, including one-click-installers.
-   * Parameters: Groupsize = -1. Act Order / desc_act = True.
+   * Parameters: Groupsize = 128. Act Order / desc_act = False.
 
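For reference, those parameters correspond to an AutoGPTQ quantisation config along the following lines. This is an illustrative sketch only; the calibration data and exact command used to produce the file are not documented in this README:

```python
from auto_gptq import BaseQuantizeConfig

# GPTQ settings matching the description above (illustrative, not the exact recipe used)
quantize_config = BaseQuantizeConfig(
    bits=4,          # 4-bit quantisation
    group_size=128,  # groupsize 128 for better accuracy
    desc_act=False   # act-order disabled for wider compatibility and faster inference
)
```

These are also the values that `quantize_config.json` exposes at load time, which is why no manual GPTQ parameters are needed in text-generation-webui.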
  <!-- footer start -->
  ## Discord
@@ -134,7 +137,7 @@ Donaters will get priority support on any and all AI/LLM/model questions and req
 
 **Special thanks to**: Luke from CarbonQuill, Aemon Algiz, Dmitriy Samsonov.
 
- **Patreon special mentions**: Ajan Kanaga, Kalila, Derek Yates, Sean Connelly, Luke, Nathan LeClaire, Trenton Dambrowitz, Mano Prime, David Flickinger, vamX, Nikolai Manek, senxiiz, Khalefa Al-Ahmad, Illia Dulskyi, trip7s trip, Jonathan Leane, Talal Aujan, Artur Olbinski, Cory Kujawski, Joseph William Delisle, Pyrater, Oscar Rangel, Lone Striker, Luke Pendergrass, Eugene Pentland, Johann-Peter Hartmann.
+ **Patreon special mentions**: Oscar Rangel, Eugene Pentland, Talal Aujan, Cory Kujawski, Luke, Asp the Wyvern, Ai Maven, Pyrater, Alps Aficionado, senxiiz, Willem Michiel, Junyu Yang, trip7s trip, Sebastain Graf, Joseph William Delisle, Lone Striker, Jonathan Leane, Johann-Peter Hartmann, David Flickinger, Spiking Neurons AB, Kevin Schuppel, Mano Prime, Dmitriy Samsonov, Sean Connelly, Nathan LeClaire, Alain Rossmann, Fen Risland, Derek Yates, Luke Pendergrass, Nikolai Manek, Khalefa Al-Ahmad, Artur Olbinski, John Detwiler, Ajan Kanaga, Imad Khwaja, Trenton Dambrowitz, Kalila, vamX, webtim, Illia Dulskyi.
 
 Thank you to all my generous patrons and donaters!
 
 