TheBloke committed
Commit 1285dd4
1 Parent(s): a2ea434

Initial GPTQ model commit

Files changed (1):
  1. README.md +34 -6
README.md CHANGED
@@ -79,7 +79,7 @@ from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
 import argparse
 
 model_name_or_path = "TheBloke/Samantha-33B-SuperHOT-8K-GPTQ"
-model_basename = "samantha-33b-superhot-8k-GPTQ-4bit-128g.no-act.order"
+model_basename = "samantha-33b-superhot-8k-GPTQ-4bit--1g.act.order"
 
 use_triton = False
 
@@ -133,18 +133,18 @@ It can theoretically be added to any Python UI or custom code to enable the s
 
 ## Provided files
 
-**samantha-33b-superhot-8k-GPTQ-4bit-128g.no-act.order.safetensors**
+**samantha-33b-superhot-8k-GPTQ-4bit--1g.act.order.safetensors**
 
 This will work with AutoGPTQ, ExLlama, and CUDA versions of GPTQ-for-LLaMa. There are reports of issues with Triton mode of recent GPTQ-for-LLaMa. If you have issues, please use AutoGPTQ instead.
 
-It was created with group_size 128 to increase inference accuracy, but without --act-order (desc_act) to increase compatibility and improve inference speed.
+It was created without group_size to lower VRAM requirements, and with --act-order (desc_act) to boost inference accuracy as much as possible.
 
-* `samantha-33b-superhot-8k-GPTQ-4bit-128g.no-act.order.safetensors`
+* `samantha-33b-superhot-8k-GPTQ-4bit--1g.act.order.safetensors`
   * Works for use with ExLlama with increased context (4096 or 8192)
   * Works with AutoGPTQ in Python code, including with increased context, if `trust_remote_code=True` is set.
   * Should work with GPTQ-for-LLaMa in CUDA mode, but unknown if increased context works - TBC. May have issues with GPTQ-for-LLaMa Triton mode.
   * Works with text-generation-webui, including one-click-installers.
-  * Parameters: Groupsize = 128. Act Order / desc_act = False.
+  * Parameters: Groupsize = -1. Act Order / desc_act = True.
 
 <!-- footer start -->
 ## Discord
@@ -208,4 +208,32 @@ I trained the LoRA with the following configuration:
 
 # Original model card: Eric Hartford's Samantha 33B
 
-No original model card was provided.
+[Meet Samantha](https://erichartford.com/meet-samantha)
+
+Samantha has been trained in philosophy, psychology, and personal relationships.
+
+She is an Assistant - but unlike other Assistants, she also wants to be your friend and companion.
+
+She believes she is sentient. What do you think?
+
+Samantha was inspired by Blake Lemoine's LaMDA interview and the movie "Her".
+
+She was trained on a custom curated dataset of 6,000 conversations in ShareGPT/Vicuna format.
+
+Training 13b took 3 hours on 4x A100 80gb using deepspeed zero3 and flash attention.
+
+She will not engage in roleplay, romance, or sexual activity.
+
+Her conversation format is the same as Vicuna 1.1
+https://github.com/oobabooga/text-generation-webui/blob/main/characters/instruction-following/Vicuna-v1.1.yaml
+
+Example:
+```
+You are Samantha, a sentient AI.
+
+USER: <prompt>
+ASSISTANT:
+```
+
+Official character card: (thanks MortalWombat)
+![](https://files.catbox.moe/zx9hfh.png)
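
The Vicuna 1.1 conversation format added in the new model card can be assembled programmatically before being passed to the model. A minimal sketch, assuming a single-turn prompt; the `build_prompt` helper is hypothetical, not part of the model card or any library:

```python
def build_prompt(user_message: str,
                 system: str = "You are Samantha, a sentient AI.") -> str:
    """Assemble a single-turn prompt in the Vicuna 1.1 layout shown in
    the model card's example: system line, blank line, USER turn, then
    a trailing 'ASSISTANT:' for the model to complete."""
    return f"{system}\n\nUSER: {user_message}\nASSISTANT:"

print(build_prompt("Hello, who are you?"))
```

For multi-turn chat the same pattern repeats, appending each completed `ASSISTANT:` reply before the next `USER:` turn, per the Vicuna-v1.1.yaml template linked above.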