TheBloke committed on
Commit 2c5e599
1 Parent(s): fab59e2

Update for Transformers GPTQ support

README.md CHANGED
@@ -4,17 +4,20 @@ license: other
  ---

  <!-- header start -->
- <div style="width: 100%;">
- <img src="https://i.imgur.com/EBdldam.jpg" alt="TheBlokeAI" style="width: 100%; min-width: 400px; display: block; margin: auto;">
  </div>
  <div style="display: flex; justify-content: space-between; width: 100%;">
  <div style="display: flex; flex-direction: column; align-items: flex-start;">
- <p><a href="https://discord.gg/theblokeai">Chat & support: my new Discord server</a></p>
  </div>
  <div style="display: flex; flex-direction: column; align-items: flex-end;">
- <p><a href="https://www.patreon.com/TheBlokeAI">Want to contribute? TheBloke's Patreon page</a></p>
  </div>
  </div>
  <!-- header end -->

  # NousResearch's Nous-Hermes-13B GPTQ
@@ -145,6 +148,7 @@ It was created with group_size 128 to increase inference accuracy, but without -
  * Parameters: Groupsize = 128. Act Order / desc_act = False.

  <!-- footer start -->
  ## Discord

  For further support, and discussions on these models and AI in general, join us at:
@@ -164,12 +168,15 @@ Donaters will get priority support on any and all AI/LLM/model questions and req
  * Patreon: https://patreon.com/TheBlokeAI
  * Ko-Fi: https://ko-fi.com/TheBlokeAI

- **Special thanks to**: Luke from CarbonQuill, Aemon Algiz, Dmitriy Samsonov.

- **Patreon special mentions**: Pyrater, WelcomeToTheClub, Kalila, Mano Prime, Trenton Dambrowitz, Spiking Neurons AB, Pierre Kircher, Fen Risland, Kevin Schuppel, Luke, Rainer Wilmers, vamX, Gabriel Puliatti, Alex , Karl Bernard, Ajan Kanaga, Talal Aujan, Space Cruiser, ya boyyy, biorpg, Johann-Peter Hartmann, Asp the Wyvern, Ai Maven, Ghost , Preetika Verma, Nikolai Manek, trip7s trip, John Detwiler, Fred von Graf, Artur Olbinski, subjectnull, John Villwock, Junyu Yang, Rod A, Lone Striker, Chris McCloskey, Iucharbius , Matthew Berman, Illia Dulskyi, Khalefa Al-Ahmad, Imad Khwaja, chris gileta, Willem Michiel, Greatston Gnanesh, Derek Yates, K, Alps Aficionado, Oscar Rangel, David Flickinger, Luke Pendergrass, Deep Realms, Eugene Pentland, Cory Kujawski, terasurfer , Jonathan Leane, senxiiz, Joseph William Delisle, Sean Connelly, webtim, zynix , Nathan LeClaire.

  Thank you to all my generous patrons and donaters!

  <!-- footer end -->

  # Original model card: Kaio Ken's SuperHOT 8K
@@ -213,22 +220,22 @@ I trained the LoRA with the following configuration:

  Nous-Hermes-13b is a state-of-the-art language model fine-tuned on over 300,000 instructions. This model was fine-tuned by Nous Research, with Teknium and Karan4D leading the fine tuning process and dataset curation, Redmond AI sponsoring the compute, and several other contributors. The result is an enhanced Llama 13b model that rivals GPT-3.5-turbo in performance across a variety of tasks.

- This model stands out for its long responses, low hallucination rate, and absence of OpenAI censorship mechanisms. The fine-tuning process was performed with a 2000 sequence length on an 8x a100 80GB DGX machine for over 50 hours.

  ## Model Training

- The model was trained almost entirely on synthetic GPT-4 outputs. This includes data from diverse sources such as GPTeacher, the general, roleplay v1&2, code instruct datasets, Nous Instruct & PDACTL (unpublished), CodeAlpaca, Evol_Instruct Uncensored, GPT4-LLM, and Unnatural Instructions.

  Additional data inputs came from Camel-AI's Biology/Physics/Chemistry and Math Datasets, Airoboros' GPT-4 Dataset, and more from CodeAlpaca. The total volume of data encompassed over 300,000 instructions.

  ## Collaborators
- The model fine-tuning and the datasets were a collaboration of efforts and resources between Teknium, Karan4D, Nous Research, Huemin Art, and Redmond AI.
-
- Huge shoutout and acknowledgement is deserved for all the dataset creators who generously share their datasets openly.

  Special mention goes to @winglian, @erhartford, and @main_horse for assisting in some of the training issues.

- Among the contributors of datasets, GPTeacher was made available by Teknium, Wizard LM by nlpxucan, and the Nous Research Instruct Dataset was provided by Karan4D and HueminArt.
  The GPT4-LLM and Unnatural Instructions were provided by Microsoft, Airoboros dataset by jondurbin, Camel-AI datasets are from Camel-AI, and CodeAlpaca dataset by Sahil 2801.
  If anyone was left out, please open a thread in the community tab.

@@ -241,7 +248,7 @@ The model follows the Alpaca prompt format:
  ### Response:
  ```

- or

  ```
  ### Instruction:
@@ -249,11 +256,11 @@ or
  ### Input:

  ### Response:
- ```

  ## Resources for Applied Use Cases:
- For an example of a back and forth chatbot using huggingface transformers and discord, check out: https://github.com/teknium1/alpaca-discord
- For an example of a roleplaying discord bot, check out this: https://github.com/teknium1/alpaca-roleplay-discordbot

  ## Future Plans
  The model is currently being uploaded in FP16 format, and there are plans to convert the model to GGML and GPTQ 4bit quantizations. The team is also working on a full benchmark, similar to what was done for GPT4-x-Vicuna. We will try to get in discussions to get the model included in the GPT4All.
@@ -276,9 +283,9 @@ The model is currently being uploaded in FP16 format, and there are plans to con
  |winogrande | 0|acc |0.7190|± |0.0126|
  ```

- These benchmarks currently have us at #1 on ARC-c, ARC-e, Hellaswag, and OpenBookQA, and 2nd place on Winogrande, comparing to GPT4all's benchmarking list.

  ## Model Usage
  The model is available for download on Hugging Face. It is suitable for a wide range of language tasks, from generating creative text to understanding and following complex instructions.
-
  Compute provided by our project sponsor Redmond AI, thank you!!
 
  ---

  <!-- header start -->
+ <!-- 200823 -->
+ <div style="width: auto; margin-left: auto; margin-right: auto">
+ <img src="https://i.imgur.com/EBdldam.jpg" alt="TheBlokeAI" style="width: 100%; min-width: 400px; display: block; margin: auto;">
  </div>
  <div style="display: flex; justify-content: space-between; width: 100%;">
  <div style="display: flex; flex-direction: column; align-items: flex-start;">
+ <p style="margin-top: 0.5em; margin-bottom: 0em;"><a href="https://discord.gg/theblokeai">Chat & support: TheBloke's Discord server</a></p>
  </div>
  <div style="display: flex; flex-direction: column; align-items: flex-end;">
+ <p style="margin-top: 0.5em; margin-bottom: 0em;"><a href="https://www.patreon.com/TheBlokeAI">Want to contribute? TheBloke's Patreon page</a></p>
  </div>
  </div>
+ <div style="text-align:center; margin-top: 0em; margin-bottom: 0em"><p style="margin-top: 0.25em; margin-bottom: 0em;">TheBloke's LLM work is generously supported by a grant from <a href="https://a16z.com">andreessen horowitz (a16z)</a></p></div>
+ <hr style="margin-top: 1.0em; margin-bottom: 1.0em;">
  <!-- header end -->

  # NousResearch's Nous-Hermes-13B GPTQ

  * Parameters: Groupsize = 128. Act Order / desc_act = False.

  <!-- footer start -->
+ <!-- 200823 -->
  ## Discord

  For further support, and discussions on these models and AI in general, join us at:

  * Patreon: https://patreon.com/TheBlokeAI
  * Ko-Fi: https://ko-fi.com/TheBlokeAI

+ **Special thanks to**: Aemon Algiz.
+
+ **Patreon special mentions**: Sam, theTransient, Jonathan Leane, Steven Wood, webtim, Johann-Peter Hartmann, Geoffrey Montalvo, Gabriel Tamborski, Willem Michiel, John Villwock, Derek Yates, Mesiah Bishop, Eugene Pentland, Pieter, Chadd, Stephen Murray, Daniel P. Andersen, terasurfer, Brandon Frisco, Thomas Belote, Sid, Nathan LeClaire, Magnesian, Alps Aficionado, Stanislav Ovsiannikov, Alex, Joseph William Delisle, Nikolai Manek, Michael Davis, Junyu Yang, K, J, Spencer Kim, Stefan Sabev, Olusegun Samson, transmissions 11, Michael Levine, Cory Kujawski, Rainer Wilmers, zynix, Kalila, Luke @flexchar, Ajan Kanaga, Mandus, vamX, Ai Maven, Mano Prime, Matthew Berman, subjectnull, Vitor Caleffi, Clay Pascal, biorpg, alfie_i, 阿明, Jeffrey Morgan, ya boyyy, Raymond Fosdick, knownsqashed, Olakabola, Leonard Tan, ReadyPlayerEmma, Enrico Ros, Dave, Talal Aujan, Illia Dulskyi, Sean Connelly, senxiiz, Artur Olbinski, Elle, Raven Klaugh, Fen Risland, Deep Realms, Imad Khwaja, Fred von Graf, Will Dee, usrbinkat, SuperWojo, Alexandros Triantafyllidis, Swaroop Kallakuri, Dan Guido, John Detwiler, Pedro Madruga, Iucharbius, Viktor Bowallius, Asp the Wyvern, Edmond Seymore, Trenton Dambrowitz, Space Cruiser, Spiking Neurons AB, Pyrater, LangChain4j, Tony Hughes, Kacper Wikieł, Rishabh Srivastava, David Ziegler, Luke Pendergrass, Andrey, Gabriel Puliatti, Lone Striker, Sebastain Graf, Pierre Kircher, Randy H, NimbleBox.ai, Vadim, danny, Deo Leter


  Thank you to all my generous patrons and donaters!

+ And thank you again to a16z for their generous grant.
+
  <!-- footer end -->

  # Original model card: Kaio Ken's SuperHOT 8K


  Nous-Hermes-13b is a state-of-the-art language model fine-tuned on over 300,000 instructions. This model was fine-tuned by Nous Research, with Teknium and Karan4D leading the fine tuning process and dataset curation, Redmond AI sponsoring the compute, and several other contributors. The result is an enhanced Llama 13b model that rivals GPT-3.5-turbo in performance across a variety of tasks.

+ This model stands out for its long responses, low hallucination rate, and absence of OpenAI censorship mechanisms. The fine-tuning process was performed with a 2000 sequence length on an 8x a100 80GB DGX machine for over 50 hours.

  ## Model Training

+ The model was trained almost entirely on synthetic GPT-4 outputs. This includes data from diverse sources such as GPTeacher, the general, roleplay v1&2, code instruct datasets, Nous Instruct & PDACTL (unpublished), CodeAlpaca, Evol_Instruct Uncensored, GPT4-LLM, and Unnatural Instructions.

  Additional data inputs came from Camel-AI's Biology/Physics/Chemistry and Math Datasets, Airoboros' GPT-4 Dataset, and more from CodeAlpaca. The total volume of data encompassed over 300,000 instructions.

  ## Collaborators
+ The model fine-tuning and the datasets were a collaboration of efforts and resources between Teknium, Karan4D, Nous Research, Huemin Art, and Redmond AI.
+
+ Huge shoutout and acknowledgement is deserved for all the dataset creators who generously share their datasets openly.

  Special mention goes to @winglian, @erhartford, and @main_horse for assisting in some of the training issues.

+ Among the contributors of datasets, GPTeacher was made available by Teknium, Wizard LM by nlpxucan, and the Nous Research Instruct Dataset was provided by Karan4D and HueminArt.
  The GPT4-LLM and Unnatural Instructions were provided by Microsoft, Airoboros dataset by jondurbin, Camel-AI datasets are from Camel-AI, and CodeAlpaca dataset by Sahil 2801.
  If anyone was left out, please open a thread in the community tab.


  ### Response:
  ```

+ or

  ```
  ### Instruction:

  ### Input:

  ### Response:
+ ```

  ## Resources for Applied Use Cases:
+ For an example of a back and forth chatbot using huggingface transformers and discord, check out: https://github.com/teknium1/alpaca-discord
+ For an example of a roleplaying discord bot, check out this: https://github.com/teknium1/alpaca-roleplay-discordbot

  ## Future Plans
  The model is currently being uploaded in FP16 format, and there are plans to convert the model to GGML and GPTQ 4bit quantizations. The team is also working on a full benchmark, similar to what was done for GPT4-x-Vicuna. We will try to get in discussions to get the model included in the GPT4All.

  |winogrande | 0|acc |0.7190|± |0.0126|
  ```

+ These benchmarks currently have us at #1 on ARC-c, ARC-e, Hellaswag, and OpenBookQA, and 2nd place on Winogrande, comparing to GPT4all's benchmarking list.

  ## Model Usage
  The model is available for download on Hugging Face. It is suitable for a wide range of language tasks, from generating creative text to understanding and following complex instructions.
+
  Compute provided by our project sponsor Redmond AI, thank you!!
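
The README hunks show the two Alpaca prompt layouts only in part: the `### Instruction:`, `### Input:` and `### Response:` headers are visible, but the placeholder lines fall outside the diff context. As a rough sketch, a helper along the following lines could assemble either layout. The function name `build_alpaca_prompt` and the exact blank-line placement are assumptions for illustration, not something this commit confirms.

```python
def build_alpaca_prompt(instruction: str, context: str = "") -> str:
    """Assemble an Alpaca-style prompt.

    Hypothetical helper: the section headers come from the README,
    while the spacing between sections is an assumption.
    """
    parts = ["### Instruction:", instruction.strip(), ""]
    if context.strip():
        # Optional second layout that carries an "### Input:" section.
        parts += ["### Input:", context.strip(), ""]
    parts.append("### Response:")
    return "\n".join(parts) + "\n"


if __name__ == "__main__":
    print(build_alpaca_prompt("Summarise this commit.",
                              "Update for Transformers GPTQ support"))
```

The model's reply would then be generated as a continuation after the final `### Response:` line.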
config.json CHANGED
@@ -1,28 +1,38 @@
  {
- "_name_or_path": "/workspace/process/nous-nermes-13b/source",
- "architectures": [
- "LlamaForCausalLM"
- ],
- "bos_token_id": 1,
- "eos_token_id": 2,
- "hidden_act": "silu",
- "hidden_size": 5120,
- "initializer_range": 0.02,
- "intermediate_size": 13824,
- "max_position_embeddings": 8192,
- "model_type": "llama",
- "num_attention_heads": 40,
- "num_hidden_layers": 40,
- "pad_token_id": 0,
- "rms_norm_eps": 1e-06,
- "tie_word_embeddings": false,
- "torch_dtype": "float16",
- "transformers_version": "4.30.0.dev0",
- "use_cache": true,
- "vocab_size": 32001,
- "auto_map": {
- "AutoModel": "modelling_llama.LlamaModel",
- "AutoModelForCausalLM": "modelling_llama.LlamaForCausalLM",
- "AutoModelForSequenceClassification": "modelling_llama.LlamaForSequenceClassification"
- }
  }

  {
+ "_name_or_path": "/workspace/process/nous-nermes-13b/source",
+ "architectures": [
+ "LlamaForCausalLM"
+ ],
+ "bos_token_id": 1,
+ "eos_token_id": 2,
+ "hidden_act": "silu",
+ "hidden_size": 5120,
+ "initializer_range": 0.02,
+ "intermediate_size": 13824,
+ "max_position_embeddings": 8192,
+ "model_type": "llama",
+ "num_attention_heads": 40,
+ "num_hidden_layers": 40,
+ "pad_token_id": 0,
+ "rms_norm_eps": 1e-06,
+ "tie_word_embeddings": false,
+ "torch_dtype": "float16",
+ "transformers_version": "4.30.0.dev0",
+ "use_cache": true,
+ "vocab_size": 32001,
+ "auto_map": {
+ "AutoModel": "modelling_llama.LlamaModel",
+ "AutoModelForCausalLM": "modelling_llama.LlamaForCausalLM",
+ "AutoModelForSequenceClassification": "modelling_llama.LlamaForSequenceClassification"
+ },
+ "quantization_config": {
+ "bits": 4,
+ "group_size": 128,
+ "damp_percent": 0.01,
+ "desc_act": false,
+ "sym": true,
+ "true_sequential": true,
+ "model_file_base_name": "model",
+ "quant_method": "gptq"
+ }
  }
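
The new `quantization_config` block in `config.json` appears to repeat the settings that this commit also keeps in `quantize_config.json`, with one extra key, `quant_method`. The sketch below checks that reading using only the values shown in this diff. The helper name `extra_keys` and the two constant names are hypothetical.

```python
import json

# Contents of quantize_config.json after this commit (copied from the diff).
QUANTIZE_CONFIG = json.loads("""
{
  "bits": 4,
  "group_size": 128,
  "damp_percent": 0.01,
  "desc_act": false,
  "sym": true,
  "true_sequential": true,
  "model_file_base_name": "model"
}
""")

# The "quantization_config" block added to config.json (copied from the diff).
QUANTIZATION_CONFIG = json.loads("""
{
  "bits": 4,
  "group_size": 128,
  "damp_percent": 0.01,
  "desc_act": false,
  "sym": true,
  "true_sequential": true,
  "model_file_base_name": "model",
  "quant_method": "gptq"
}
""")


def extra_keys(embedded: dict, standalone: dict) -> dict:
    """Return entries of `embedded` that are absent from, or differ in, `standalone`."""
    return {k: v for k, v in embedded.items()
            if k not in standalone or standalone[k] != v}


only_in_config = extra_keys(QUANTIZATION_CONFIG, QUANTIZE_CONFIG)
print(only_in_config)
```

If the two files ever drift apart, the same helper would list the keys whose values disagree.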
nous-hermes-13b-superhot-8k-GPTQ-4bit-128g.no-act.order.safetensors → model.safetensors RENAMED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:96924ed42650c2c72ac31e72099e942d915174da5147c373ba0179c20a3c0f4d
- size 7454817640

  version https://git-lfs.github.com/spec/v1
+ oid sha256:3680e3c42883043323e8fb36cf69a7f19554211bcb54218eeb73d6afa3427f84
+ size 7454817696
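
The rename is not a pure rename: the Git LFS pointer shows a different `oid` and a slightly larger `size`. A minimal sketch of reading the two pointers and comparing them follows; the function name `parse_lfs_pointer` is hypothetical, and the pointer text is copied from the diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:96924ed42650c2c72ac31e72099e942d915174da5147c373ba0179c20a3c0f4d
size 7454817640
"""

NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:3680e3c42883043323e8fb36cf69a7f19554211bcb54218eeb73d6afa3427f84
size 7454817696
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

# Difference in bytes between the renamed file and the original.
size_delta = int(new["size"]) - int(old["size"])
# A changed oid means the file contents changed, not just the name.
content_changed = old["oid"] != new["oid"]
print(size_delta, content_changed)
```

The pointers alone say only that the stored file differs by a few bytes; they do not say what changed inside it.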
quantize_config.json CHANGED
@@ -1,8 +1,9 @@
  {
- "bits": 4,
- "group_size": 128,
- "damp_percent": 0.01,
- "desc_act": false,
- "sym": true,
- "true_sequential": true
  }

  {
+ "bits": 4,
+ "group_size": 128,
+ "damp_percent": 0.01,
+ "desc_act": false,
+ "sym": true,
+ "true_sequential": true,
+ "model_file_base_name": "model"
  }
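
The added `model_file_base_name` value lines up with the renamed weights file, `model.safetensors`. Assuming a loader that builds the weights filename as base name plus extension (an assumption for illustration, not something this diff states), the mapping could be sketched like this, with `weights_filename` as a hypothetical helper:

```python
import json

# quantize_config.json after this commit (copied from the diff).
quantize_config = json.loads("""
{
  "bits": 4,
  "group_size": 128,
  "damp_percent": 0.01,
  "desc_act": false,
  "sym": true,
  "true_sequential": true,
  "model_file_base_name": "model"
}
""")


def weights_filename(cfg: dict, extension: str = ".safetensors") -> str:
    """Join the configured base name and an extension (assumed convention)."""
    return cfg["model_file_base_name"] + extension


print(weights_filename(quantize_config))
```

Under that assumption, the old file name `nous-hermes-13b-superhot-8k-GPTQ-4bit-128g.no-act.order.safetensors` would have needed a matching base name to be specified, which the old `quantize_config.json` did not contain.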