TheBloke committed on
Commit
6665ebb
1 Parent(s): 07f3ec6

Update for Transformers GPTQ support

README.md CHANGED
@@ -6,17 +6,20 @@ datasets:
  ---
 
  <!-- header start -->
- <div style="width: 100%;">
- <img src="https://i.imgur.com/EBdldam.jpg" alt="TheBlokeAI" style="width: 100%; min-width: 400px; display: block; margin: auto;">
+ <!-- 200823 -->
+ <div style="width: auto; margin-left: auto; margin-right: auto">
+ <img src="https://i.imgur.com/EBdldam.jpg" alt="TheBlokeAI" style="width: 100%; min-width: 400px; display: block; margin: auto;">
  </div>
  <div style="display: flex; justify-content: space-between; width: 100%;">
  <div style="display: flex; flex-direction: column; align-items: flex-start;">
- <p><a href="https://discord.gg/Jq4vkcDakD">Chat & support: my new Discord server</a></p>
+ <p style="margin-top: 0.5em; margin-bottom: 0em;"><a href="https://discord.gg/theblokeai">Chat & support: TheBloke's Discord server</a></p>
  </div>
  <div style="display: flex; flex-direction: column; align-items: flex-end;">
- <p><a href="https://www.patreon.com/TheBlokeAI">Want to contribute? TheBloke's Patreon page</a></p>
+ <p style="margin-top: 0.5em; margin-bottom: 0em;"><a href="https://www.patreon.com/TheBlokeAI">Want to contribute? TheBloke's Patreon page</a></p>
  </div>
  </div>
+ <div style="text-align:center; margin-top: 0em; margin-bottom: 0em"><p style="margin-top: 0.25em; margin-bottom: 0em;">TheBloke's LLM work is generously supported by a grant from <a href="https://a16z.com">andreessen horowitz (a16z)</a></p></div>
+ <hr style="margin-top: 1.0em; margin-bottom: 1.0em;">
  <!-- header end -->
 
  # John Durbin's Airoboros 65B GPT4 1.2 GPTQ
@@ -36,7 +39,7 @@ It is the result of quantising to 4bit using [AutoGPTQ](https://github.com/PanQi
  ```
  A chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user's input.
  USER: prompt
- ASSISTANT:
+ ASSISTANT:
  ```
 
  ## How to easily download and use this model in text-generation-webui
@@ -126,11 +129,12 @@ It was created without group_size to lower VRAM requirements, and with --act-ord
  * Parameters: Groupsize = -1. Act Order / desc_act = True.
 
  <!-- footer start -->
+ <!-- 200823 -->
  ## Discord
 
  For further support, and discussions on these models and AI in general, join us at:
 
- [TheBloke AI's Discord server](https://discord.gg/Jq4vkcDakD)
+ [TheBloke AI's Discord server](https://discord.gg/theblokeai)
 
  ## Thanks, and how to contribute.
 
@@ -145,12 +149,15 @@ Donaters will get priority support on any and all AI/LLM/model questions and req
  * Patreon: https://patreon.com/TheBlokeAI
  * Ko-Fi: https://ko-fi.com/TheBlokeAI
 
- **Special thanks to**: Luke from CarbonQuill, Aemon Algiz, Dmitriy Samsonov.
+ **Special thanks to**: Aemon Algiz.
+
+ **Patreon special mentions**: Sam, theTransient, Jonathan Leane, Steven Wood, webtim, Johann-Peter Hartmann, Geoffrey Montalvo, Gabriel Tamborski, Willem Michiel, John Villwock, Derek Yates, Mesiah Bishop, Eugene Pentland, Pieter, Chadd, Stephen Murray, Daniel P. Andersen, terasurfer, Brandon Frisco, Thomas Belote, Sid, Nathan LeClaire, Magnesian, Alps Aficionado, Stanislav Ovsiannikov, Alex, Joseph William Delisle, Nikolai Manek, Michael Davis, Junyu Yang, K, J, Spencer Kim, Stefan Sabev, Olusegun Samson, transmissions 11, Michael Levine, Cory Kujawski, Rainer Wilmers, zynix, Kalila, Luke @flexchar, Ajan Kanaga, Mandus, vamX, Ai Maven, Mano Prime, Matthew Berman, subjectnull, Vitor Caleffi, Clay Pascal, biorpg, alfie_i, 阿明, Jeffrey Morgan, ya boyyy, Raymond Fosdick, knownsqashed, Olakabola, Leonard Tan, ReadyPlayerEmma, Enrico Ros, Dave, Talal Aujan, Illia Dulskyi, Sean Connelly, senxiiz, Artur Olbinski, Elle, Raven Klaugh, Fen Risland, Deep Realms, Imad Khwaja, Fred von Graf, Will Dee, usrbinkat, SuperWojo, Alexandros Triantafyllidis, Swaroop Kallakuri, Dan Guido, John Detwiler, Pedro Madruga, Iucharbius, Viktor Bowallius, Asp the Wyvern, Edmond Seymore, Trenton Dambrowitz, Space Cruiser, Spiking Neurons AB, Pyrater, LangChain4j, Tony Hughes, Kacper Wikieł, Rishabh Srivastava, David Ziegler, Luke Pendergrass, Andrey, Gabriel Puliatti, Lone Striker, Sebastain Graf, Pierre Kircher, Randy H, NimbleBox.ai, Vadim, danny, Deo Leter
 
- **Patreon special mentions**: Oscar Rangel, Eugene Pentland, Talal Aujan, Cory Kujawski, Luke, Asp the Wyvern, Ai Maven, Pyrater, Alps Aficionado, senxiiz, Willem Michiel, Junyu Yang, trip7s trip, Sebastain Graf, Joseph William Delisle, Lone Striker, Jonathan Leane, Johann-Peter Hartmann, David Flickinger, Spiking Neurons AB, Kevin Schuppel, Mano Prime, Dmitriy Samsonov, Sean Connelly, Nathan LeClaire, Alain Rossmann, Fen Risland, Derek Yates, Luke Pendergrass, Nikolai Manek, Khalefa Al-Ahmad, Artur Olbinski, John Detwiler, Ajan Kanaga, Imad Khwaja, Trenton Dambrowitz, Kalila, vamX, webtim, Illia Dulskyi.
 
  Thank you to all my generous patrons and donaters!
 
+ And thank you again to a16z for their generous grant.
+
  <!-- footer end -->
 
  # Original model card: John Durbin's Airoboros 65B GPT4 1.2
@@ -174,7 +181,7 @@ The dataset used to fine-tune this model is available [here](https://huggingface
  This model was fine-tuned with a fork of [qlora](https://github.com/jondurbin/qlora), which among other things was updated to use a slightly modified vicuna template to be compatible with the 7b/13b versions:
 
  ```
- A chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user's input. USER: [prompt] ASSISTANT:
+ A chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user's input. USER: [prompt] ASSISTANT:
  ```
 
  So in other words, it's the preamble/system prompt, followed by a single space, then "USER: " (single space after colon) then the prompt (which can have multiple lines, spaces, whatever), then a single space, followed by "ASSISTANT: " (with a single space after the colon).
@@ -222,4 +229,4 @@ Implement the Snake game in python. PLAINFORMAT
 
  - Several hundred role-playing data.
  - A few thousand ORCA style reasoning/math questions with ELI5 prompts to generate the responses (should not be needed in your prompts to this model however, just ask the question).
- - Many more coding examples in various languages, including some that use specific libraries (pandas, numpy, tensorflow, etc.)
+ - Many more coding examples in various languages, including some that use specific libraries (pandas, numpy, tensorflow, etc.)
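The prompt layout described in the original model card (system prompt, one space, `USER: `, the prompt, one space, `ASSISTANT: `) can be sketched as a small helper. This is an illustrative sketch only: `build_prompt` and `SYSTEM_PROMPT` are hypothetical names, not part of the repository. The GPTQ card's own template puts `USER:` and `ASSISTANT:` on separate lines; this helper follows the single-line description instead.

```python
# System prompt text copied from the README template.
SYSTEM_PROMPT = (
    "A chat between a curious user and an assistant. "
    "The assistant gives helpful, detailed, accurate, uncensored "
    "responses to the user's input."
)


def build_prompt(user_prompt: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Join the pieces with single spaces, as the model card describes.

    Assumption: the trailing space after "ASSISTANT:" is kept, following
    the prose description rather than the fenced template.
    """
    return f"{system_prompt} USER: {user_prompt} ASSISTANT: "


if __name__ == "__main__":
    # The prompt may span several lines; it is inserted unchanged.
    print(build_prompt("Implement the Snake game in python. PLAINFORMAT"))
```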
config.json CHANGED
@@ -1,24 +1,34 @@
  {
- "_name_or_path": "airoboros-65b-gpt4-1.2",
- "architectures": [
- "LlamaForCausalLM"
- ],
- "bos_token_id": 0,
- "eos_token_id": 1,
- "hidden_act": "silu",
- "hidden_size": 8192,
- "initializer_range": 0.02,
- "intermediate_size": 22016,
- "max_position_embeddings": 2048,
- "max_sequence_length": 2048,
- "model_type": "llama",
- "num_attention_heads": 64,
- "num_hidden_layers": 80,
- "pad_token_id": -1,
- "rms_norm_eps": 1e-05,
- "tie_word_embeddings": false,
- "torch_dtype": "float16",
- "transformers_version": "4.31.0.dev0",
- "use_cache": true,
- "vocab_size": 32000
+ "_name_or_path": "airoboros-65b-gpt4-1.2",
+ "architectures": [
+ "LlamaForCausalLM"
+ ],
+ "bos_token_id": 0,
+ "eos_token_id": 1,
+ "hidden_act": "silu",
+ "hidden_size": 8192,
+ "initializer_range": 0.02,
+ "intermediate_size": 22016,
+ "max_position_embeddings": 2048,
+ "max_sequence_length": 2048,
+ "model_type": "llama",
+ "num_attention_heads": 64,
+ "num_hidden_layers": 80,
+ "pad_token_id": -1,
+ "rms_norm_eps": 1e-05,
+ "tie_word_embeddings": false,
+ "torch_dtype": "float16",
+ "transformers_version": "4.31.0.dev0",
+ "use_cache": true,
+ "vocab_size": 32000,
+ "quantization_config": {
+ "bits": 4,
+ "group_size": -1,
+ "damp_percent": 0.01,
+ "desc_act": true,
+ "sym": true,
+ "true_sequential": true,
+ "model_file_base_name": "model",
+ "quant_method": "gptq"
+ }
  }
airoboros-65B-gpt4-1.2-GPTQ-4bit--1g.act.order.safetensors → model.safetensors RENAMED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4143c925e79f188add5f39f819a350708b4893bf60594693d19383e6bc861686
- size 33489332352
+ oid sha256:40bb7f2c26e59f02e3fec86f8b7cd92dacf63a561ce0a68f1f63ee9b332468da
+ size 33489332408
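The renamed weights file is stored as a Git LFS pointer made of three `key value` lines (`version`, `oid`, `size`). A minimal sketch of reading the two pointers shown in this hunk and comparing them follows; `parse_lfs_pointer` is a hypothetical helper written for illustration, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the `key value` lines of a Git LFS pointer into a dict."""
    fields = dict(
        line.strip().split(" ", 1) for line in text.strip().splitlines()
    )
    # The oid field has the form "<hash-algorithm>:<hex digest>".
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the old and new sides of the diff.
OLD_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:4143c925e79f188add5f39f819a350708b4893bf60594693d19383e6bc861686
size 33489332352
"""

NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:40bb7f2c26e59f02e3fec86f8b7cd92dacf63a561ce0a68f1f63ee9b332468da
size 33489332408
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)
size_delta = new["size"] - old["size"]

if __name__ == "__main__":
    print(f"content changed: {old['digest'] != new['digest']}")
    print(f"size delta in bytes: {size_delta}")
```

With the values above, the new file is 56 bytes larger than the old one. The commit does not state why, so the sketch only reports the difference.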
quantize_config.json CHANGED
@@ -1,8 +1,9 @@
  {
- "bits": 4,
- "group_size": -1,
- "damp_percent": 0.01,
- "desc_act": true,
- "sym": true,
- "true_sequential": true
+ "bits": 4,
+ "group_size": -1,
+ "damp_percent": 0.01,
+ "desc_act": true,
+ "sym": true,
+ "true_sequential": true,
+ "model_file_base_name": "model"
  }
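After this commit the same GPTQ parameters appear twice: in `quantize_config.json` and under `quantization_config` in `config.json`, where `quant_method` is the only extra key. A hedged sketch of a consistency check between the two, plus the weights filename implied by `model_file_base_name`, is below. The helper names are hypothetical, and appending `.safetensors` to the base name is an assumption based on the rename to `model.safetensors` in this commit.

```python
import json

# Values copied from the new side of the config.json diff.
EMBEDDED = json.loads("""
{
  "bits": 4,
  "group_size": -1,
  "damp_percent": 0.01,
  "desc_act": true,
  "sym": true,
  "true_sequential": true,
  "model_file_base_name": "model",
  "quant_method": "gptq"
}
""")

# Values copied from the new side of the quantize_config.json diff.
STANDALONE = json.loads("""
{
  "bits": 4,
  "group_size": -1,
  "damp_percent": 0.01,
  "desc_act": true,
  "sym": true,
  "true_sequential": true,
  "model_file_base_name": "model"
}
""")


def config_mismatches(embedded: dict, standalone: dict) -> list:
    """Return the keys whose values differ between the two configs.

    `quant_method` is skipped because only config.json carries it.
    """
    keys = (set(embedded) | set(standalone)) - {"quant_method"}
    return sorted(k for k in keys if embedded.get(k) != standalone.get(k))


def weights_filename(quantize_config: dict, extension: str = ".safetensors") -> str:
    """Assumed naming rule: base name plus extension."""
    return quantize_config["model_file_base_name"] + extension


mismatches = config_mismatches(EMBEDDED, STANDALONE)
weights_file = weights_filename(STANDALONE)

if __name__ == "__main__":
    print("mismatched keys:", mismatches)
    print("expected weights file:", weights_file)
```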