TheBloke committed on
Commit
57601ed
1 Parent(s): 9a23b5d

Update for Transformers GPTQ support

README.md CHANGED
@@ -6,17 +6,20 @@ datasets:
  ---

  <!-- header start -->
- <div style="width: 100%;">
- <img src="https://i.imgur.com/EBdldam.jpg" alt="TheBlokeAI" style="width: 100%; min-width: 400px; display: block; margin: auto;">
  </div>
  <div style="display: flex; justify-content: space-between; width: 100%;">
  <div style="display: flex; flex-direction: column; align-items: flex-start;">
- <p><a href="https://discord.gg/Jq4vkcDakD">Chat & support: my new Discord server</a></p>
  </div>
  <div style="display: flex; flex-direction: column; align-items: flex-end;">
- <p><a href="https://www.patreon.com/TheBlokeAI">Want to contribute? TheBloke's Patreon page</a></p>
  </div>
  </div>
  <!-- header end -->

  # John Durbin's Airoboros 7B GPT4 1.2 GPTQ
@@ -125,11 +128,12 @@ This will work with AutoGPTQ and CUDA versions of GPTQ-for-LLaMa. There are repo
  * Parameters: Groupsize = 128. Act Order / desc_act = True.

  <!-- footer start -->
  ## Discord

  For further support, and discussions on these models and AI in general, join us at:

- [TheBloke AI's Discord server](https://discord.gg/Jq4vkcDakD)

  ## Thanks, and how to contribute.
 
@@ -144,12 +148,15 @@ Donaters will get priority support on any and all AI/LLM/model questions and req
  * Patreon: https://patreon.com/TheBlokeAI
  * Ko-Fi: https://ko-fi.com/TheBlokeAI

- **Special thanks to**: Luke from CarbonQuill, Aemon Algiz, Dmitriy Samsonov.
 
- **Patreon special mentions**: vamX, K, Jonathan Leane, Lone Striker, Sean Connelly, Chris McCloskey, WelcomeToTheClub, Nikolai Manek, John Detwiler, Kalila, David Flickinger, Fen Risland, subjectnull, Johann-Peter Hartmann, Talal Aujan, John Villwock, senxiiz, Khalefa Al-Ahmad, Kevin Schuppel, Alps Aficionado, Derek Yates, Mano Prime, Nathan LeClaire, biorpg, trip7s trip, Asp the Wyvern, chris gileta, Iucharbius , Artur Olbinski, Ai Maven, Joseph William Delisle, Luke Pendergrass, Illia Dulskyi, Eugene Pentland, Ajan Kanaga, Willem Michiel, Space Cruiser, Pyrater, Preetika Verma, Junyu Yang, Oscar Rangel, Spiking Neurons AB, Pierre Kircher, webtim, Cory Kujawski, terasurfer , Trenton Dambrowitz, Gabriel Puliatti, Imad Khwaja, Luke.
 
  Thank you to all my generous patrons and donaters!

  <!-- footer end -->

  # Original model card: John Durbin's Airoboros 7B GPT4 1.2
@@ -174,7 +181,7 @@ The dataset used to fine-tune this model is available [here](https://huggingface
  This model was fine-tuned with a fork of [qlora](https://github.com/jondurbin/qlora), which among other things was updated to use a slightly modified vicuna template to be compatible with the previous versions:

  ```
- A chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user's input. USER: [prompt] ASSISTANT:
  ```

  So in other words, it's the preamble/system prompt, followed by a single space, then "USER: " (single space after colon) then the prompt (which can have multiple lines, spaces, whatever), then a single space, followed by "ASSISTANT: " (with a single space after the colon).
 
  ---

  <!-- header start -->
+ <!-- 200823 -->
+ <div style="width: auto; margin-left: auto; margin-right: auto">
+ <img src="https://i.imgur.com/EBdldam.jpg" alt="TheBlokeAI" style="width: 100%; min-width: 400px; display: block; margin: auto;">
  </div>
  <div style="display: flex; justify-content: space-between; width: 100%;">
  <div style="display: flex; flex-direction: column; align-items: flex-start;">
+ <p style="margin-top: 0.5em; margin-bottom: 0em;"><a href="https://discord.gg/theblokeai">Chat & support: TheBloke's Discord server</a></p>
  </div>
  <div style="display: flex; flex-direction: column; align-items: flex-end;">
+ <p style="margin-top: 0.5em; margin-bottom: 0em;"><a href="https://www.patreon.com/TheBlokeAI">Want to contribute? TheBloke's Patreon page</a></p>
  </div>
  </div>
+ <div style="text-align:center; margin-top: 0em; margin-bottom: 0em"><p style="margin-top: 0.25em; margin-bottom: 0em;">TheBloke's LLM work is generously supported by a grant from <a href="https://a16z.com">andreessen horowitz (a16z)</a></p></div>
+ <hr style="margin-top: 1.0em; margin-bottom: 1.0em;">
  <!-- header end -->

  # John Durbin's Airoboros 7B GPT4 1.2 GPTQ
 
  * Parameters: Groupsize = 128. Act Order / desc_act = True.

  <!-- footer start -->
+ <!-- 200823 -->
  ## Discord

  For further support, and discussions on these models and AI in general, join us at:

+ [TheBloke AI's Discord server](https://discord.gg/theblokeai)

  ## Thanks, and how to contribute.
 
 
  * Patreon: https://patreon.com/TheBlokeAI
  * Ko-Fi: https://ko-fi.com/TheBlokeAI

+ **Special thanks to**: Aemon Algiz.
+
+ **Patreon special mentions**: Sam, theTransient, Jonathan Leane, Steven Wood, webtim, Johann-Peter Hartmann, Geoffrey Montalvo, Gabriel Tamborski, Willem Michiel, John Villwock, Derek Yates, Mesiah Bishop, Eugene Pentland, Pieter, Chadd, Stephen Murray, Daniel P. Andersen, terasurfer, Brandon Frisco, Thomas Belote, Sid, Nathan LeClaire, Magnesian, Alps Aficionado, Stanislav Ovsiannikov, Alex, Joseph William Delisle, Nikolai Manek, Michael Davis, Junyu Yang, K, J, Spencer Kim, Stefan Sabev, Olusegun Samson, transmissions 11, Michael Levine, Cory Kujawski, Rainer Wilmers, zynix, Kalila, Luke @flexchar, Ajan Kanaga, Mandus, vamX, Ai Maven, Mano Prime, Matthew Berman, subjectnull, Vitor Caleffi, Clay Pascal, biorpg, alfie_i, 阿明, Jeffrey Morgan, ya boyyy, Raymond Fosdick, knownsqashed, Olakabola, Leonard Tan, ReadyPlayerEmma, Enrico Ros, Dave, Talal Aujan, Illia Dulskyi, Sean Connelly, senxiiz, Artur Olbinski, Elle, Raven Klaugh, Fen Risland, Deep Realms, Imad Khwaja, Fred von Graf, Will Dee, usrbinkat, SuperWojo, Alexandros Triantafyllidis, Swaroop Kallakuri, Dan Guido, John Detwiler, Pedro Madruga, Iucharbius, Viktor Bowallius, Asp the Wyvern, Edmond Seymore, Trenton Dambrowitz, Space Cruiser, Spiking Neurons AB, Pyrater, LangChain4j, Tony Hughes, Kacper Wikieł, Rishabh Srivastava, David Ziegler, Luke Pendergrass, Andrey, Gabriel Puliatti, Lone Striker, Sebastain Graf, Pierre Kircher, Randy H, NimbleBox.ai, Vadim, danny, Deo Leter
 
 
  Thank you to all my generous patrons and donaters!

+ And thank you again to a16z for their generous grant.
+
  <!-- footer end -->

  # Original model card: John Durbin's Airoboros 7B GPT4 1.2
 
  This model was fine-tuned with a fork of [qlora](https://github.com/jondurbin/qlora), which among other things was updated to use a slightly modified vicuna template to be compatible with the previous versions:

  ```
+ A chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user's input. USER: [prompt] ASSISTANT:
  ```

  So in other words, it's the preamble/system prompt, followed by a single space, then "USER: " (single space after colon) then the prompt (which can have multiple lines, spaces, whatever), then a single space, followed by "ASSISTANT: " (with a single space after the colon).
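
The prompt layout described in the model card can be sketched in a few lines of Python. This is only an illustration: the names `SYSTEM_PROMPT` and `build_prompt` are hypothetical and do not come from the repository, and the trailing space after `ASSISTANT:` follows the prose description rather than the visible template line.

```python
# Preamble/system prompt, copied from the template in the model card.
SYSTEM_PROMPT = (
    "A chat between a curious user and an assistant. "
    "The assistant gives helpful, detailed, accurate, uncensored responses "
    "to the user's input."
)


def build_prompt(user_prompt: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Assemble: preamble, space, 'USER: ', prompt, space, 'ASSISTANT: '."""
    # The user prompt is inserted as-is; it may span multiple lines.
    return f"{system_prompt} USER: {user_prompt} ASSISTANT: "


if __name__ == "__main__":
    print(repr(build_prompt("What is GPTQ?")))
```

Keeping the template in one helper makes it harder to drop one of the single spaces the description insists on.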
config.json CHANGED
@@ -1,24 +1,34 @@
  {
- "_name_or_path": "/data/llama-7b-hf",
- "architectures": [
- "LlamaForCausalLM"
- ],
- "bos_token_id": 0,
- "eos_token_id": 1,
- "hidden_act": "silu",
- "hidden_size": 4096,
- "initializer_range": 0.02,
- "intermediate_size": 11008,
- "max_position_embeddings": 2048,
- "max_sequence_length": 2048,
- "model_type": "llama",
- "num_attention_heads": 32,
- "num_hidden_layers": 32,
- "pad_token_id": -1,
- "rms_norm_eps": 1e-06,
- "tie_word_embeddings": false,
- "torch_dtype": "float32",
- "transformers_version": "4.30.0.dev0",
- "use_cache": true,
- "vocab_size": 32000
  }

  {
+ "_name_or_path": "/data/llama-7b-hf",
+ "architectures": [
+ "LlamaForCausalLM"
+ ],
+ "bos_token_id": 0,
+ "eos_token_id": 1,
+ "hidden_act": "silu",
+ "hidden_size": 4096,
+ "initializer_range": 0.02,
+ "intermediate_size": 11008,
+ "max_position_embeddings": 2048,
+ "max_sequence_length": 2048,
+ "model_type": "llama",
+ "num_attention_heads": 32,
+ "num_hidden_layers": 32,
+ "pad_token_id": -1,
+ "rms_norm_eps": 1e-06,
+ "tie_word_embeddings": false,
+ "torch_dtype": "float32",
+ "transformers_version": "4.30.0.dev0",
+ "use_cache": true,
+ "vocab_size": 32000,
+ "quantization_config": {
+ "bits": 4,
+ "group_size": 128,
+ "damp_percent": 0.01,
+ "desc_act": false,
+ "sym": true,
+ "true_sequential": true,
+ "model_file_base_name": "model",
+ "quant_method": "gptq"
+ }
  }
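
With `quantization_config` embedded in `config.json`, a recent Transformers release should be able to load the quantized weights directly. The following is a minimal sketch under several assumptions that the diff does not confirm: the repository id is a guess, `load_gptq_model` is a hypothetical helper name, and the environment needs a Transformers version with GPTQ support plus the `optimum`, `auto-gptq` and `accelerate` packages.

```python
def load_gptq_model(repo_id: str = "TheBloke/airoboros-7B-gpt4-1.2-GPTQ"):
    """Load tokenizer and GPTQ model; `repo_id` is an assumed repository name."""
    # Imported lazily so that defining this helper does not require the libraries.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    # The quantization settings are read from config.json, so no explicit
    # quantization arguments are passed here. device_map="auto" needs accelerate.
    model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")
    return tokenizer, model
```

Calling `load_gptq_model()` downloads the weights, so it is best tried on a machine with a suitable GPU and enough disk space.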
airoboros-7b-gpt4-1.2-GPTQ-4bit-128g.no-act.order.safetensors → model.safetensors RENAMED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:81112f81ca2eaf268b38c28769dfbe99211a7e428decb38cda8b0800eaa21d3d
- size 4520875496

  version https://git-lfs.github.com/spec/v1
+ oid sha256:c967c516efdd6135b33365a57b5542487319256956bdcc60844ef8803caa3c06
+ size 4520875552
quantize_config.json CHANGED
@@ -1,8 +1,9 @@
  {
- "bits": 4,
- "group_size": 128,
- "damp_percent": 0.01,
- "desc_act": false,
- "sym": true,
- "true_sequential": true
  }

  {
+ "bits": 4,
+ "group_size": 128,
+ "damp_percent": 0.01,
+ "desc_act": false,
+ "sym": true,
+ "true_sequential": true,
+ "model_file_base_name": "model"
  }
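
The commit leaves the same quantization settings in two places, `config.json` and `quantize_config.json`, and renames the weights file to match `model_file_base_name`. A small check like the one below can confirm that the two stay in sync. It is only a sketch: the helper names are hypothetical, the values are copied from the new side of the diffs above, and the idea that the weights file is named base name plus `.safetensors` is inferred from the rename in this commit.

```python
import json

# "quantization_config" block from the new config.json.
CONFIG_QUANTIZATION = json.loads("""
{
  "bits": 4, "group_size": 128, "damp_percent": 0.01, "desc_act": false,
  "sym": true, "true_sequential": true,
  "model_file_base_name": "model", "quant_method": "gptq"
}
""")

# Contents of the new quantize_config.json.
QUANTIZE_CONFIG = json.loads("""
{
  "bits": 4, "group_size": 128, "damp_percent": 0.01, "desc_act": false,
  "sym": true, "true_sequential": true,
  "model_file_base_name": "model"
}
""")


def mismatched_keys(embedded: dict, standalone: dict) -> list[str]:
    """Keys of `standalone` that are missing from, or differ in, `embedded`."""
    return sorted(
        key for key, value in standalone.items()
        if key not in embedded or embedded[key] != value
    )


def weights_filename(quantize_config: dict, extension: str = ".safetensors") -> str:
    """File name implied by model_file_base_name (assumed naming convention)."""
    return quantize_config["model_file_base_name"] + extension


if __name__ == "__main__":
    print(mismatched_keys(CONFIG_QUANTIZATION, QUANTIZE_CONFIG))
    print(weights_filename(QUANTIZE_CONFIG))
```

An empty list from `mismatched_keys` means every setting in `quantize_config.json` is mirrored in `config.json`; the only extra key on the `config.json` side is `quant_method`.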