TheBloke committed on
Commit: 78e749e
1 Parent(s): f6ecb3e

Update README.md

Files changed (1):
  1. README.md +13 -22
README.md CHANGED
@@ -13,7 +13,7 @@ inference: false
 </div>
 <div style="display: flex; justify-content: space-between; width: 100%;">
 <div style="display: flex; flex-direction: column; align-items: flex-start;">
-<p><a href="https://discord.gg/Jq4vkcDakD">Chat & support: my new Discord server</a></p>
+<p><a href="https://discord.gg/theblokeai">Chat & support: my new Discord server</a></p>
 </div>
 <div style="display: flex; flex-direction: column; align-items: flex-end;">
 <p><a href="https://www.patreon.com/TheBlokeAI">Want to contribute? TheBloke's Patreon page</a></p>
@@ -27,15 +27,11 @@ This repo contains an experimantal GPTQ 4bit model for [Falcon-7B-Instruct](http
 
 It is the result of quantising to 4bit using [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ).
 
-## Need support? Want to discuss? I now have a Discord!
+## PERFORMANCE
 
-Join me at: https://discord.gg/UBgz4VXf
+Please note that performance with this GPTQ is currently very slow with AutoGPTQ.
 
-## EXPERIMENTAL
-
-Please note this is an experimental GPTQ model. Support for it is currently quite limited.
-
-It is also expected to be **SLOW**. This is currently unavoidable, but is being looked at.
+It may perform better with the latest GPTQ-for-LLaMa code, but I haven't tested that personally yet.
 
 ## Prompt template
 
@@ -47,7 +43,7 @@ Assistant:
 
 ## AutoGPTQ
 
-AutoGPTQ is required: `pip install auto-gptq`
+AutoGPTQ is required: `GITHUB_ACTIONS=true pip install auto-gptq`
 
 AutoGPTQ provides pre-compiled wheels for Windows and Linux, with CUDA toolkit 11.7 or 11.8.
 
@@ -61,14 +57,6 @@ pip install .
 
 These manual steps will require that you have the [Nvidia CUDA toolkit](https://developer.nvidia.com/cuda-12-0-1-download-archive) installed.
 
-## text-generation-webui
-
-There is provisional AutoGPTQ support in text-generation-webui.
-
-This requires text-generation-webui as of commit 204731952ae59d79ea3805a425c73dd171d943c3.
-
-So please first update text-genration-webui to the latest version.
-
 ## How to download and use this model in text-generation-webui
 
 1. Launch text-generation-webui
@@ -79,7 +67,7 @@ So please first update text-genration-webui to the latest version.
 6. Wait until it says it's finished downloading.
 7. Click the **Refresh** icon next to **Model** in the top left.
 8. In the **Model drop-down**: choose the model you just downloaded, `falcon-7B-instruct-GPTQ`.
-9. Make sure **Loader** is set to **AutoGPTQ**. This model will not work with ExLlama or GPTQ-for-LLaMa.
+9. Set **Loader** to **AutoGPTQ**. This model will not work with ExLlama. It might work with recent GPTQ-for-LLaMa, but I haven't tested that.
 10. Tick **Trust Remote Code**, followed by **Save Settings**
 11. Click **Reload**.
 12. Once it says it's loaded, click the **Text Generation tab** and enter a prompt!
@@ -96,7 +84,7 @@ In this repo you can see two `.py` files - these are the files that get executed
 
 To run this code you need to install AutoGPTQ and einops:
 ```
-pip install auto-gptq
+GITHUB_ACTIONS=true pip install auto-gptq
 pip install einops
 ```
 
@@ -127,7 +115,7 @@ model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
 prompt = "Tell me about AI"
 prompt_template=f'''A helpful assistant who helps the user with any questions asked.
 User: {prompt}
-Assistant:''''
+Assistant:'''
 
 print("\n\n*** Generate:")
 
@@ -175,7 +163,7 @@ It was created with groupsize 64 to give higher inference quality, and without `
 
 For further support, and discussions on these models and AI in general, join us at:
 
-[TheBloke AI's Discord server](https://discord.gg/Jq4vkcDakD)
+[TheBloke AI's Discord server](https://discord.gg/theblokeai)
 
 ## Thanks, and how to contribute.
 
@@ -190,9 +178,12 @@ Donaters will get priority support on any and all AI/LLM/model questions and req
 * Patreon: https://patreon.com/TheBlokeAI
 * Ko-Fi: https://ko-fi.com/TheBlokeAI
 
-**Patreon special mentions**: Aemon Algiz, Dmitriy Samsonov, Nathan LeClaire, Trenton Dambrowitz, Mano Prime, David Flickinger, vamX, Nikolai Manek, senxiiz, Khalefa Al-Ahmad, Illia Dulskyi, Jonathan Leane, Talal Aujan, V. Lukas, Joseph William Delisle, Pyrater, Oscar Rangel, Lone Striker, Luke Pendergrass, Eugene Pentland, Sebastain Graf, Johann-Peter Hartman.
+**Special thanks to**: Luke from CarbonQuill, Aemon Algiz.
+
+**Patreon special mentions**: RoA, Lone Striker, Gabriel Puliatti, Derek Yates, Randy H, Jonathan Leane, Eugene Pentland, Karl Bernard, Viktor Bowallius, senxiiz, Daniel P. Andersen, Pierre Kircher, Deep Realms, Cory Kujawski, Oscar Rangel, Fen Risland, Ajan Kanaga, LangChain4j, webtim, Nikolai Manek, Trenton Dambrowitz, Raven Klaugh, Kalila, Khalefa Al-Ahmad, Chris McCloskey, Luke @flexchar, Ai Maven, Dave, Asp the Wyvern, Sean Connelly, Imad Khwaja, Space Cruiser, Rainer Wilmers, subjectnull, Alps Aficionado, Willian Hasse, Fred von Graf, Artur Olbinski, Johann-Peter Hartmann, WelcomeToTheClub, Willem Michiel, Michael Levine, Iucharbius, Spiking Neurons AB, K, biorpg, John Villwock, Pyrater, Greatston Gnanesh, Mano Prime, Junyu Yang, Stephen Murray, John Detwiler, Luke Pendergrass, terasurfer, Pieter, zynix, Edmond Seymore, theTransient, Nathan LeClaire, vamX, Kevin Schuppel, Preetika Verma, ya boyyy, Alex, SuperWojo, Ghost, Joseph William Delisle, Matthew Berman, Talal Aujan, chris gileta, Illia Dulskyi.
 
 Thank you to all my generous patrons and donaters!
+
 <!-- footer end -->
 
 # ✨ Original model card: Falcon-7B-Instruct
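
One of the changes in this commit fixes a Python bug in the README's example code: the old prompt template ended its f-string with four quotes (`Assistant:''''`), which leaves a stray unterminated quote after the triple-quoted string closes. A minimal, model-free sketch of the corrected template (only `prompt` and the template text come from the README; the `compile` check is illustrative):

```python
# Corrected prompt template from the README (three closing quotes).
prompt = "Tell me about AI"
prompt_template = f'''A helpful assistant who helps the user with any questions asked.
User: {prompt}
Assistant:'''

# The pre-commit version ended with four quotes, leaving a stray quote after
# the f-string closes -- Python rejects that at compile time.
try:
    compile("t = f'''hi''''", "<old-readme>", "exec")
except SyntaxError:
    print("four closing quotes raise SyntaxError")

print(prompt_template)
```

This is only a syntax sanity check; actually generating text still requires loading the quantised model with AutoGPTQ as described above.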