Updating model files
README.md CHANGED
@@ -2,6 +2,17 @@
 license: other
 inference: false
 ---
+<div style="width: 100%;">
+    <img src="https://i.imgur.com/EBdldam.jpg" alt="TheBlokeAI" style="width: 100%; min-width: 400px; display: block; margin: auto;">
+</div>
+<div style="display: flex; justify-content: space-between; width: 100%;">
+    <div style="display: flex; flex-direction: column; align-items: flex-start;">
+        <p><a href="https://discord.gg/UBgz4VXf">Chat & support: my new Discord server</a></p>
+    </div>
+    <div style="display: flex; flex-direction: column; align-items: flex-end;">
+        <p><a href="https://www.patreon.com/TheBlokeAI">Want to contribute? Patreon coming soon!</a></p>
+    </div>
+</div>
 
 # OpenAssistant LLaMA 30B SFT 7 GPTQ
 
@@ -37,7 +48,7 @@ Three sets of models are provided:
   * Uses --act-order for the best possible inference quality given its lack of group_size.
 * Groupsize = 1024
   * Theoretically higher inference accuracy
-  * May OOM on long context lengths in 24GB VRAM
+  * May OOM on long context lengths in 24GB VRAM
 * Groupsize = 128
   * Optimal setting for highest inference quality
   * Will definitely need more than 24GB VRAM on longer context lengths (1000-1500+ tokens returned)
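For orientation, these per-file differences come from quantisation-time options in GPTQ-for-LLaMa. A minimal sketch of how such variants are typically produced; the `llama.py` entry point and flag names assume a GPTQ-for-LLaMa checkout contemporary with this card, and all paths and output names are placeholders:

```bash
# Hedged sketch: flags assume a contemporary GPTQ-for-LLaMa checkout;
# model paths and output filenames are placeholders.

# compat file: groupsize 1024, no act-order (loads in unmodified webui builds)
python llama.py /path/to/llama-30b-fp16 c4 --wbits 4 --groupsize 1024 \
  --save_safetensors compat.no-act-order.safetensors

# latest file: groupsize 128 plus --act-order (higher quality, newer code needed)
python llama.py /path/to/llama-30b-fp16 c4 --wbits 4 --groupsize 128 --act-order \
  --save_safetensors latest.act-order.safetensors
```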
@@ -48,7 +59,7 @@ For the 128g and 1024g models, two versions are available:
 * `latest.act-order.safetensors`
   * uses `--act-order` for higher inference quality
   * requires more recent GPTQ-for-LLaMa code, therefore will not currently work with one-click-installers
-
+
 ## HOW TO CHOOSE YOUR MODEL
 
 I have used branches to separate the models. This means you can clone the branch you want and not get model files you don't need.
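Since each variant sits on its own branch, a single-branch clone pulls down only that variant's files. A sketch; the repo id `TheBloke/OpenAssistant-SFT-7-Llama-30B-GPTQ` is an assumption inferred from the card title, not stated above:

```bash
# Repo id is an assumption inferred from the card title.
# --single-branch avoids fetching the other variants' history.
git clone --single-branch -b 128-latest \
  https://huggingface.co/TheBloke/OpenAssistant-SFT-7-Llama-30B-GPTQ
```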
@@ -62,7 +73,7 @@ If you have 24GB VRAM you are strongly recommended to use the file in `main`, wi
 * Branch: **128-latest** = groupsize 128, `latest.act-order.safetensors` file
 
 ![branches](https://i.imgur.com/PdiHnLxm.png)
-
+
 ## How to easily download and run the 1024g compat model in text-generation-webui
 
 Open the text-generation-webui UI as normal.
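The numbered steps in this section are performed inside the web UI. As a command-line alternative, text-generation-webui includes a `download-model.py` helper; this sketch assumes that script's `--branch` option and the repo id inferred above:

```bash
# Hedged alternative to the UI steps: assumes download-model.py's --branch
# option and a repo id inferred from the card title.
python download-model.py TheBloke/OpenAssistant-SFT-7-Llama-30B-GPTQ --branch main
```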
@@ -78,7 +89,7 @@ Open the text-generation-webui UI as normal.
 9. Click **Save settings for this model** in the top right.
 10. Click **Reload the Model** in the top right.
 11. Once it says it's loaded, click the **Text Generation tab** and enter a prompt!
-
+
 ## Manual instructions for `text-generation-webui`
 
 The `compat.no-act-order.safetensors` files can be loaded the same as any other GPTQ file, without requiring any updates to [oobabooga's text-generation-webui](https://github.com/oobabooga/text-generation-webui).
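Concretely, loading a compat file looks like any other 4-bit GPTQ launch. A minimal sketch, assuming the `--wbits`/`--groupsize`/`--model_type` flags of a text-generation-webui build contemporary with this card, with a placeholder model directory name:

```bash
# Flags assume a text-generation-webui build contemporary with this card;
# the model directory name under models/ is a placeholder.
python server.py --model OpenAssistant-SFT-7-Llama-30B-GPTQ \
  --wbits 4 --groupsize 1024 --model_type llama
```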
@@ -122,6 +133,17 @@ The above commands assume you have installed all dependencies for GPTQ-for-LLaMa
 
 If you can't update GPTQ-for-LLaMa or don't want to, please use a `compat.no-act-order.safetensors` file.
 
+## Want to support my work?
+
+I've had a lot of people ask if they can contribute. I love providing models and helping people, but it is starting to rack up pretty big cloud computing bills.
+
+So if you're able and willing to contribute, it'd be most gratefully received and will help me to keep providing models, and work on various AI projects.
+
+Donators will get priority support on any and all AI/LLM/model questions, and I'll gladly quantise any model you'd like to try.
+
+* Patreon: coming soon! (just awaiting approval)
+* Ko-Fi: https://ko-fi.com/TheBlokeAI
+* Discord: https://discord.gg/UBgz4VXf
 # Original model card
 
 ```
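What "updating GPTQ-for-LLaMa" involves depends on the install; for a source install of text-generation-webui the checkout usually lives under `repositories/`. A hedged sketch; the directory layout and the kernel rebuild step are assumptions about a contemporary setup, not stated in this card:

```bash
# Assumed layout for a source install of text-generation-webui.
cd text-generation-webui/repositories/GPTQ-for-LLaMa
git pull                      # pick up the newer code the act-order files need
python setup_cuda.py install  # rebuild the CUDA kernel after updating
```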
@@ -169,4 +191,4 @@ llama-30b-sft-7:
   max_val_set: 250
 ```
 
-- **OASST dataset paper:** https://arxiv.org/abs/2304.07327
+- **OASST dataset paper:** https://arxiv.org/abs/2304.07327