Updating model files
README.md CHANGED
@@ -2,6 +2,17 @@
 license: other
 inference: false
 ---
+<div style="width: 100%;">
+    <img src="https://i.imgur.com/EBdldam.jpg" alt="TheBlokeAI" style="width: 100%; min-width: 400px; display: block; margin: auto;">
+</div>
+<div style="display: flex; justify-content: space-between; width: 100%;">
+    <div style="display: flex; flex-direction: column; align-items: flex-start;">
+        <p><a href="https://discord.gg/UBgz4VXf">Chat & support: my new Discord server</a></p>
+    </div>
+    <div style="display: flex; flex-direction: column; align-items: flex-end;">
+        <p><a href="https://www.patreon.com/TheBlokeAI">Want to contribute? Patreon coming soon!</a></p>
+    </div>
+</div>
 
 # OpenAssistant LLaMA 30B SFT 7 GPTQ
 
@@ -37,7 +48,7 @@ Three sets of models are provided:
   * Uses --act-order for the best possible inference quality given its lack of group_size.
 * Groupsize = 1024
   * Theoretically higher inference accuracy
-  * May OOM on long context lengths in 24GB VRAM
+  * May OOM on long context lengths in 24GB VRAM
 * Groupsize = 128
   * Optimal setting for highest inference quality
   * Will definitely need more than 24GB VRAM on longer context lengths (1000-1500+ tokens returned)
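For orientation, these per-file differences come from quantisation-time options in GPTQ-for-LLaMa. A minimal sketch of how such variants are typically produced; the `llama.py` entry point and flag names assume a GPTQ-for-LLaMa checkout contemporary with this card, and all paths and output names are placeholders:

```bash
# Hedged sketch: flags assume a contemporary GPTQ-for-LLaMa checkout;
# model paths and output filenames are placeholders.

# compat file: groupsize 1024, no act-order (loads in unmodified webui builds)
python llama.py /path/to/llama-30b-fp16 c4 --wbits 4 --groupsize 1024 \
  --save_safetensors compat.no-act-order.safetensors

# latest file: groupsize 128 plus --act-order (higher quality, newer code needed)
python llama.py /path/to/llama-30b-fp16 c4 --wbits 4 --groupsize 128 --act-order \
  --save_safetensors latest.act-order.safetensors
```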
@@ -48,7 +59,7 @@ For the 128g and 1024g models, two versions are available:
 * `latest.act-order.safetensors`
   * uses `--act-order` for higher inference quality
   * requires more recent GPTQ-for-LLaMa code, therefore will not currently work with one-click-installers
-
+
 ## HOW TO CHOOSE YOUR MODEL
 
 I have used branches to separate the models. This means you can clone the branch you want and not get model files you don't need.
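Since each variant sits on its own branch, a single-branch clone pulls down only that variant's files. A sketch; the repo id `TheBloke/OpenAssistant-SFT-7-Llama-30B-GPTQ` is an assumption inferred from the card title, not stated above:

```bash
# Repo id is an assumption inferred from the card title.
# --single-branch avoids fetching the other variants' history.
git clone --single-branch -b 128-latest \
  https://huggingface.co/TheBloke/OpenAssistant-SFT-7-Llama-30B-GPTQ
```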
@@ -62,7 +73,7 @@ If you have 24GB VRAM you are strongly recommended to use the file in `main`, wi
 * Branch: **128-latest** = groupsize 128, `latest.act-order.safetensors` file
 
 ![branches](https://i.imgur.com/PdiHnLxm.png)
-
+
 ## How to easily download and run the 1024g compat model in text-generation-webui
 
 Open the text-generation-webui UI as normal.
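The numbered steps in this section are performed inside the web UI. As a command-line alternative, text-generation-webui includes a `download-model.py` helper; this sketch assumes that script's `--branch` option and the repo id inferred above:

```bash
# Hedged alternative to the UI steps: assumes download-model.py's --branch
# option and a repo id inferred from the card title.
python download-model.py TheBloke/OpenAssistant-SFT-7-Llama-30B-GPTQ --branch main
```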
@@ -78,7 +89,7 @@ Open the text-generation-webui UI as normal.
 9. Click **Save settings for this model** in the top right.
 10. Click **Reload the Model** in the top right.
 11. Once it says it's loaded, click the **Text Generation tab** and enter a prompt!
-
+
 ## Manual instructions for `text-generation-webui`
 
 The `compat.no-act-order.safetensors` files can be loaded the same as any other GPTQ file, without requiring any updates to [oobabooga's text-generation-webui](https://github.com/oobabooga/text-generation-webui).
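Concretely, loading a compat file looks like any other 4-bit GPTQ launch. A minimal sketch, assuming the `--wbits`/`--groupsize`/`--model_type` flags of a text-generation-webui build contemporary with this card, with a placeholder model directory name:

```bash
# Flags assume a text-generation-webui build contemporary with this card;
# the model directory name under models/ is a placeholder.
python server.py --model OpenAssistant-SFT-7-Llama-30B-GPTQ \
  --wbits 4 --groupsize 1024 --model_type llama
```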
@@ -122,6 +133,17 @@ The above commands assume you have installed all dependencies for GPTQ-for-LLaMa
 
 If you can't update GPTQ-for-LLaMa or don't want to, please use a `compat.no-act-order.safetensors` file.
 
+## Want to support my work?
+
+I've had a lot of people ask if they can contribute. I love providing models and helping people, but it is starting to rack up pretty big cloud computing bills.
+
+So if you're able and willing to contribute, it'd be most gratefully received and will help me to keep providing models, and work on various AI projects.
+
+Donators will get priority support on any and all AI/LLM/model questions, and I'll gladly quantise any model you'd like to try.
+
+* Patreon: coming soon! (just awaiting approval)
+* Ko-Fi: https://ko-fi.com/TheBlokeAI
+* Discord: https://discord.gg/UBgz4VXf
 # Original model card
 
 ```
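What "updating GPTQ-for-LLaMa" involves depends on the install; for a source install of text-generation-webui the checkout usually lives under `repositories/`. A hedged sketch; the directory layout and the kernel rebuild step are assumptions about a contemporary setup, not stated in this card:

```bash
# Assumed layout for a source install of text-generation-webui.
cd text-generation-webui/repositories/GPTQ-for-LLaMa
git pull                      # pick up the newer code the act-order files need
python setup_cuda.py install  # rebuild the CUDA kernel after updating
```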
@@ -169,4 +191,4 @@ llama-30b-sft-7:
   max_val_set: 250
 ```
 
-- **OASST dataset paper:** https://arxiv.org/abs/2304.07327
+- **OASST dataset paper:** https://arxiv.org/abs/2304.07327