Initial GPTQ model commit

README.md CHANGED

@@ -26,9 +26,13 @@ pipeline_tag: text-generation
</div>
<!-- header end -->

-#
+# FreeWilly 2 - GPTQ
+- Model creator: Stability AI
+- Original model: [FreeWilly 2](https://huggingface.co/stabilityai/FreeWilly2)

-
+## Description
+
+This repo contains GPTQ model files for [Stability AI's FreeWilly 2](https://huggingface.co/stabilityai/FreeWilly2).

Multiple GPTQ parameter permutations are provided; see Provided Files below for details of the options provided, their parameters, and the software used to create them.

@@ -36,7 +40,7 @@ Multiple GPTQ parameter permutations are provided; see Provided Files below for

## Repositories available

* [GPTQ models for GPU inference, with multiple quantisation parameter options.](https://huggingface.co/TheBloke/FreeWilly2-GPTQ)
-* [
+* [Stability AI's original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/stabilityai/FreeWilly2)

## Prompt template: Orca-Hashes

@@ -64,8 +68,8 @@ Each separate quant is in a different branch. See below for instructions on fetching from branches.
| gptq-4bit-128g-actorder_True | 4 | 128 | True | Processing, coming soon | True | AutoGPTQ | 4-bit, with Act Order and group size. 128g uses even less VRAM, but with slightly lower accuracy. Poor AutoGPTQ CUDA speed. |
| gptq-3bit--1g-actorder_True | 3 | None | True | 26.78 GB | False | AutoGPTQ | 3-bit, with Act Order and no group size. Lowest possible VRAM requirements. May be lower quality than 3-bit 128g. |
| gptq-3bit-128g-actorder_False | 3 | 128 | False | 28.03 GB | False | AutoGPTQ | 3-bit, with group size 128g but no act-order. Slightly higher VRAM requirements than 3-bit None. |
-| gptq-3bit-128g-actorder_True | 3 | 128 | True |
-| gptq-3bit-64g-actorder_True | 3 | 64 | True |
+| gptq-3bit-128g-actorder_True | 3 | 128 | True | 28.03 GB | False | AutoGPTQ | 3-bit, with group size 128g and act-order. Higher quality than 128g-False but poor AutoGPTQ CUDA speed. |
+| gptq-3bit-64g-actorder_True | 3 | 64 | True | 29.30 GB | False | AutoGPTQ | 3-bit, with group size 64g and act-order. Highest quality 3-bit option. Poor AutoGPTQ CUDA speed. |

## How to download from branches

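The diff adds a "Prompt template: Orca-Hashes" section, but the template body falls outside the hunks shown above. As a minimal sketch, FreeWilly 2 is generally prompted with `### System:` / `### User:` / `### Assistant:` headers; the exact strings below are an assumption rather than something confirmed by this commit.

```python
# Minimal sketch of an Orca-Hashes style prompt for FreeWilly 2.
# The header strings are assumed; check the full README section for the exact template.
def build_prompt(system_message: str, user_message: str) -> str:
    return (
        f"### System:\n{system_message}\n\n"
        f"### User:\n{user_message}\n\n"
        "### Assistant:\n"
    )

print(build_prompt("You are a helpful assistant.", "Summarise GPTQ in one sentence."))
```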
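The Provided Files table keeps each quantisation in its own branch of TheBloke/FreeWilly2-GPTQ. A rough sketch of fetching a single branch with huggingface_hub follows; the branch name is one row from the table above, and the call mirrors standard snapshot_download usage rather than instructions quoted from this commit.

```python
# Sketch: download one quantisation branch of the GPTQ repo.
# The branch chosen here (3-bit, 128g, act-order) is just one option from the table above.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="TheBloke/FreeWilly2-GPTQ",
    revision="gptq-3bit-128g-actorder_True",  # branch name = quantisation variant
)
print("Downloaded to:", local_dir)
```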
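Every row in the table names AutoGPTQ as the quantisation software. As an illustration only, downloaded weights can typically be loaded with AutoGPTQ's from_quantized. The sketch below assumes the main 4-bit branch was downloaded; the model_basename value and generation settings are assumptions to check against the files actually present in the branch.

```python
# Sketch: load downloaded GPTQ weights with AutoGPTQ and run one generation.
# model_dir points at a branch fetched beforehand (e.g. via snapshot_download);
# model_basename is an assumed placeholder -- match it to the .safetensors file
# name actually present in that branch.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_dir = "./FreeWilly2-GPTQ"            # example local path to the downloaded branch
model_basename = "gptq_model-4bit--1g"     # assumption: adjust to the real file name

tokenizer = AutoTokenizer.from_pretrained(model_dir, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    model_dir,
    model_basename=model_basename,
    use_safetensors=True,
    device="cuda:0",
)

prompt = "### System:\nYou are a helpful assistant.\n\n### User:\nHello!\n\n### Assistant:\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```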