TheBloke committed 53b7045 (1 parent: 3174c37)

Initial GPTQ model commit

Files changed (1): README.md +9 -5
README.md CHANGED
@@ -26,9 +26,13 @@ pipeline_tag: text-generation
  </div>
  <!-- header end -->

- # Stability AI's FreeWilly 2 GPTQ
+ # FreeWilly 2 - GPTQ
+ - Model creator: Stability AI
+ - Original model: [FreeWilly 2](https://huggingface.co/stabilityai/FreeWilly2)

- These files are GPTQ model files for [Stability AI's FreeWilly 2](https://huggingface.co/stabilityai/FreeWilly2).
+ ## Description
+
+ This repo contains GPTQ model files for [Stability AI's FreeWilly 2](https://huggingface.co/stabilityai/FreeWilly2).

  Multiple GPTQ parameter permutations are provided; see Provided Files below for details of the options provided, their parameters, and the software used to create them.

@@ -36,7 +40,7 @@ Multiple GPTQ parameter permutations are provided; see Provided Files below for
  ## Repositories available

  * [GPTQ models for GPU inference, with multiple quantisation parameter options.](https://huggingface.co/TheBloke/FreeWilly2-GPTQ)
- * [Original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/stabilityai/FreeWilly2)
+ * [Stability AI's original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/stabilityai/FreeWilly2)

  ## Prompt template: Orca-Hashes

@@ -64,8 +68,8 @@ Each separate quant is in a different branch. See below for instructions on fet
  | gptq-4bit-128g-actorder_True | 4 | 128 | True | Processing, coming soon | True | AutoGPTQ | 4-bit, with Act Order and group size. 128g uses even less VRAM, but with slightly lower accuracy. Poor AutoGPTQ CUDA speed. |
  | gptq-3bit--1g-actorder_True | 3 | None | True | 26.78 GB | False | AutoGPTQ | 3-bit, with Act Order and no group size. Lowest possible VRAM requirements. May be lower quality than 3-bit 128g. |
  | gptq-3bit-128g-actorder_False | 3 | 128 | False | 28.03 GB | False | AutoGPTQ | 3-bit, with group size 128g but no act-order. Slightly higher VRAM requirements than 3-bit None. |
- | gptq-3bit-128g-actorder_True | 3 | 128 | True | Processing, coming soon | False | AutoGPTQ | 3-bit, with group size 128g and act-order. Higher quality than 128g-False but poor AutoGPTQ CUDA speed. |
- | gptq-3bit-64g-actorder_True | 3 | 64 | True | Processing, coming soon | False | AutoGPTQ | 3-bit, with group size 64g and act-order. Highest quality 3-bit option. Poor AutoGPTQ CUDA speed. |
+ | gptq-3bit-128g-actorder_True | 3 | 128 | True | 28.03 GB | False | AutoGPTQ | 3-bit, with group size 128g and act-order. Higher quality than 128g-False but poor AutoGPTQ CUDA speed. |
+ | gptq-3bit-64g-actorder_True | 3 | 64 | True | 29.30 GB | False | AutoGPTQ | 3-bit, with group size 64g and act-order. Highest quality 3-bit option. Poor AutoGPTQ CUDA speed. |

  ## How to download from branches
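As a minimal sketch of how one of the quant branches listed in the table above might be fetched and loaded, assuming the `huggingface_hub` and `auto-gptq` Python packages are installed; the branch name, device, and prompt text here are illustrative examples rather than instructions taken from the README:

```python
# Minimal sketch: download a single quantisation branch and load it with AutoGPTQ.
from huggingface_hub import snapshot_download
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# Download only the chosen branch of the repo (any branch from the table works here).
local_dir = snapshot_download(
    repo_id="TheBloke/FreeWilly2-GPTQ",
    revision="gptq-3bit-128g-actorder_True",
)

tokenizer = AutoTokenizer.from_pretrained(local_dir, use_fast=True)

# Recent AutoGPTQ releases locate the quantised safetensors file automatically;
# older releases may also need an explicit model_basename= argument.
model = AutoGPTQForCausalLM.from_quantized(
    local_dir,
    device="cuda:0",
    use_safetensors=True,
)

# Illustrative Orca-Hashes style prompt; see the "Prompt template" section of the
# README for the exact format expected by the model.
prompt = "### System:\nYou are a helpful assistant.\n\n### User:\nTell me about AI.\n\n### Assistant:\n"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda:0")
output = model.generate(input_ids, do_sample=True, temperature=0.7, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```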