TheBloke/guanaco-13B-GPTQ

TheBloke committed on May 25, 2023

Commit b2d0b56 • 1 Parent(s): 7d845a2

Update README.md

Files changed (1)

README.md +1 -1

@@ -39,7 +39,7 @@ In the `main` branch you will find `Guanaco-13B-GPTQ-4bit-128g.no-act-order.safe
 This will work with all versions of GPTQ-for-LLaMa. It has maximum compatibility.
-It was created without groupsize to minimise VRAM requirements, to keep it under 24GB VRAM. It was created with the `--act-order` parameter to maximise accuracy.
+It was created with groupsize 128 to ensure higher quality inference, and without `--act-order` to maximise compatibility.
 * `Guanaco-13B-GPTQ-4bit-128g.no-act-order.safetensors`
   * Works with all versions of GPTQ-for-LLaMa code, both Triton and CUDA branches