Update README.md
README.md CHANGED

@@ -200,7 +200,7 @@ It was created with group_size none (-1) to reduce VRAM usage, and with --act-or
 * `gptq_model-4bit-128g.safetensors`
 * Works with AutoGPTQ in CUDA or Triton modes.
 * Does NOT work with [ExLlama](https://github.com/turboderp/exllama) as it's not a Llama model.
-*
+* Untested with GPTQ-for-LLaMa.
 * Works with text-generation-webui, including one-click-installers.
 * Parameters: Groupsize = -1. Act Order / desc_act = True.
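For context, the bullets above (AutoGPTQ in CUDA or Triton mode, groupsize -1, desc_act true) could be exercised with a loading sketch like the following. This is a minimal, untested sketch assuming the standard `auto_gptq` Python API; the repository id is a hypothetical placeholder, and a CUDA-capable GPU is assumed.

```python
# Hedged sketch: loading a 4-bit GPTQ checkpoint with AutoGPTQ.
# The repo id below is a placeholder, not taken from this README.
from auto_gptq import AutoGPTQForCausalLM

model = AutoGPTQForCausalLM.from_quantized(
    "some-user/some-model-GPTQ",            # hypothetical repo id
    model_basename="gptq_model-4bit-128g",  # matches the .safetensors name above
    use_safetensors=True,
    use_triton=False,   # False = CUDA kernel; True = Triton mode
    device="cuda:0",
)
```

The groupsize -1 and desc_act=True settings mentioned in the diff are properties baked into the checkpoint's `quantize_config.json`, so they do not need to be repeated at load time.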