TheBloke committed on
Commit 006052d
1 Parent(s): 56d5f5f

Initial GPTQ model commit

Files changed (1): README.md (+11 -4)
README.md CHANGED
@@ -49,12 +49,17 @@ Each separate quant is in a different branch. See below for instructions on fetching from branches.
  | Branch | Filename | Bits | Group Size | Act Order (desc_act) | File Size | ExLlama Compatible? | Made With | Use |
  | ------ | -------- | ---- | ---------- | -------------------- | --------- | ------------------- | --------- | --- |
  | main | open-llama-7b-v2-open-instruct-GPTQ-4bit-128g.no-act.order.safetensors | 4 | 128 | False | 4.00 GB | True | GPTQ-for-LLaMa | Most compatible option. Good inference speed in AutoGPTQ and GPTQ-for-LLaMa. |
- | gptq-4bit-32g-actorder_True | gptq_model-4bit-32g.safetensors | 4 | 32 | True | 4.28 GB | True | AutoGPTQ | Group size 32g gives highest possible inference quality, with maximum VRAM usage. |
- | gptq-4bit-64g-actorder_True | gptq_model-4bit-64g.safetensors | 4 | 64 | True | 4.02 GB | True | AutoGPTQ | Group size 64g uses less VRAM, but with slightly lower accuracy. |
- | gptq-4bit-128g-actorder_True | gptq_model-4bit-128g.safetensors | 4 | 128 | True | 3.90 GB | True | AutoGPTQ | Group size 128g uses even less VRAM, but with slightly lower accuracy. |
- | gptq-8bit--1g-actorder_True | gptq_model-8bit--1g.safetensors | 8 | None | True | 7.01 GB | False | AutoGPTQ | Group size none has the least VRAM usage, but the lowest accuracy. |
+ | gptq-4bit-32g-actorder_True | gptq_model-4bit-32g.safetensors | 4 | 32 | True | 4.28 GB | True | AutoGPTQ | 4-bit, with Act Order. Group size 32g gives highest possible inference quality, with maximum VRAM usage. |
+ | gptq-4bit-64g-actorder_True | gptq_model-4bit-64g.safetensors | 4 | 64 | True | 4.02 GB | True | AutoGPTQ | 4-bit, with Act Order. Group size 64g uses less VRAM, but with slightly lower accuracy. |
+ | gptq-4bit-128g-actorder_True | gptq_model-4bit-128g.safetensors | 4 | 128 | True | 3.90 GB | True | AutoGPTQ | 4-bit, with Act Order. Group size 128g uses even less VRAM, but with slightly lower accuracy. |
+ | gptq-8bit--1g-actorder_True | gptq_model-8bit--1g.safetensors | 8 | None | True | 7.01 GB | False | AutoGPTQ | 8-bit, with Act Order. Group size none is used to lower VRAM requirements and to improve compatibility. |


+ ## How to download from branches
+
+ - In text-generation-webui, you can add `:branch` to the end of the download name, eg `TheBloke/open-llama-7B-v2-open-instruct-GPTQ:gptq-4bit-32g-actorder_True`
+ - With Git, you can clone with: `git clone --branch gptq-4bit-32g-actorder_True https://huggingface.co/TheBloke/open-llama-7B-v2-open-instruct-GPTQ`.
+
  ## How to easily download and use this model in [text-generation-webui](https://github.com/oobabooga/text-generation-webui).

  Please make sure you're using the latest version of [text-generation-webui](https://github.com/oobabooga/text-generation-webui).
@@ -63,6 +68,8 @@ It is strongly recommended to use the text-generation-webui one-click-installers

  1. Click the **Model tab**.
  2. Under **Download custom model or LoRA**, enter `TheBloke/open-llama-7B-v2-open-instruct-GPTQ`.
+ - To download from a specific branch, enter for example `TheBloke/open-llama-7B-v2-open-instruct-GPTQ:gptq-4bit-32g-actorder_True`
+ - see Provided Files above for the list of branches for each file type.
  3. Click **Download**.
  4. The model will start downloading. Once it's finished it will say "Done"
  5. In the top left, click the refresh icon next to **Model**.
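Beyond the `:branch` suffix in text-generation-webui and the `git clone --branch` command shown above, the same branch names can be fetched programmatically. Here is a minimal sketch using the `huggingface_hub` library's `snapshot_download`, whose `revision` parameter maps directly to the Branch column in the table above; the `local_dir` value is an arbitrary example, not part of the repo:

```python
# Minimal sketch: fetch a single quant branch with huggingface_hub.
# Assumes `pip install huggingface_hub`; local_dir is an illustrative path.
from huggingface_hub import snapshot_download

model_dir = snapshot_download(
    repo_id="TheBloke/open-llama-7B-v2-open-instruct-GPTQ",
    revision="gptq-4bit-32g-actorder_True",  # any branch from the table above
    local_dir="open-llama-7b-v2-open-instruct-GPTQ-4bit-32g",
)
print(model_dir)  # directory containing gptq_model-4bit-32g.safetensors
```

The same `revision` argument is accepted by `transformers`' `from_pretrained`, so a branch can also be selected at load time rather than at download time.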