TheBloke committed on
Commit dcba301
1 Parent(s): afc6fb6

Update README.md

Files changed (1):
  1. README.md +13 -3
README.md CHANGED
@@ -1,6 +1,14 @@
 ---
 inference: false
-license: other
+license: gpl
+language:
+- en
+tags:
+- starcoder
+- wizardcoder
+- code
+- self-instruct
+- distillation
 ---
 
 <!-- header start -->
@@ -42,7 +50,9 @@ Below is an instruction that describes a task. Write a response that appropriate
 
 ## How to easily download and use this model in text-generation-webui
 
-Please make sure you're using the latest version of text-generation-webui
+Please make sure you're using the latest version of text-generation-webui.
+
+Note: this is a non-Llama model which cannot be used with ExLlama. Use Loader: AutoGPTQ.
 
 1. Click the **Model tab**.
 2. Under **Download custom model or LoRA**, enter `TheBloke/Redmond-Hermes-Coder-GPTQ`.
@@ -129,7 +139,7 @@ It was created with group_size 128 to increase inference accuracy, but without -
 
 * `gptq_model-4bit-128g.safetensors`
 * Works with AutoGPTQ in CUDA or Triton modes.
-* [ExLlama](https://github.com/turboderp/exllama) suupports Llama 4-bit GPTQs, and will provide 2x speedup over AutoGPTQ and GPTQ-for-LLaMa.
+* Does NOT work with [ExLlama](https://github.com/turboderp/exllama) as it's not a Llama model.
 * Works with GPTQ-for-LLaMa in CUDA mode. May have issues with GPTQ-for-LLaMa Triton mode.
 * Works with text-generation-webui, including one-click-installers.
 * Parameters: Groupsize = 128. Act Order / desc_act = False.
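As background on the `Groupsize = 128` parameter mentioned in the file details above: in group-wise quantization, each group of consecutive weights shares one quantization scale, so a smaller group size tracks the weights more closely at the cost of storing more scales. The sketch below is purely illustrative and is not the actual GPTQ algorithm (GPTQ additionally uses second-order information to choose the quantized values); the tiny group size of 4 is only for readability, where the model above uses 128.

```python
# Illustrative sketch of group-wise 4-bit quantization (NOT real GPTQ).
# One scale is stored per group; weights are rounded to signed 4-bit
# integers in [-7, 7] and recovered as q * scale.

def quantize_groups(weights, group_size=4, bits=4):
    """Quantize weights with one shared scale per group of group_size."""
    qmax = 2 ** (bits - 1) - 1  # 7 for signed 4-bit
    quantized, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        scale = max(abs(w) for w in group) / qmax or 1.0
        scales.append(scale)
        quantized.extend(
            max(-qmax, min(qmax, round(w / scale))) for w in group
        )
    return quantized, scales

def dequantize_groups(quantized, scales, group_size=4):
    """Recover approximate weights from quantized values and per-group scales."""
    return [q * scales[i // group_size] for i, q in enumerate(quantized)]

weights = [0.12, -0.50, 0.33, 0.07, 2.0, -1.5, 0.9, 0.1]
q, s = quantize_groups(weights)
restored = dequantize_groups(q, s)
# Each restored value is within half a quantization step of the original.
```

Because the second group contains a large weight (2.0), its scale is coarser than the first group's; grouping limits how far one outlier weight degrades the precision of the rest, which is why group_size 128 "increases inference accuracy" relative to one scale per whole row.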