TheBloke committed
Commit d7c8ef0
Parent(s): 0f75c61

Update README.md

Files changed (1): README.md (+5 -5)
README.md CHANGED
````diff
@@ -45,18 +45,18 @@ Now that we have ExLlama, that is the recommended loader to use for these models
 
 Reminder: ExLlama does not support 3-bit models, so if you wish to try those quants, you will need to use AutoGPTQ or GPTQ-for-LLaMa.
 
-
 ## AutoGPTQ and GPTQ-for-LLaMa require the latest version of Transformers
 
-If you plan to use any of these quants with AutoGPTQ or GPTQ-for-LLaMa, you will need to update Transformers to the latest GitHub code:
+If you plan to use any of these quants with AutoGPTQ or GPTQ-for-LLaMa, your Transformers needs to be using the latest GitHub code.
+
+If you're using text-generation-webui and have updated to the latest version, this is done for you automatically.
+
+If not, you can update it manually with:
 
 ```
 pip3 install git+https://github.com/huggingface/transformers
 ```
 
-If using a UI like text-generation-webui, make sure to do this in the Python environment of text-generation-webui.
-
-
 ## Repositories available
 
 * [GPTQ models for GPU inference, with multiple quantisation parameter options.](https://huggingface.co/TheBloke/Llama-2-70B-chat-GPTQ)
````
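
A quick way to confirm the install worked is to check which Transformers build the environment actually sees — run this in the same Python environment the quant will use (e.g. text-generation-webui's, if you use one). This is a suggested sanity check, not part of the commit; the commands are standard pip/Python, and the exact `.dev0` suffix shown is only typical of source installs, not guaranteed:

```
# Print the Transformers version seen by Python; an install from the
# GitHub repo typically reports a development suffix such as 4.32.0.dev0
python3 -c "import transformers; print(transformers.__version__)"

# Inspect the installed package metadata (version, install location)
pip3 show transformers
```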