TheBloke committed
Commit
55337af
1 Parent(s): 642f9c2

Update README.md

Files changed (1): README.md (+6, −0)
README.md CHANGED
@@ -28,6 +28,12 @@ GGML files are for CPU + GPU inference using [llama.cpp](https://github.com/gger
 * [llama-cpp-python](https://github.com/abetlen/llama-cpp-python)
 * [ctransformers](https://github.com/marella/ctransformers)
 
+## Update 9th July 2023: GGML k-quants now available
+
+Thanks to the work of LostRuins/concedo, it is now possible to provide 100% working GGML k-quants for models like this which have a non-standard vocab size (32,001).
+
+k-quants have been uploaded and will work with all llama.cpp clients without any changes required.
+
 ## Repositories available
 
 * [4-bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/WizardLM-13B-V1.1-GPTQ)
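The k-quant files added by this commit use llama.cpp's GGML (ggjt) container. As a minimal sketch only — assuming the `0x67676a74` ("ggjt") magic constant that llama.cpp's loader checked at the time; the function name here is illustrative, not from the repo — a downloaded file's format can be sanity-checked before use:

```python
import struct

# 'ggjt' magic used by llama.cpp GGML v3 files (an assumption based on
# llama.cpp's LLAMA_FILE_MAGIC_GGJT constant; verify against your version).
GGJT_MAGIC = 0x67676A74

def is_ggjt_file(path: str) -> bool:
    """Return True if the file starts with the ggjt magic (little-endian uint32)."""
    with open(path, "rb") as f:
        header = f.read(4)
    if len(header) < 4:
        return False
    (magic,) = struct.unpack("<I", header)
    return magic == GGJT_MAGIC
```

A check like this only confirms the container format; it says nothing about the quantization type (q4_K_M, q5_K_S, etc.) inside the file.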