MaziyarPanahi/WizardLM-2-8x22B-GGUF

MaziyarPanahi committed on Apr 15

Commit ab4e39d • 1 Parent(s): 53a7052

Update README.md

Files changed (1)

README.md +6 -1

@@ -43,6 +43,12 @@ You can download only the quants you need instead of cloning the entire reposito
 huggingface-cli download MaziyarPanahi/WizardLM-2-8x22B-GGUF --local-dir . --include '*Q2_K*gguf'
 ```
+On Windows:
+```sh
+huggingface-cli download MaziyarPanahi/WizardLM-2-8x22B-GGUF --local-dir . --include *Q4_K_S*gguf
+```
 ## Load sharded model
 `llama_load_model_from_file` will detect the number of files and will load additional tensors from the rest of files.
@@ -51,7 +57,6 @@ huggingface-cli download MaziyarPanahi/WizardLM-2-8x22B-GGUF --local-dir . --inc
 llama.cpp/main -m WizardLM-2-8x22B.Q2_K-00001-of-00005.gguf -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 1024 -e
 ```
 ## Prompt template
 ```