MaziyarPanahi/WizardLM-2-8x22B-GGUF

add example how to download splits for 1 quant

#6

by MaziyarPanahi - opened Apr 15

base: refs/heads/main ← from: refs/pr/6

README.md +8 -0

@@ -35,6 +35,14 @@ quantized_by: MaziyarPanahi
 ## Description
 [MaziyarPanahi/WizardLM-2-8x22B-GGUF](https://huggingface.co/MaziyarPanahi/WizardLM-2-8x22B-GGUF) contains GGUF format model files for [microsoft/WizardLM-2-8x22B](https://huggingface.co/microsoft/WizardLM-2-8x22B).
+## How to download
+You can download only the quants you need instead of cloning the entire repository as follows:
+```
+huggingface-cli download MaziyarPanahi/WizardLM-2-8x22B-GGUF --local-dir . --include '*Q2_K*gguf'
+```
 ## Load sharded model
 `llama_load_model_from_file` detects the number of files and loads the additional tensors from the remaining files.
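
To illustrate which files the `--include '*Q2_K*gguf'` pattern from the added README lines would select, here is a minimal sketch that applies the same glob to a list of names with the shell's own `case` matching. It assumes the CLI matches patterns with shell-style globbing; the file names are hypothetical placeholders rather than the repository's actual listing, and `select_files` is a helper invented for this example.

```shell
#!/bin/sh
# Glob taken from the README example in this PR.
pattern='*Q2_K*gguf'

# select_files (hypothetical helper): print every argument that matches
# $pattern. The unquoted $pattern inside `case` is treated as a glob.
select_files() {
    for name in "$@"; do
        case "$name" in
            $pattern) printf '%s\n' "$name" ;;
        esac
    done
}

# Hypothetical file names, for illustration only.
select_files \
    'model.Q2_K-00001-of-00005.gguf' \
    'model.Q2_K-00002-of-00005.gguf' \
    'model.Q4_K_M-00001-of-00005.gguf' \
    'README.md'
```

Only the two names containing `Q2_K` and ending in `gguf` are printed, so every split of one quant is selected while other quants and non-model files are skipped. Keep the pattern in single quotes when passing it to `huggingface-cli` so that the local shell does not expand it against files in the current directory first.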