TheBloke committed
Commit 630faad
1 Parent(s): 49088b9

Upload README.md

Files changed (1)
  README.md +70 -0
README.md CHANGED
@@ -5,6 +5,18 @@ license: apache-2.0
 model_creator: Ziqing Yang
 model_name: Chinese Llama 2 7B
 model_type: llama
+ prompt_template: 'Below is an instruction that describes a task. Write a response
+ that appropriately completes the request.
+
+
+ ### Instruction:
+
+ {prompt}
+
+
+ ### Response:
+
+ '
 quantized_by: TheBloke
 ---
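
For reference, the `prompt_template` added above is expanded by substituting `{prompt}`. The sketch below is illustrative only; the instruction string is a placeholder and not part of the README.

```python
# Illustrative sketch, not part of the README: expand the prompt_template
# added in this commit. The instruction string is only a placeholder.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n\n"
    "{prompt}\n\n"
    "### Response:\n"
)

full_prompt = PROMPT_TEMPLATE.format(
    prompt="Translate the following sentence into Chinese: The weather is nice today."
)
print(full_prompt)
```
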
@@ -56,6 +68,7 @@ Here is an incomplete list of clients and libraries that are known to support GGUF
 <!-- repositories-available start -->
 ## Repositories available

+ * [AWQ model(s) for GPU inference.](https://huggingface.co/TheBloke/Chinese-Llama-2-7B-AWQ)
 * [GPTQ models for GPU inference, with multiple quantisation parameter options.](https://huggingface.co/TheBloke/Chinese-Llama-2-7B-GPTQ)
 * [2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference](https://huggingface.co/TheBloke/Chinese-Llama-2-7B-GGUF)
 * [Ziqing Yang's original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/ziqingyang/chinese-llama-2-7b)

@@ -130,6 +143,63 @@ Refer to the Provided Files table below to see what files use which methods, and how.

 <!-- README_GGUF.md-provided-files end -->

+ <!-- README_GGUF.md-how-to-download start -->
+ ## How to download GGUF files
+
+ **Note for manual downloaders:** You almost never want to clone the entire repo! Multiple different quantisation formats are provided, and most users only want to pick and download a single file.
+
+ The following clients/libraries will automatically download models for you, providing a list of available models to choose from:
+ - LM Studio
+ - LoLLMS Web UI
+ - Faraday.dev
+
+ ### In `text-generation-webui`
+
+ Under Download Model, you can enter the model repo: TheBloke/Chinese-Llama-2-7B-GGUF and below it, a specific filename to download, such as: chinese-llama-2-7b.q4_K_M.gguf.
+
+ Then click Download.
+
+ ### On the command line, including multiple files at once
+
+ I recommend using the `huggingface-hub` Python library:
+
+ ```shell
+ pip3 install 'huggingface-hub>=0.17.1'
+ ```
+
+ Then you can download any individual model file to the current directory, at high speed, with a command like this:
+
+ ```shell
+ huggingface-cli download TheBloke/Chinese-Llama-2-7B-GGUF chinese-llama-2-7b.q4_K_M.gguf --local-dir . --local-dir-use-symlinks False
+ ```
+
+ <details>
+ <summary>More advanced huggingface-cli download usage</summary>
+
+ You can also download multiple files at once with a pattern:
+
+ ```shell
+ huggingface-cli download TheBloke/Chinese-Llama-2-7B-GGUF --local-dir . --local-dir-use-symlinks False --include='*Q4_K*gguf'
+ ```
+
+ For more documentation on downloading with `huggingface-cli`, please see: [HF -> Hub Python Library -> Download files -> Download from the CLI](https://huggingface.co/docs/huggingface_hub/guides/download#download-from-the-cli).
+
+ To accelerate downloads on fast connections (1Gbit/s or higher), install `hf_transfer`:
+
+ ```shell
+ pip3 install hf_transfer
+ ```
+
+ And set environment variable `HF_HUB_ENABLE_HF_TRANSFER` to `1`:
+
+ ```shell
+ HF_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download TheBloke/Chinese-Llama-2-7B-GGUF chinese-llama-2-7b.q4_K_M.gguf --local-dir . --local-dir-use-symlinks False
+ ```
+
+ Windows CLI users: Use `set HF_HUB_ENABLE_HF_TRANSFER=1` before running the download command.
+ </details>
+ <!-- README_GGUF.md-how-to-download end -->
+
 <!-- README_GGUF.md-how-to-run start -->
 ## Example `llama.cpp` command
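
As a programmatic alternative to the `huggingface-cli` commands added above, the same file can be fetched with the `huggingface_hub` Python library that the README recommends. The sketch below is illustrative only; the file name and options simply mirror the examples in the diff.

```python
# Illustrative sketch, not part of the README: download one GGUF file with the
# huggingface_hub Python library (>= 0.17.1), mirroring the CLI example above.
from huggingface_hub import hf_hub_download

# Optional speed-up on fast connections: `pip3 install hf_transfer` and set
# HF_HUB_ENABLE_HF_TRANSFER=1 in the environment before running, as described above.

model_path = hf_hub_download(
    repo_id="TheBloke/Chinese-Llama-2-7B-GGUF",
    filename="chinese-llama-2-7b.q4_K_M.gguf",
    local_dir=".",
    local_dir_use_symlinks=False,
)
print(f"Downloaded to: {model_path}")
```

For multiple files matching a pattern, `huggingface_hub.snapshot_download` with `allow_patterns` plays the same role as the `--include` option shown above.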