TheBloke committed
Commit 630faad
1 Parent(s): 49088b9

Upload README.md

Files changed (1)
  README.md +70 -0
README.md CHANGED
@@ -5,6 +5,18 @@ license: apache-2.0
 model_creator: Ziqing Yang
 model_name: Chinese Llama 2 7B
 model_type: llama
+ prompt_template: 'Below is an instruction that describes a task. Write a response
+ that appropriately completes the request.
+
+
+ ### Instruction:
+
+ {prompt}
+
+
+ ### Response:
+
+ '
 quantized_by: TheBloke
 ---
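
For reference, the `prompt_template` added above is expanded by substituting `{prompt}`. The sketch below is illustrative only; the instruction string is a placeholder and not part of the README.

```python
# Illustrative sketch, not part of the README: expand the prompt_template
# added in this commit. The instruction string is only a placeholder.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n\n"
    "{prompt}\n\n"
    "### Response:\n"
)

full_prompt = PROMPT_TEMPLATE.format(
    prompt="Translate the following sentence into Chinese: The weather is nice today."
)
print(full_prompt)
```
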
@@ -56,6 +68,7 @@ Here is an incomplete list of clients and libraries that are known to support GGUF
 <!-- repositories-available start -->
 ## Repositories available

+ * [AWQ model(s) for GPU inference.](https://huggingface.co/TheBloke/Chinese-Llama-2-7B-AWQ)
 * [GPTQ models for GPU inference, with multiple quantisation parameter options.](https://huggingface.co/TheBloke/Chinese-Llama-2-7B-GPTQ)
 * [2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference](https://huggingface.co/TheBloke/Chinese-Llama-2-7B-GGUF)
 * [Ziqing Yang's original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/ziqingyang/chinese-llama-2-7b)

@@ -130,6 +143,63 @@ Refer to the Provided Files table below to see what files use which methods, and how.

 <!-- README_GGUF.md-provided-files end -->

+ <!-- README_GGUF.md-how-to-download start -->
+ ## How to download GGUF files
+
+ **Note for manual downloaders:** You almost never want to clone the entire repo! Multiple different quantisation formats are provided, and most users only want to pick and download a single file.
+
+ The following clients/libraries will automatically download models for you, providing a list of available models to choose from:
+ - LM Studio
+ - LoLLMS Web UI
+ - Faraday.dev
+
+ ### In `text-generation-webui`
+
+ Under Download Model, you can enter the model repo: TheBloke/Chinese-Llama-2-7B-GGUF and below it, a specific filename to download, such as: chinese-llama-2-7b.q4_K_M.gguf.
+
+ Then click Download.
+
+ ### On the command line, including multiple files at once
+
+ I recommend using the `huggingface-hub` Python library:
+
+ ```shell
+ pip3 install 'huggingface-hub>=0.17.1'
+ ```
+
+ Then you can download any individual model file to the current directory, at high speed, with a command like this:
+
+ ```shell
+ huggingface-cli download TheBloke/Chinese-Llama-2-7B-GGUF chinese-llama-2-7b.q4_K_M.gguf --local-dir . --local-dir-use-symlinks False
+ ```
+
+ <details>
+ <summary>More advanced huggingface-cli download usage</summary>
+
+ You can also download multiple files at once with a pattern:
+
+ ```shell
+ huggingface-cli download TheBloke/Chinese-Llama-2-7B-GGUF --local-dir . --local-dir-use-symlinks False --include='*Q4_K*gguf'
+ ```
+
+ For more documentation on downloading with `huggingface-cli`, please see: [HF -> Hub Python Library -> Download files -> Download from the CLI](https://huggingface.co/docs/huggingface_hub/guides/download#download-from-the-cli).
+
+ To accelerate downloads on fast connections (1Gbit/s or higher), install `hf_transfer`:
+
+ ```shell
+ pip3 install hf_transfer
+ ```
+
+ And set environment variable `HF_HUB_ENABLE_HF_TRANSFER` to `1`:
+
+ ```shell
+ HF_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download TheBloke/Chinese-Llama-2-7B-GGUF chinese-llama-2-7b.q4_K_M.gguf --local-dir . --local-dir-use-symlinks False
+ ```
+
+ Windows CLI users: Use `set HF_HUB_ENABLE_HF_TRANSFER=1` before running the download command.
+ </details>
+ <!-- README_GGUF.md-how-to-download end -->
+
 <!-- README_GGUF.md-how-to-run start -->
 ## Example `llama.cpp` command
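
As a programmatic alternative to the `huggingface-cli` commands added above, the same file can be fetched with the `huggingface_hub` Python library that the README recommends. The sketch below is illustrative only; the file name and options simply mirror the examples in the diff.

```python
# Illustrative sketch, not part of the README: download one GGUF file with the
# huggingface_hub Python library (>= 0.17.1), mirroring the CLI example above.
from huggingface_hub import hf_hub_download

# Optional speed-up on fast connections: `pip3 install hf_transfer` and set
# HF_HUB_ENABLE_HF_TRANSFER=1 in the environment before running, as described above.

model_path = hf_hub_download(
    repo_id="TheBloke/Chinese-Llama-2-7B-GGUF",
    filename="chinese-llama-2-7b.q4_K_M.gguf",
    local_dir=".",
    local_dir_use_symlinks=False,
)
print(f"Downloaded to: {model_path}")
```

For multiple files matching a pattern, `huggingface_hub.snapshot_download` with `allow_patterns` plays the same role as the `--include` option shown above.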