TheBloke committed
Commit 0c23a00
1 Parent(s): 2ceade0

Upload README.md

Files changed (1): README.md (+6, -18)
README.md CHANGED
@@ -5,16 +5,9 @@ license: llama2
  model_creator: Xwin-LM
  model_name: Xwin-LM 70B V0.1
  model_type: llama
- prompt_template: 'Below is an instruction that describes a task. Write a response
- that appropriately completes the request.
-
-
- ### Instruction:
-
- {prompt}
-
-
- ### Response:
+ prompt_template: 'A chat between a curious user and an artificial intelligence assistant.
+ The assistant gives helpful, detailed, and polite answers to the user''s questions.
+ USER: {prompt} ASSISTANT:

  '
  quantized_by: TheBloke
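Since the new `prompt_template` is carried in the README's YAML front matter, downstream tooling can recover it mechanically. Here is a minimal sketch of that pattern, assuming PyYAML and a local copy of this README; the parsing code is illustrative and not part of the repo:

```python
# Sketch: read prompt_template out of the README's YAML front matter.
import yaml  # PyYAML: pip install pyyaml

with open("README.md", encoding="utf-8") as f:
    text = f.read()

# The front matter sits between the first two '---' delimiters.
_, front_matter, _ = text.split("---", 2)
meta = yaml.safe_load(front_matter)

# The template carries a literal {prompt} placeholder, so str.format fills it.
filled = meta["prompt_template"].format(prompt="What is the capital of France?")
print(filled)
```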
@@ -75,15 +68,10 @@ Here is an incomplate list of clients and libraries that are known to support GG
  <!-- repositories-available end -->

  <!-- prompt-template start -->
- ## Prompt template: Alpaca
+ ## Prompt template: Vicuna

  ```
- Below is an instruction that describes a task. Write a response that appropriately completes the request.
-
- ### Instruction:
- {prompt}
-
- ### Response:
+ A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: {prompt} ASSISTANT:

  ```
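The template shown in the README is single-turn. Vicuna-style prompts conventionally extend to multi-turn chat by chaining further `USER:`/`ASSISTANT:` pairs; the sketch below follows that convention, which this README does not itself spell out:

```python
# Sketch of conventional Vicuna-style multi-turn chaining (an assumption;
# the README only shows the single-turn form).
SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)

def render(history, next_user_msg):
    """history: list of (user, assistant) pairs from earlier turns."""
    parts = [SYSTEM]
    for user, assistant in history:
        parts.append(f"USER: {user} ASSISTANT: {assistant}")
    parts.append(f"USER: {next_user_msg} ASSISTANT:")
    return " ".join(parts)

print(render([("Hi!", "Hello! How can I help?")], "Tell me a joke."))
```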
@@ -231,7 +219,7 @@ Windows Command Line users: You can set the environment variable by running `set
  Make sure you are using `llama.cpp` from commit [d0cee0d](https://github.com/ggerganov/llama.cpp/commit/d0cee0d36d5be95a0d9088b674dbb27354107221) or later.

  ```shell
- ./main -ngl 32 -m xwin-lm-70b-v0.1.Q4_K_M.gguf --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{prompt}\n\n### Response:"
+ ./main -ngl 32 -m xwin-lm-70b-v0.1.Q4_K_M.gguf --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: {prompt} ASSISTANT:"
  ```

  Change `-ngl 32` to the number of layers to offload to GPU. Remove it if you don't have GPU acceleration.
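For anyone driving the model from Python rather than the `./main` CLI, the flags above map onto the llama-cpp-python bindings roughly as follows; this is an illustrative sketch rather than usage documented in this section:

```python
# Sketch: equivalent generation via llama-cpp-python (pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama(
    model_path="xwin-lm-70b-v0.1.Q4_K_M.gguf",
    n_ctx=4096,       # context length, matching -c 4096
    n_gpu_layers=32,  # GPU offload, matching -ngl 32; set 0 for CPU only
)

prompt = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's "
    "questions. USER: Write a haiku about autumn. ASSISTANT:"
)

out = llm(prompt, max_tokens=256, temperature=0.7, repeat_penalty=1.1)
print(out["choices"][0]["text"])
```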
 