WizardLM committed on
Commit d5856bb
1 Parent(s): 5a58ba3

Update README.md

Files changed (1)
  1. README.md +10 -5
README.md CHANGED
@@ -44,17 +44,22 @@ GPU acceleration is now available for Llama 2 70B GGML files, with both CUDA (NV
 * [2, 3, 4, 5, 6 and 8-bit GGML models for CPU+GPU inference](https://huggingface.co/TheBloke/WizardMath-70B-V1.0-GGML)
 * [WizardLM's original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/WizardLM/WizardMath-70B-V1.0)
 
-## Prompt template: Alpaca-CoT
-
-```
-Below is an instruction that describes a task. Write a response that appropriately completes the request.
-
-
-### Instruction:
-{prompt}
-
-
-### Response: Let's think step by step.
-```
+## Prompt template:
+
+❗<b>Note for model system prompts usage:</b>
+
+**Default version:**
+
+```
+"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:"
+```
+
+
+**CoT Version:** (❗For the **simple** math questions, we do NOT recommend to use the CoT prompt.)
+
+
+
+```
+"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response: Let's think step by step."
+```
 
 <!-- compatibility_ggml start -->
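
The two templates added in this commit can be filled in with plain string formatting. A minimal sketch follows; the template strings are copied verbatim from the diff, while the `build_prompt` helper and the sample instructions are illustrative only and not part of the repository:

```python
# Template strings taken verbatim from the updated README above.
DEFAULT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
    "\n\n### Instruction:\n{instruction}\n\n### Response:"
)

# The CoT version only appends the step-by-step cue after "### Response:".
COT_TEMPLATE = DEFAULT_TEMPLATE + " Let's think step by step."


def build_prompt(instruction: str, cot: bool = False) -> str:
    """Return the full prompt for one instruction.

    The README advises against the CoT template for simple math
    questions, so cot defaults to False.
    """
    template = COT_TEMPLATE if cot else DEFAULT_TEMPLATE
    return template.format(instruction=instruction)


print(build_prompt("What is 7 * 6?"))
print(build_prompt("Prove that the square root of 2 is irrational.", cot=True))
```

The same strings can be passed straight to a text-generation pipeline; the only moving part is the `{instruction}` placeholder.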