Update README.md
README.md
CHANGED
@@ -51,10 +51,16 @@ GGML files are for CPU + GPU inference using [llama.cpp](https://github.com/ggerganov/llama.cpp)
 * [2, 3, 4, 5, 6 and 8-bit GGML models for CPU+GPU inference](https://huggingface.co/TheBloke/Upstage-Llama1-65B-Instruct-GGML)
 * [Original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/upstage/llama-65b-instruct)
 
-## Prompt template:
+## Prompt template: Orca-Hashes
 
 ```
+### System:
+{System}
+
+### User:
 {prompt}
+
+### Assistant:
 ```
 
 <!-- compatibility_ggml start -->
@@ -133,7 +139,7 @@ Once the `.bin` is extracted you can delete the `.zip` and `.z01` files.
 I use the following command line; adjust for your tastes and needs:
 
 ```
-./main -t 10 -ngl 32 -m upstage-llama-65b-instruct.ggmlv3.q4_0.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "###
+./main -t 10 -ngl 32 -m upstage-llama-65b-instruct.ggmlv3.q4_0.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "### System: You are a helpful assistant\n### User: write a story about llamas\n### Assistant:"
 ```
 Change `-t 10` to the number of physical CPU cores you have. For example if your system has 8 cores/16 threads, use `-t 8`.
 
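The Orca-Hashes layout added in this commit can be sketched as a small format helper. This is an illustrative sketch only: `build_prompt` and the default system message are hypothetical names, not part of the model card or of llama.cpp.

```python
# Sketch of the Orca-Hashes prompt layout from the updated README.
# build_prompt and its defaults are illustrative, not from the source.
def build_prompt(prompt: str, system: str = "You are a helpful assistant") -> str:
    return (
        "### System:\n"
        f"{system}\n"
        "\n"
        "### User:\n"
        f"{prompt}\n"
        "\n"
        "### Assistant:\n"
    )

print(build_prompt("write a story about llamas"))
```

The resulting string matches the template block above, with the system message and user prompt substituted into the `{System}` and `{prompt}` slots.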