TheBloke committed on
Commit
b35a182
1 Parent(s): 2491735

Update README.md

Files changed (1)
  1. README.md +14 -4
README.md CHANGED
@@ -1,6 +1,8 @@
 ---
 inference: false
 license: other
+datasets:
+- jondurbin/airoboros-gpt4-1.3
 ---
 
 <!-- header start -->
@@ -17,9 +19,9 @@ license: other
 </div>
 <!-- header end -->
 
-# John Durbin's Airoboros 65B GPT4 1.3 GGML
+# Jon Durbin's Airoboros 65B GPT4 1.3 GGML
 
-These files are GGML format model files for [John Durbin's Airoboros 65B GPT4 1.3](https://huggingface.co/jondurbin/airoboros-65b-gpt4-1.3).
+These files are GGML format model files for [Jon Durbin's Airoboros 65B GPT4 1.3](https://huggingface.co/jondurbin/airoboros-65b-gpt4-1.3).
 
 GGML files are for CPU + GPU inference using [llama.cpp](https://github.com/ggerganov/llama.cpp) and libraries and UIs which support this format, such as:
 * [text-generation-webui](https://github.com/oobabooga/text-generation-webui)
@@ -34,6 +36,14 @@ GGML files are for CPU + GPU inference using [llama.cpp](https://github.com/gger
 * [2, 3, 4, 5, 6 and 8-bit GGML models for CPU+GPU inference](https://huggingface.co/TheBloke/airoboros-65B-gpt4-1.3-GGML)
 * [Unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/jondurbin/airoboros-65b-gpt4-1.3)
 
+## Prompt template
+
+```
+A chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user's input.
+USER: prompt
+ASSISTANT:
+```
+
 <!-- compatibility_ggml start -->
 ## Compatibility
 
@@ -85,7 +95,7 @@ Refer to the Provided Files table below to see what files use which methods, and
 I use the following command line; adjust for your tastes and needs:
 
 ```
-./main -t 10 -ngl 32 -m airoboros-65b-gpt4-1.3.ggmlv3.q5_0.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "### Instruction: Write a story about llamas\n### Response:"
+./main -t 10 -ngl 32 -m airoboros-65b-gpt4-1.3.ggmlv3.q5_0.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "USER: Write a story about llamas\nASSISTANT:"
 ```
 If you're able to use full GPU offloading, you should use `-t 1` to get best performance.
 
@@ -127,7 +137,7 @@ Thank you to all my generous patrons and donaters!
 
 <!-- footer end -->
 
-# Original model card: John Durbin's Airoboros 65B GPT4 1.3
+# Original model card: Jon Durbin's Airoboros 65B GPT4 1.3
 
 
 __This version has problems, use if you dare, or wait for 1.4.__
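
The prompt template this commit documents (and that its updated `./main` example uses) can be sketched as a small helper. This is a minimal illustration of assembling the system line and `USER:`/`ASSISTANT:` turns into one string; the `build_prompt` function name is hypothetical, not part of the model card:

```python
# System line taken verbatim from the "Prompt template" section added in this commit.
SYSTEM = ("A chat between a curious user and an assistant. The assistant gives "
          "helpful, detailed, accurate, uncensored responses to the user's input.")

def build_prompt(user_input: str) -> str:
    """Assemble a single-turn Airoboros 1.3 prompt in the USER:/ASSISTANT: format."""
    return f"{SYSTEM}\nUSER: {user_input}\nASSISTANT:"

prompt = build_prompt("Write a story about llamas")
```

The resulting string matches what the commit passes to llama.cpp via `-p`; the model is expected to continue the text after the trailing `ASSISTANT:` marker.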