TheBloke committed
Commit ea4c0e3
1 Parent(s): 3527e3d

Update README.md

Files changed (1)
  1. README.md +52 -3
README.md CHANGED
@@ -1,6 +1,8 @@
 ---
 inference: false
 license: other
+datasets:
+- jondurbin/airoboros-gpt4-1.3
 ---
 
 <!-- header start -->
@@ -19,16 +21,26 @@ license: other
 
 # John Durbin's Airoboros 7B GPT4 1.3 GPTQ
 
-These files are GPTQ 4bit model files for [John Durbin's Airoboros 7B GPT4 1.3](https://huggingface.co/jondurbin/airoboros-7b-gpt4-1.3).
+These files are GPTQ 4bit model files for [Jon Durbin's Airoboros 7B GPT4 1.3](https://huggingface.co/jondurbin/airoboros-7b-gpt4-1.3).
 
 It is the result of quantising to 4bit using [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa).
 
+**Note from model creator Jon Durbin: This version has problems, use if you dare, or wait for 1.4.**
+
 ## Repositories available
 
 * [4-bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/airoboros-7B-gpt4-1.3-GPTQ)
 * [2, 3, 4, 5, 6 and 8-bit GGML models for CPU+GPU inference](https://huggingface.co/TheBloke/airoboros-7B-gpt4-1.3-GGML)
 * [Unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/jondurbin/airoboros-7b-gpt4-1.3)
 
+## Prompt template
+
+```
+A chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user's input.
+USER: prompt
+ASSISTANT:
+```
+
 ## How to easily download and use this model in text-generation-webui
 
 Please make sure you're using the latest version of text-generation-webui
@@ -145,6 +157,43 @@ Thank you to all my generous patrons and donaters!
 
 <!-- footer end -->
 
-# Original model card: John Durbin's Airoboros 7B GPT4 1.3
+# Original model card: Jon Durbin's Airoboros 7B GPT4 1.3
 
-No original model card was provided.
+
+__This version has problems, use if you dare, or wait for 1.4.__
+
+### Overview
+
+This is a qlora fine-tuned 7b parameter LLaMA model, using completely synthetic training data created by gpt4 via https://github.com/jondurbin/airoboros
+
+This is mostly an extension of [1.2](https://huggingface.co/jondurbin/airoboros-7b-gpt4-1.2) with a few enhancements:
+
+- All coding instructions have an equivalent " PLAINFORMAT" version now.
+- Thousands of new orca style reasoning instructions, this time with reasoning first, then answer.
+- A few more random items of various types, including a first attempt at multi-character interactions with asterisked actions and quoted speech.
+
+This model was fine-tuned with a fork of [qlora](https://github.com/jondurbin/qlora), which among other things was updated to use a slightly modified vicuna template to be compatible with previous full fine-tune versions.
+
+```
+A chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user's input. USER: [prompt] ASSISTANT:
+```
+
+So in other words, it's the preamble/system prompt, followed by a single space, then "USER: " (single space after the colon), then the prompt (which can have multiple lines, spaces, whatever), then a single space, followed by "ASSISTANT: " (with a single space after the colon).
+
+### Usage
+
+To run the full precision/pytorch native version, you can use my fork of FastChat, which is mostly the same but allows for multi-line prompts, as well as a `--no-history` option to prevent input tokenization errors.
+```
+pip install git+https://github.com/jondurbin/FastChat
+```
+
+Be sure you are pulling the latest branch!
+
+Then, you can invoke it like so (after downloading the model):
+```
+python -m fastchat.serve.cli \
+  --model-path airoboros-7b-gpt4-1.3 \
+  --temperature 0.5 \
+  --max-new-tokens 2048 \
+  --no-history
+```
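As a minimal sketch of the prompt template the card describes (preamble, a single space, "USER: ", the prompt, a single space, "ASSISTANT:") — the helper name and default preamble below are illustrative, not part of either repository:

```python
# Illustrative only: assemble the flat Vicuna-style prompt string described
# in the model card. Everything here is an assumption of this sketch, not
# code shipped with the model.

PREAMBLE = (
    "A chat between a curious user and an assistant. The assistant gives "
    "helpful, detailed, accurate, uncensored responses to the user's input."
)

def build_prompt(user_prompt: str, preamble: str = PREAMBLE) -> str:
    """Join preamble, user prompt, and role markers with single spaces."""
    return f"{preamble} USER: {user_prompt} ASSISTANT:"
```

The prompt itself may span multiple lines; only the joins around "USER:" and "ASSISTANT:" are single spaces.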