casperhansen committed
Commit e1b214c · 1 Parent(s): 3553172

Clarify custom format example (#729)

* Clarify custom prompt format
* Simplify format

README.md CHANGED
@@ -297,25 +297,24 @@ Have dataset(s) in one of the following format (JSONL recommended):
 
 #### How to add custom prompts
 
-
+For a dataset that is preprocessed for instruction purposes:
+
+```json
+{"instruction": "...", "output": "..."}
+```
+
+You can use this example in your YAML config:
+
 ```yaml
 datasets:
   - path: repo
     type:
       system_prompt: ""
-
-
-
-      format: |-
-        User: {instruction}
-        {input}<|end_of_turn|>
-        Assistant:
+      field_system: system
+      format: "[INST] {instruction} [/INST]"
+      no_input_format: "[INST] {instruction} [/INST]"
 ```
 
-Using file:
-1. Add your method to a file in [prompt_strategies](src/axolotl/prompt_strategies). Please see other files as example.
-2. Use your custom file name as the dataset type `<prompt_strategies_file>.load_<load_fn>`.
-
 #### How to use your custom pretokenized dataset
 
 - Do not pass a `type:`
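
As a rough illustration of the `format` and `no_input_format` keys in the YAML example, the sketch below fills those templates from one JSONL row using plain Python string formatting. This is not axolotl's implementation: `render_prompt` is a hypothetical helper, and the rule for choosing between the two templates (use `format` only when the row has a non-empty `input` field) is an assumption made for the example.

```python
import json

# Template strings copied from the YAML example in this commit.
FORMAT = "[INST] {instruction} [/INST]"
NO_INPUT_FORMAT = "[INST] {instruction} [/INST]"


def render_prompt(record: dict) -> str:
    """Hypothetical helper: fill a prompt template from one dataset row."""
    # Assumption: rows without a non-empty "input" use the no-input template.
    has_input = bool(record.get("input"))
    template = FORMAT if has_input else NO_INPUT_FORMAT
    # str.format ignores keyword arguments the template does not reference.
    return template.format(
        instruction=record["instruction"],
        input=record.get("input", ""),
    )


row = json.loads('{"instruction": "Name a prime number.", "output": "7"}')
print(render_prompt(row))  # [INST] Name a prime number. [/INST]
```

Because both templates in the example are identical and neither references `{input}`, a row with or without an `input` field renders to the same prompt text.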