hamel, winglian committed
Commit 7512c3a
1 Parent(s): 78c5b19

Add Debugging Guide (#1089)

* add debug guide

* add background

* add .gitignore

* Update devtools/dev_sharegpt.yml

Co-authored-by: Wing Lian <wing.lian@gmail.com>

* Update docs/debugging.md

Co-authored-by: Wing Lian <wing.lian@gmail.com>

* simplify example axolotl config

* add additional comments

* add video and TOC

* try jsonc for better md rendering

* style video thumbnail better

* fix footnote

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>

.gitignore CHANGED
@@ -1,5 +1,7 @@
  **/axolotl.egg-info
  configs
+ last_run_prepared/
+ .vscode

  # Byte-compiled / optimized / DLL files
  __pycache__/
.vscode/README.md ADDED
@@ -0,0 +1 @@
+ See [docs/debugging.md](../docs/debugging.md) for guidance on how to modify these files to debug axolotl with VSCode.
.vscode/launch.json ADDED
@@ -0,0 +1,34 @@
+ {
+     // Use IntelliSense to learn about possible attributes.
+     // Hover to view descriptions of existing attributes.
+     // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
+     "version": "0.2.0",
+     "configurations": [
+         {
+             "name": "Debug axolotl prompt - sharegpt",
+             "type": "python",
+             "module": "accelerate.commands.launch",
+             "request": "launch",
+             "args": [
+                 "-m", "axolotl.cli.train", "dev_sharegpt.yml",
+                 // The flags below simplify debugging by overriding the axolotl config
+                 // with the debugging tips above. Modify as needed.
+                 "--dataset_processes=1", // limits data preprocessing to one process
+                 "--max_steps=1", // limits training to just one step
+                 "--batch_size=1", // minimizes batch size
+                 "--micro_batch_size=1", // minimizes batch size
+                 "--val_set_size=0", // disables validation
+                 "--sample_packing=False", // sample packing must be disabled for small datasets
+                 "--eval_sample_packing=False", // disables sample packing on the eval set
+                 "--dataset_prepared_path=temp_debug/axolotl_outputs/data", // send data outputs to a temp folder
+                 "--output_dir=temp_debug/axolotl_outputs/model" // send model outputs to a temp folder
+             ],
+             "console": "integratedTerminal", // show output in the integrated terminal
+             "cwd": "${workspaceFolder}/devtools", // set working directory to devtools from the root of the project
+             "justMyCode": true, // step through only axolotl code
+             "env": {
+                 "CUDA_VISIBLE_DEVICES": "0", // we aren't doing distributed training, so limit to one GPU
+                 "HF_HOME": "${workspaceFolder}/devtools/temp_debug/.hf-cache" // send the HF cache to a temp folder
+             },
+             "preLaunchTask": "cleanup-for-dataprep" // delete temp folders (see below)
+         }
+     ]
+ }
.vscode/tasks.json ADDED
@@ -0,0 +1,27 @@
+ // this file is used by launch.json
+ {
+     "version": "2.0.0",
+     "tasks": [
+         // this task changes into the devtools directory and deletes the temp_debug/axolotl_outputs folder
+         {
+             "label": "delete-outputs",
+             "type": "shell",
+             "command": "rm -rf temp_debug/axolotl_outputs",
+             "options": { "cwd": "${workspaceFolder}/devtools" },
+             "problemMatcher": []
+         },
+         // this task changes into the devtools directory and deletes the `temp_debug/.hf-cache/datasets` folder
+         {
+             "label": "delete-temp-hf-dataset-cache",
+             "type": "shell",
+             "command": "rm -rf temp_debug/.hf-cache/datasets",
+             "options": { "cwd": "${workspaceFolder}/devtools" },
+             "problemMatcher": []
+         },
+         // this task combines the two tasks above
+         {
+             "label": "cleanup-for-dataprep",
+             "dependsOn": ["delete-outputs", "delete-temp-hf-dataset-cache"]
+         }
+     ]
+ }
README.md CHANGED
@@ -39,6 +39,7 @@ Features:
  - [Special Tokens](#special-tokens)
  - [Common Errors](#common-errors-)
  - [Tokenization Mismatch b/w Training & Inference](#tokenization-mismatch-bw-inference--training)
+ - [Debugging Axolotl](#debugging-axolotl)
  - [Need Help?](#need-help-)
  - [Badge](#badge-)
  - [Community Showcase](#community-showcase)
@@ -1066,7 +1067,7 @@ although this will be very slow, and using the config options above are recommen

  ## Common Errors 🧰

- See also the [FAQ's](./docs/faq.md).
+ See also the [FAQ's](./docs/faq.md) and [debugging guide](docs/debugging.md).

  > If you encounter a 'Cuda out of memory' error, it means your GPU ran out of memory during the training process. Here's how to resolve it:
@@ -1116,6 +1117,10 @@ If you decode a prompt constructed by axolotl, you might see spaces between toke

  Having misalignment between your prompts during training and inference can cause models to perform very poorly, so it is worth checking this. See [this blog post](https://hamel.dev/notes/llm/05_tokenizer_gotchas.html) for a concrete example.

+ ## Debugging Axolotl
+
+ See [this debugging guide](docs/debugging.md) for tips on debugging Axolotl, along with an example configuration for debugging with VSCode.
+
  ## Need help? 🙋‍♂️

  Join our [Discord server](https://discord.gg/HhrNrHJPRb) where we can help you
devtools/README.md ADDED
@@ -0,0 +1 @@
+ This directory contains example config files that might be useful for debugging. Please see [docs/debugging.md](../docs/debugging.md) for more information.
devtools/dev_sharegpt.yml ADDED
@@ -0,0 +1,49 @@
+ # Example config for debugging the sharegpt prompt format
+ base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
+ model_type: LlamaForCausalLM
+ tokenizer_type: LlamaTokenizer
+ is_llama_derived_model: true
+
+ load_in_8bit: true
+ load_in_4bit: false
+
+ datasets:
+   - path: philschmid/guanaco-sharegpt-style
+     type: sharegpt
+     shards: 10
+ val_set_size: 0
+ output_dir: temp_debug/axolotl_outputs/model
+ dataset_prepared_path: temp_debug/axolotl_outputs/data
+ dataset_processes: 1
+
+ sequence_len: 4096
+ sample_packing: false
+ pad_to_sequence_len: true
+
+ adapter: lora
+ lora_model_dir:
+ lora_r: 32
+ lora_alpha: 16
+ lora_dropout: 0.05
+ lora_target_linear: true
+ lora_fan_in_fan_out:
+
+ micro_batch_size: 1
+ num_epochs: 1
+ max_steps: 10
+ optimizer: adamw_bnb_8bit
+ lr_scheduler: cosine
+ learning_rate: 0.0002
+
+ train_on_inputs: false
+ group_by_length: false
+ bf16: false
+ fp16: true
+ tf32: false
+
+ gradient_checkpointing: true
+ logging_steps: 1
+ flash_attention: true
+
+ warmup_steps: 10
+ weight_decay: 0.0
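
For reference, this is the config that the debugging guide below launches through VSCode; assuming axolotl and its dependencies are installed, the same run can be reproduced manually with the command the guide mimics:

```bash
cd devtools && CUDA_VISIBLE_DEVICES=0 accelerate launch -m axolotl.cli.train dev_sharegpt.yml
```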
docs/debugging.md ADDED
@@ -0,0 +1,165 @@
+ # Debugging Axolotl
+
+ This document provides some tips and tricks for debugging Axolotl. It also provides an example configuration for debugging with VSCode. A good debugging setup is essential to understanding how Axolotl code works behind the scenes.
+
+ ## Table of Contents
+
+ - [General Tips](#general-tips)
+ - [Debugging with VSCode](#debugging-with-vscode)
+   - [Background](#background)
+   - [Configuration](#configuration)
+   - [Customizing your debugger](#customizing-your-debugger)
+   - [Video Tutorial](#video-tutorial)
+
+ ## General Tips
+
+ While debugging, it's helpful to simplify your test scenario as much as possible. Here are some tips for doing so:
+
+ > [!Important]
+ > All of these tips are incorporated into the [example configuration](#configuration) for debugging with VSCode below.
+
+ 1. **Eliminate concurrency**: Restrict the number of processes to 1 for both training and data preprocessing:
+     - Set `CUDA_VISIBLE_DEVICES` to a single GPU, e.g. `export CUDA_VISIBLE_DEVICES=0`.
+     - Set `dataset_processes: 1` in your axolotl config, or run the training command with `--dataset_processes=1`.
+ 2. **Use a small dataset**: Construct or use a small dataset from the HF Hub. With a small dataset, you will often have to set `sample_packing: False` and `eval_sample_packing: False` to avoid errors. If you are in a pinch and don't have time to construct a small dataset but still want to use one from the HF Hub, you can shard the data. Sharding still tokenizes the entire dataset, but uses only a fraction of it for training. For example, to shard the dataset into 20 pieces, add the following to your axolotl config:
+     ```yaml
+     datasets:
+       - path: ...
+         shards: 20
+     ```
+ 3. **Use a small model**: A good example of a small model is [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0).
+ 4. **Minimize iteration time**: Make sure the training loop finishes as fast as possible with these settings:
+     - `micro_batch_size: 1`
+     - `max_steps: 1`
+     - `val_set_size: 0`
+ 5. **Clear caches**: Axolotl caches certain steps, and so does the underlying HuggingFace trainer. You may want to clear some of these caches when debugging:
+     - Data preprocessing: When debugging data preprocessing, which includes prompt template formation, you may want to delete the directory set in `dataset_prepared_path:` in your axolotl config. If you didn't set this value, the default is `last_run_prepared`.
+     - HF Hub: If you are debugging data preprocessing, you should also clear the relevant [HuggingFace cache](https://huggingface.co/docs/datasets/cache) by deleting the appropriate `~/.cache/huggingface/datasets/...` folder(s); a shell sketch follows this list.
+     - **The recommended approach is to redirect all outputs and caches to a temporary folder and delete selected subfolders before each run. This is demonstrated in the example configuration below.**

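As a minimal sketch of tip 5's manual cache clearing (assuming axolotl's default `dataset_prepared_path` and the standard HuggingFace cache location; the dataset folder name is a placeholder that varies by dataset):

```bash
# delete axolotl's prepared-data cache (the default dataset_prepared_path)
rm -rf last_run_prepared/

# delete the HuggingFace datasets cache for the dataset you are debugging;
# <dataset-folder> is a placeholder -- inspect ~/.cache/huggingface/datasets first
rm -rf ~/.cache/huggingface/datasets/<dataset-folder>
```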
+ ## Debugging with VSCode
+
+ ### Background
+
+ The below example shows how to configure VSCode to debug data preprocessing of the `sharegpt` format. This is the format used when you have the following in your axolotl config:
+
+ ```yaml
+ datasets:
+   - path: <path to your sharegpt formatted dataset> # example on HF Hub: philschmid/guanaco-sharegpt-style
+     type: sharegpt
+ ```
+
+ > [!Important]
+ > If you are already familiar with advanced VSCode debugging, you can skip the below explanation and look at the files [.vscode/launch.json](../.vscode/launch.json) and [.vscode/tasks.json](../.vscode/tasks.json) for an example configuration.
+
+ > [!Tip]
+ > If you prefer to watch a video rather than read, you can skip to the [video tutorial](#video-tutorial) below (but doing both is recommended).
+
+ ### Configuration
+
+ The easiest way to get started is to modify the [.vscode/launch.json](../.vscode/launch.json) file in this project. This is just an example configuration, so you may need to modify or copy it to suit your needs.
+
+ For example, to mimic the command `cd devtools && CUDA_VISIBLE_DEVICES=0 accelerate launch -m axolotl.cli.train dev_sharegpt.yml`, you would use the configuration below[^1]. Note that we add additional flags that override the axolotl config and incorporate the tips above (see the comments). We also set the working directory to `devtools` and set the `env` variable `HF_HOME` to a temporary folder that is later partially deleted; this is because we want to delete the HF dataset cache before each run in this particular example.
+
+ ```jsonc
+ // .vscode/launch.json
+ {
+     "version": "0.2.0",
+     "configurations": [
+         {
+             "name": "Debug axolotl prompt - sharegpt",
+             "type": "python",
+             "module": "accelerate.commands.launch",
+             "request": "launch",
+             "args": [
+                 "-m", "axolotl.cli.train", "dev_sharegpt.yml",
+                 // The flags below simplify debugging by overriding the axolotl config
+                 // with the debugging tips above. Modify as needed.
+                 "--dataset_processes=1", // limits data preprocessing to one process
+                 "--max_steps=1", // limits training to just one step
+                 "--batch_size=1", // minimizes batch size
+                 "--micro_batch_size=1", // minimizes batch size
+                 "--val_set_size=0", // disables validation
+                 "--sample_packing=False", // sample packing must be disabled for small datasets
+                 "--eval_sample_packing=False", // disables sample packing on the eval set
+                 "--dataset_prepared_path=temp_debug/axolotl_outputs/data", // send data outputs to a temp folder
+                 "--output_dir=temp_debug/axolotl_outputs/model" // send model outputs to a temp folder
+             ],
+             "console": "integratedTerminal", // show output in the integrated terminal
+             "cwd": "${workspaceFolder}/devtools", // set working directory to devtools from the root of the project
+             "justMyCode": true, // step through only axolotl code
+             "env": {
+                 "CUDA_VISIBLE_DEVICES": "0", // we aren't doing distributed training, so limit to one GPU
+                 "HF_HOME": "${workspaceFolder}/devtools/temp_debug/.hf-cache" // send the HF cache to a temp folder
+             },
+             "preLaunchTask": "cleanup-for-dataprep" // delete temp folders (see below)
+         }
+     ]
+ }
+ ```
+
+ **Additional notes about this configuration:**
+
+ - The argument `justMyCode` is set to `true` so that you step through only the axolotl code. If you want to step into dependencies, set this to `false`.
+ - The `preLaunchTask` `cleanup-for-dataprep` is defined in [.vscode/tasks.json](../.vscode/tasks.json) and is used to delete the following folders before debugging, which is essential to ensure that the data pre-processing code is run from scratch:
+     - `./devtools/temp_debug/axolotl_outputs`
+     - `./devtools/temp_debug/.hf-cache/datasets`
+
+ > [!Tip]
+ > You may not want to delete these folders. For example, if you are debugging model training instead of data pre-processing, you may NOT want to delete the cache or output folders. You may also need to add additional tasks to the `tasks.json` file depending on your use case.
+
+ Below is the [.vscode/tasks.json](../.vscode/tasks.json) file that defines the `cleanup-for-dataprep` task. This task is run before each debugging session when you use the above configuration. Note that two tasks delete the two folders mentioned above, and a third composite task, `cleanup-for-dataprep`, combines them. A composite task is necessary because VSCode does not allow you to specify multiple tasks in the `preLaunchTask` argument of the `launch.json` file.
+
+ ```jsonc
+ // .vscode/tasks.json
+ // this file is used by launch.json
+ {
+     "version": "2.0.0",
+     "tasks": [
+         // this task changes into the devtools directory and deletes the temp_debug/axolotl_outputs folder
+         {
+             "label": "delete-outputs",
+             "type": "shell",
+             "command": "rm -rf temp_debug/axolotl_outputs",
+             "options": { "cwd": "${workspaceFolder}/devtools" },
+             "problemMatcher": []
+         },
+         // this task changes into the devtools directory and deletes the `temp_debug/.hf-cache/datasets` folder
+         {
+             "label": "delete-temp-hf-dataset-cache",
+             "type": "shell",
+             "command": "rm -rf temp_debug/.hf-cache/datasets",
+             "options": { "cwd": "${workspaceFolder}/devtools" },
+             "problemMatcher": []
+         },
+         // this task combines the two tasks above
+         {
+             "label": "cleanup-for-dataprep",
+             "dependsOn": ["delete-outputs", "delete-temp-hf-dataset-cache"]
+         }
+     ]
+ }
+ ```
+
+ ### Customizing your debugger
+
+ Your debugging use case may differ from the example above. The easiest approach is to put your own axolotl config in the `devtools` folder and modify the `launch.json` file to use it. You may also want to modify the `preLaunchTask` to delete different folders, or to not delete anything at all; a sketch of such a customization follows.
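As a minimal sketch (assuming your config is saved as `devtools/my_config.yml`; the name and the omitted options are placeholders for your own setup), a customized entry in the `configurations` array might look like:

```jsonc
// hypothetical extra entry for .vscode/launch.json
{
    "name": "Debug axolotl - my config",
    "type": "python",
    "module": "accelerate.commands.launch",
    "request": "launch",
    "args": ["-m", "axolotl.cli.train", "my_config.yml"], // your own config in devtools/
    "cwd": "${workspaceFolder}/devtools",
    "env": {"CUDA_VISIBLE_DEVICES": "0"}
    // "preLaunchTask" is omitted here so that no folders are deleted before the run
}
```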
+
+ ### Video Tutorial
+
+ The following video tutorial walks through the above configuration and demonstrates how to debug with VSCode (click the image below to watch):
+
+ <div style="text-align: center; line-height: 0;">
+
+ <a href="https://youtu.be/xUUB11yeMmc?si=z6Ea1BrRYkq6wsMx" target="_blank"
+ title="How to debug Axolotl (for fine tuning LLMs)"><img
+ src="https://i.ytimg.com/vi/xUUB11yeMmc/maxresdefault.jpg"
+ style="border-radius: 10px; display: block; margin: auto;" width="560" height="315" /></a>
+
+ <figcaption style="font-size: smaller;"><a href="https://hamel.dev">Hamel Husain's</a> tutorial: <a href="https://www.youtube.com/watch?v=xUUB11yeMmc">Debugging Axolotl w/VSCode</a></figcaption>
+
+ </div>
+ <br>
+
+ [^1]: The config actually mimics the command `CUDA_VISIBLE_DEVICES=0 python -m accelerate.commands.launch -m axolotl.cli.train devtools/dev_sharegpt.yml`, but this amounts to the same thing.