hamel committed on
Commit
629450c
•
1 Parent(s): 2a1589f

Bootstrap Hosted Axolotl Docs w/Quarto (#1429)


* precommit

* mv styes.css

* fix links

.github/workflows/docs.yml ADDED
@@ -0,0 +1,28 @@
+ name: Publish Docs
+ on:
+   push:
+     branches:
+       - main
+
+ permissions:
+   contents: write
+   pages: write
+
+ jobs:
+   build-deploy:
+     runs-on: ubuntu-latest
+     steps:
+       - name: Check out repository
+         uses: actions/checkout@v4
+       - name: Set up Quarto
+         uses: quarto-dev/quarto-actions/setup@v2
+       - name: Setup Python
+         uses: actions/setup-python@v3
+         with:
+           python-version: '3.10'
+       - name: Publish to GitHub Pages (and render)
+         uses: quarto-dev/quarto-actions/publish@v2
+         with:
+           target: gh-pages
+         env:
+           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
.gitignore CHANGED
@@ -2,6 +2,7 @@
  configs
  last_run_prepared/
  .vscode
+ _site/
  
  # Byte-compiled / optimized / DLL files
  __pycache__/
@@ -172,3 +173,5 @@ wandb
  lora-out/*
  qlora-out/*
  mlruns/*
+
+ /.quarto/
README.md CHANGED
@@ -149,7 +149,7 @@ accelerate launch -m axolotl.cli.train https://raw.githubusercontent.com/OpenAcc
  ```
  
  >[!Tip]
- > If you want to debug axolotl or prefer to use Docker as your development environment, see the [debugging guide's section on Docker](docs/debugging.md#debugging-with-docker).
+ > If you want to debug axolotl or prefer to use Docker as your development environment, see the [debugging guide's section on Docker](docs/debugging.qmd#debugging-with-docker).
  
  <details>
  
@@ -267,7 +267,7 @@ Use the below instead of the install method in QuickStart.
  ```
  pip3 install -e '.'
  ```
- More info: [mac.md](/docs/mac.md)
+ More info: [mac.md](/docs/mac.qmd)
  
  #### Launching on public clouds via SkyPilot
  To launch on GPU instances (both on-demand and spot instances) on 7+ clouds (GCP, AWS, Azure, OCI, and more), you can use [SkyPilot](https://skypilot.readthedocs.io/en/latest/index.html):
@@ -409,7 +409,7 @@ pretraining_dataset: # hf path only
  {"segments": [{"label": true|false, "text": "..."}]}
  ```
  
- This is a special format that allows you to construct prompts without using templates. This is for advanced users who want more freedom with prompt construction. See [these docs](docs/input_output.md) for more details.
+ This is a special format that allows you to construct prompts without using templates. This is for advanced users who want more freedom with prompt construction. See [these docs](docs/input_output.qmd) for more details.
  
  ##### Conversation
  
@@ -1125,7 +1125,7 @@ fsdp_config:
  
  ##### FSDP + QLoRA
  
- Axolotl supports training with FSDP and QLoRA, see [these docs](docs/fsdp_qlora.md) for more information.
+ Axolotl supports training with FSDP and QLoRA, see [these docs](docs/fsdp_qlora.qmd) for more information.
  
  ##### Weights & Biases Logging
  
@@ -1204,7 +1204,7 @@ although this will be very slow, and using the config options above are recommen
  
  ## Common Errors 🧰
  
- See also the [FAQ's](./docs/faq.md) and [debugging guide](docs/debugging.md).
+ See also the [FAQ's](./docs/faq.qmd) and [debugging guide](docs/debugging.qmd).
  
  > If you encounter a 'Cuda out of memory' error, it means your GPU ran out of memory during the training process. Here's how to resolve it:
  
@@ -1238,7 +1238,7 @@ It's safe to ignore it.
  
  > NCCL Timeouts during training
  
- See the [NCCL](docs/nccl.md) guide.
+ See the [NCCL](docs/nccl.qmd) guide.
  
  
  ### Tokenization Mismatch b/w Inference & Training
@@ -1256,7 +1256,7 @@ Having misalignment between your prompts during training and inference can cause
  
  ## Debugging Axolotl
  
- See [this debugging guide](docs/debugging.md) for tips on debugging Axolotl, along with an example configuration for debugging with VSCode.
+ See [this debugging guide](docs/debugging.qmd) for tips on debugging Axolotl, along with an example configuration for debugging with VSCode.
  
  ## Need help? 🙋
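The README hunks above all make the same mechanical change: link targets under `docs/` move from `.md` to `.qmd`, while link text and anchors stay as they were. As a rough sketch only (this script is not part of the commit, and the helper name `retarget_doc_links` and its regex are assumptions made for illustration), such a rewrite could be scripted like this:

```python
import re

# Hypothetical helper, not from the commit: rewrite Markdown link targets under
# docs/ from .md to .qmd. Handles "docs/", "./docs/", "/docs/" and "../docs/"
# prefixes, keeps any "#anchor", and never touches the link text.
DOC_LINK = re.compile(r"(\]\((?:\.{0,2}/)?docs/[\w\-]+)\.md(?=[#)])")


def retarget_doc_links(markdown: str) -> str:
    """Return `markdown` with docs/*.md link targets pointing at docs/*.qmd."""
    return DOC_LINK.sub(r"\1.qmd", markdown)


sample = "See the [NCCL](docs/nccl.md) guide and [mac.md](/docs/mac.md)."
print(retarget_doc_links(sample))
```

Because the pattern requires `](` immediately before the path, absolute URLs that merely contain `docs/` are left alone.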
 
_quarto.yml ADDED
@@ -0,0 +1,51 @@
+ project:
+   type: website
+
+ website:
+   title: "Axolotl"
+   description: "Fine-tuning"
+   favicon: favicon.jpg
+   navbar:
+     title: Axolotl
+     background: dark
+     pinned: false
+     collapse: false
+     tools:
+       - icon: twitter
+         href: https://twitter.com/axolotl_ai
+       - icon: github
+         href: https://github.com/OpenAccess-AI-Collective/axolotl/
+       - icon: discord
+         href: https://discord.gg/7m9sfhzaf3
+
+   sidebar:
+     pinned: true
+     collapse-level: 2
+     style: docked
+     contents:
+       - text: Home
+         href: index.qmd
+       - section: "How-To Guides"
+         contents:
+           # TODO Edit folder structure after we have more docs.
+           - docs/debugging.qmd
+           - docs/multipack.qmd
+           - docs/fsdp_qlora.qmd
+           - docs/input_output.qmd
+           - docs/rlhf.qmd
+           - docs/nccl.qmd
+           - docs/mac.qmd
+           - docs/multi-node.qmd
+       - section: "Reference"
+         contents:
+           - docs/config.qmd
+           - docs/faq.qmd
+
+
+
+
+ format:
+   html:
+     theme: materia
+     css: styles.css
+     toc: true
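Every `.qmd` path listed under `sidebar.contents` has to match a file in the repository, so a small consistency check can help when documents are renamed. The following is a minimal sketch, not part of this commit: `missing_sidebar_pages` is a hypothetical helper, and it scans the config text with a regex instead of a YAML parser to stay within the standard library.

```python
import re

# Matches sidebar entries such as "- docs/debugging.qmd" or "href: index.qmd".
QMD_ENTRY = re.compile(r"^\s*(?:-\s+|href:\s+)([\w\-/]+\.qmd)\s*$", re.MULTILINE)


def missing_sidebar_pages(quarto_yml: str, existing: set[str]) -> list[str]:
    """List .qmd paths named in the config text that are not in `existing`."""
    return [path for path in QMD_ENTRY.findall(quarto_yml) if path not in existing]


# Hypothetical, shortened config used only for this demonstration.
config = """
sidebar:
  contents:
    - text: Home
      href: index.qmd
    - section: "How-To Guides"
      contents:
        - docs/debugging.qmd
        - docs/nccl.qmd
"""
print(missing_sidebar_pages(config, {"index.qmd", "docs/debugging.qmd"}))
```

In a real checkout, `existing` could be built from the files on disk, for example with `pathlib.Path.rglob("*.qmd")`.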
devtools/README.md CHANGED
@@ -1 +1 @@
- This directory contains example config files that might be useful for debugging. Please see [docs/debugging.md](../docs/debugging.md) for more information.
+ This directory contains example config files that might be useful for debugging. Please see [docs/debugging.qmd](../docs/debugging.qmd) for more information.
docs/.gitignore ADDED
@@ -0,0 +1,2 @@
+ /.quarto/
+ _site/
docs/config.qmd ADDED
@@ -0,0 +1,17 @@
+ ---
+ title: Config options
+ description: A complete list of all configuration options.
+ ---
+
+ ```{python}
+ #|echo: false
+ #|output: asis
+ import re
+ # Regex pattern to match the YAML block including its code fence
+ pattern = r'<details[^>]*id="all-yaml-options"[^>]*>.*?<summary>All yaml options.*?```yaml(.*?)```.*?</details>'
+
+ with open('../README.md', 'r') as f:
+     doc = f.read()
+ match = re.search(pattern, doc, re.DOTALL)
+ print("```yaml", match.group(1).strip(), "```", sep="\n")
+ ```
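The cell above calls `match.group(1)` directly, so if the README's `<details id="all-yaml-options">` block is ever renamed or removed, `re.search` returns `None` and rendering would fail with an `AttributeError`. A guarded variant might look like the sketch below. It is not part of this commit: `extract_yaml_options` is a hypothetical name, the sample README fragment is made up, and the fence marker is built indirectly only so the example does not contain a literal code fence.

```python
import re

FENCE = "`" * 3  # three backticks, built indirectly to keep this example fence-free

# Same expression as the cell in docs/config.qmd, assembled from pieces.
PATTERN = (
    r'<details[^>]*id="all-yaml-options"[^>]*>.*?<summary>All yaml options.*?'
    + FENCE + r"yaml(.*?)" + FENCE + r".*?</details>"
)


def extract_yaml_options(readme_text: str) -> str:
    """Return the YAML options block, or raise if the README layout changed."""
    match = re.search(PATTERN, readme_text, re.DOTALL)
    if match is None:
        raise ValueError('no <details id="all-yaml-options"> YAML block found')
    return match.group(1).strip()


# Hypothetical README fragment used only for this demonstration.
sample = (
    '<details id="all-yaml-options">\n'
    "<summary>All yaml options (click to expand)</summary>\n\n"
    f"{FENCE}yaml\nbase_model: demo\n{FENCE}\n\n"
    "</details>\n"
)
print(extract_yaml_options(sample))
```

Raising a descriptive error makes a broken docs build easier to trace back to a README edit than a bare `AttributeError` would.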
docs/{debugging.md → debugging.qmd} RENAMED
@@ -1,4 +1,8 @@
- # Debugging Axolotl
+ ---
+ title: Debugging
+ description: How to debug Axolotl
+ ---
+
  
  This document provides some tips and tricks for debugging Axolotl. It also provides an example configuration for debugging with VSCode. A good debugging setup is essential to understanding how Axolotl code works behind the scenes.
  
docs/faq.md DELETED
@@ -1,18 +0,0 @@
- # Axolotl FAQ's
-
-
- > The trainer stopped and hasn't progressed in several minutes.
-
- Usually an issue with the GPU's communicating with each other. See the [NCCL doc](../docs/nccl.md)
-
- > Exitcode -9
-
- This usually happens when you run out of system RAM.
-
- > Exitcode -7 while using deepspeed
-
- Try upgrading deepspeed w: `pip install -U deepspeed`
-
- > AttributeError: 'DummyOptim' object has no attribute 'step'
-
- You may be using deepspeed with single gpu. Please don't set `deepspeed:` in yaml or cli.
docs/faq.qmd ADDED
@@ -0,0 +1,21 @@
+ ---
+ title: FAQ
+ description: Frequently asked questions
+ ---
+
+
+ **Q: The trainer stopped and hasn't progressed in several minutes.**
+
+ > A: Usually an issue with the GPUs communicating with each other. See the [NCCL doc](nccl.qmd)
+
+ **Q: Exitcode -9**
+
+ > A: This usually happens when you run out of system RAM.
+
+ **Q: Exitcode -7 while using deepspeed**
+
+ > A: Try upgrading deepspeed w: `pip install -U deepspeed`
+
+ **Q: AttributeError: 'DummyOptim' object has no attribute 'step'**
+
+ > A: You may be using deepspeed with single gpu. Please don't set `deepspeed:` in yaml or cli.
docs/{fsdp_qlora.md → fsdp_qlora.qmd} RENAMED
@@ -1,4 +1,10 @@
- # FDSP + QLoRA
+ ---
+ title: FSDP + QLoRA
+ description: Use FSDP with QLoRA to fine-tune large LLMs on consumer GPUs.
+ format:
+   html:
+     toc: true
+ ---
  
  ## Background
  
docs/{input_output.md → input_output.qmd} RENAMED
@@ -1,4 +1,7 @@
- # Template-free prompt construction with the `input_output` format
+ ---
+ title: Template-free prompt construction
+ description: "Template-free prompt construction with the `input_output` format"
+ ---
  
  <!-- TOC -->
  
docs/{mac.md → mac.qmd} RENAMED
@@ -1,8 +1,12 @@
- # Mac M series support
+ ---
+ title: Mac M-series
+ description: Mac M-series support
+ ---
  
  Currently Axolotl on Mac is partially usable, many of the dependencies of Axolotl including Pytorch do not support MPS or have incomplete support.
  
  Current support:
+
  - [x] Support for all models
  - [x] Full training of models
  - [x] LoRA training
docs/{multi-node.md → multi-node.qmd} RENAMED
@@ -1,4 +1,7 @@
- # Multi Node
+ ---
+ title: Multi Node
+ description: How to use Axolotl on multiple machines
+ ---
  
  You will need to create a configuration for accelerate, either by using `accelerate config` and follow the instructions or you can use one of the preset below:
  
docs/{multipack.md → multipack.qmd} RENAMED
@@ -1,4 +1,7 @@
- # Multipack (Sample Packing)
+ ---
+ title: Multipack (Sample Packing)
+ description: Multipack is a technique to pack multiple sequences into a single batch to increase training throughput.
+ ---
  
  ## Visualization of Multipack with Flash Attention
  
docs/{nccl.md → nccl.qmd} RENAMED
@@ -1,4 +1,7 @@
- # NCCL
+ ---
+ title: NCCL
+ description: Troubleshooting NCCL issues
+ ---
  
  NVIDIA NCCL is a library to facilitate and optimize multi-GPU communication operations, such as broadcast, all-gather, reduce, all-reduce, etc. Broadly, NCCL configuration is highly environment-specific and is configured via several [environment variables](https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/env.html). A common NCCL-related problem occurs when a long-running operation times out causing the training process to abort:
  
docs/{rlhf.md → rlhf.qmd} RENAMED
@@ -1,4 +1,7 @@
- # RLHF (Beta)
+ ---
+ title: "RLHF (Beta)"
+ description: "Reinforcement Learning from Human Feedback is a method whereby a language model is optimized from data using human feedback."
+ ---
  
  ### Overview
  
favicon.jpg ADDED
index.qmd ADDED
@@ -0,0 +1,19 @@
+
+
+ ```{python}
+ #|output: asis
+ #|echo: false
+
+ # This cell steals the README as the home page for now, but excludes the table of contents (quarto adds its own)
+ import re
+ pattern = re.compile(
+     r"<table>\s*<tr>\s*<td>\s*## Table of Contents.*?</td>\s*</tr>\s*</table>",
+     re.DOTALL | re.IGNORECASE
+ )
+
+ with open('README.md', 'r') as f:
+     txt = f.read()
+
+ cleaned = pattern.sub("", txt)
+ print(cleaned)
+ ```
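To see what the cell above does to the README, the same expression can be tried on a small stand-in string. This is only an illustrative sketch: the `sample` text below is a made-up fragment, not the actual README, and the name `TOC_TABLE` is introduced here for readability.

```python
import re

# Same expression as the cell in index.qmd.
TOC_TABLE = re.compile(
    r"<table>\s*<tr>\s*<td>\s*## Table of Contents.*?</td>\s*</tr>\s*</table>",
    re.DOTALL | re.IGNORECASE,
)

# Hypothetical README fragment; the real file is much longer.
sample = """# Axolotl

<table>
<tr>
<td>

## Table of Contents
- [Introduction](#axolotl)

</td>
</tr>
</table>

## Quickstart
"""

cleaned = TOC_TABLE.sub("", sample)
print(cleaned)
```

The lazy `.*?` together with `re.DOTALL` lets the match span the multi-line table while stopping at the first closing `</td></tr></table>` sequence, so content after the table is preserved.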
styles.css ADDED
@@ -0,0 +1 @@
+ /* css styles */