k4d3 committed
Commit 8b74397
1 Parent(s): 2e5c42f

Signed-off-by: Balazs Horvath <acsipont@gmail.com>

Files changed (1):
  1. README.md (+60 -7)
README.md CHANGED
@@ -30,11 +30,17 @@ The Yiff Toolkit is a comprehensive set of tools designed to enhance your creati
  - [Pony Training](#pony-training)
  - [Download Pony in Diffusers Format](#download-pony-in-diffusers-format)
  - [Sample Prompt File](#sample-prompt-file)
+ - [`--lowram`](#--lowram)
+ - [`--pretrained_model_name_or_path`](#--pretrained_model_name_or_path)
+ - [`--train_data_dir`](#--train_data_dir)
+ - [`--resolution`](#--resolution)
+ - [`--optimizer_type`](#--optimizer_type)
  - [`--dataset_repeats`](#--dataset_repeats)
  - [`--max_train_steps`](#--max_train_steps)
  - [`--shuffle_caption`](#--shuffle_caption)
  - [`--sdpa`](#--sdpa)
- - [`--sample_sampler`](#--sample_sampler)
+ - [`--sample_prompts --sample_sampler --sample_every_n_steps`](#--sample_prompts---sample_sampler---sample_every_n_steps)
+ - [CosXL Training](#cosxl-training)
  - [Embeddings for 1.5 and SDXL](#embeddings-for-15-and-sdxl)
  - [ComfyUI Walkthrough any%](#comfyui-walkthrough-any)
  - [AnimateDiff for Masochists](#animatediff-for-masochists)
@@ -102,17 +108,20 @@ The Yiff Toolkit is a comprehensive set of tools designed to enhance your creati

  ### Installation Tips

+ ---
+
  Firstly, download kohya_ss' [sd-scripts](https://github.com/kohya-ss/sd-scripts) and set up your environment: for Windows, follow [these instructions](https://github.com/kohya-ss/sd-scripts?tab=readme-ov-file#windows-installation); if you are using Linux or Miniconda on Windows, you are probably smart enough to figure out the installation yourself. I recommend always installing the latest [PyTorch](https://pytorch.org/get-started/locally/) in the virtual environment you are going to use, which at the time of writing is `2.2.2`. I hope future me has faster PyTorch!

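To make that concrete, here is a minimal sketch of the setup on a Linux shell; the clone URL is from above, while the virtual-environment layout and the exact `pip` lines are assumptions you should adapt to your platform and CUDA version:

```bash
# clone kohya_ss' sd-scripts and enter the repository
git clone https://github.com/kohya-ss/sd-scripts
cd sd-scripts

# create and activate a fresh virtual environment (assumed layout)
python -m venv venv
source venv/bin/activate

# install the latest PyTorch first (2.2.2 at the time of writing),
# picking the right command for your CUDA version on pytorch.org,
# then the training requirements
pip install torch torchvision
pip install -r requirements.txt
```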
- If someone told you to install `xformers`, call them stinky, because ever since the fused implementation of `sdpa` landed in torch, it has been the king of my benchmarks.
- For training, you will have to go with either `--sdpa` or `--xformers`.

  ### Dataset Preparation

+ ---
+
  ⚠️ **TODO:** Awoo this section.

  ### Pony Training

+ ---
+
  I'm not going to lie, it is a bit complicated to explain everything. But here is my best attempt at going through some "basic" stuff and almost all lines in order.

  #### Download Pony in Diffusers Format
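The body of this section is elided by the hunk boundary, but its command survives in the next hunk's `@@` context line; for completeness:

```bash
git clone https://huggingface.co/k4d3/ponydiffusers
```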
@@ -125,7 +134,7 @@ git clone https://huggingface.co/k4d3/ponydiffusers

  #### Sample Prompt File

- A sample prompt file is used during training to sample images. A sample prompt for example might look like this for Pony.
+ A sample prompt file is used during training to sample images. For Pony, a sample prompt might look like this:

  ```py
  # anthro female kindred
@@ -136,6 +145,42 @@ score_9, score_8_up, score_7_up, score_6_up, rating_explicit, source_furry, solo
  score_9, score_8_up, score_7_up, score_6_up, rating_explicit, source_furry, solo, anthro male fox, glowing yellow eyes, night, crescent moon, tibetan necklace, gold bracers, blue and gold adorned loincloth, canine genitalia, knot, amazing_background, scenery porn, white marble ruins in the background, realistic, photo, photo (medium), photography (artwork) --n low quality, worst quality --w 1024 --h 1024 --d 1 --l 6.0 --s 40
  ```
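For orientation, here is my reading of the inline options in those sample prompts; this is an assumption based on sd-scripts' sample-prompt syntax, not something stated in the diff:

```py
# <prompt> --n <negative prompt> --w <width> --h <height> --d <seed> --l <CFG scale> --s <steps>
```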

+ #### `--lowram`
+
+ If you are running out of RAM, like I do with 2 GPUs and a really fat model, this option will help you save a bit of it and might get you out of OOM hell.
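A sketch of how it would be passed, following the flag style used below; `--lowram` is a switch that takes no value:

```py
--lowram \
```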
+
+ #### `--pretrained_model_name_or_path`
+
+ The directory containing the checkpoint you just downloaded. I recommend closing the path with a `/` if you are using a local model.
+
+ ```py
+ --pretrained_model_name_or_path="/ponydiffusers/" \
+ ```
+
+ #### `--train_data_dir`
+
+ The directory containing the dataset. We prepared this earlier together.
+
+ ```py
+ --train_data_dir="/training_dir" \
+ ```
+
+ #### `--resolution`
+
+ Always set this to match the model's resolution, which in Pony's case is 1024x1024. If you can't fit into VRAM, you can decrease it to `512,512` as a last resort.
+
+ ```py
+ --resolution="512,512" \
+ ```
+
+ #### `--optimizer_type`
+
+ The default optimizer is `AdamW`, and new ones get added every month or so, therefore I'm not listing them all; you can find the list if you really want. `AdamW` is the best as of this writing, so that is what we use!
+
+ ```py
+ --optimizer_type="AdamW" \
+ ```
+
  #### `--dataset_repeats`

  Repeats the dataset when training with captions; by default it is set to `1`, so we'll set this to `0` with:
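The hunk cuts off before the snippet this sentence introduces; by analogy with the surrounding flags it would presumably be:

```py
--dataset_repeats=0 \
```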
@@ -164,24 +209,32 @@ As you can tell, I have separated the caption parts, not just the tags, with a `,`

  The choice between `--xformers` and `--sdpa` will depend on your GPU. You can benchmark it by repeating a training run with both!

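Concretely, benchmarking just means launching the same command twice and swapping one attention flag between runs; a minimal sketch, not part of the diff:

```py
# run A: PyTorch fused attention
--sdpa \
# run B: identical command, but with this instead of --sdpa
--xformers \
```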
- #### `--sample_sampler`
+ #### `--sample_prompts --sample_sampler --sample_every_n_steps`

  You have the option of generating images during training so you can check the progress; the argument lets you pick between different samplers. By default it is `ddim`, so you better change it!

  You can also use `--sample_every_n_epochs` instead, which will take precedence over steps. The `k_` prefix means Karras and the `_a` suffix means ancestral.

  ```py
+ --sample_prompts=/training_dir/sample-prompts.txt \
  --sample_sampler="euler_a" \
  --sample_every_n_steps=100
  ```

  My recommendation for Pony is to use `euler_a` for toony styles and `k_dpm_2` for realistic ones.
- Your options include the following:
+
+ Your sampler options include the following:

  ```bash
  ddim, pndm, lms, euler, euler_a, heun, dpm_2, dpm_2_a, dpmsolver, dpmsolver++, dpmsingle, k_lms, k_euler, k_euler_a, k_dpm_2, k_dpm_2_a
  ```

+ ### CosXL Training
+
+ The only difference with CosXL training is that you need to enable `--v_parameterization`, and you can't sample the images. 😹 I also don't recommend using the `block_dims` and `block_alphas` from Pony.
+
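As a sketch in the same flag style as the rest of the walkthrough (it is a switch with no value):

```py
--v_parameterization \
```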
+ ---
+
  ## Embeddings for 1.5 and SDXL

  Embeddings in Stable Diffusion are high-dimensional representations of input data, such as images or text, that capture their essential features and relationships. These embeddings are used to guide the diffusion process, enabling the model to generate outputs that closely match the desired characteristics specified in the input.
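As a concrete illustration of one way such an embedding is applied (a hypothetical sketch using diffusers' textual-inversion loader; the file name and trigger token are made up):

```py
from diffusers import StableDiffusionPipeline
import torch

# load a 1.5-class base model (example checkpoint)
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# load a trained embedding and bind it to a trigger token
# (file and token are hypothetical)
pipe.load_textual_inversion("my_embedding.safetensors", token="<my-style>")

# the trigger token now pulls the learned concept into generation
image = pipe("a portrait in <my-style> style").images[0]
image.save("out.png")
```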