license: mit
This is NOT A GUIDE for LoRA training, so please NEVER treat it as one.
This is an ongoing document where I share personal opinions and interesting findings that I have learned or discovered during my journey in LoRA training.
To start with, LoRA training can be quick and easy with simple dataset preparation and a basic training setup, but that alone doesn't satisfy me.
In my personal experience, preparing a LoRA's dataset properly matters more than using complicated and fancy training techniques. Just like cooking, you can't make a tasty dish out of poor-quality ingredients.
Dataset Preparation
In this section, I'll outline the steps I typically follow to prepare the dataset for a LoRA model. I'll skip the "How to Get Image Data" part since there are numerous tools and methods available for obtaining image data.
Remove Non-Static Image Files: After gathering all the necessary image data for a LoRA, remove every file that is not a static image (e.g., .mp4, .wav, .gif, etc.). Then, convert all remaining images that are not already in .png format into .png.
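As a rough illustration of this step, here is a minimal Python sketch. The extension list, the function names `plan_cleanup` and `apply_cleanup`, and the use of the third-party Pillow library for conversion are all my own assumptions, not part of any particular training tool. Try it on a copy of the dataset first, since it deletes files.

```python
from pathlib import Path

# Assumption: the extensions treated as static images; adjust to your dataset.
STATIC_IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}

def plan_cleanup(folder):
    """Sort files into (to_delete, to_convert, already_png) without touching the disk."""
    to_delete, to_convert, already_png = [], [], []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        ext = path.suffix.lower()
        if ext == ".png":
            already_png.append(path)
        elif ext in STATIC_IMAGE_EXTS:
            to_convert.append(path)
        else:
            # Videos, audio, GIFs, and anything else that is not a static image.
            to_delete.append(path)
    return to_delete, to_convert, already_png

def apply_cleanup(folder):
    """Delete non-image files and convert the remaining images to PNG."""
    from PIL import Image  # third-party dependency: pip install Pillow

    to_delete, to_convert, _ = plan_cleanup(folder)
    for path in to_delete:
        path.unlink()
    for path in to_convert:
        target = path.with_suffix(".png")
        if target.exists():
            continue  # leave name clashes for manual review
        with Image.open(path) as img:
            # PNG cannot store every mode (e.g. CMYK), so fall back to RGB.
            if img.mode not in ("1", "L", "LA", "P", "RGB", "RGBA"):
                img = img.convert("RGB")
            img.save(target)
        path.unlink()
```

Calling `plan_cleanup("dataset")` first shows what would be deleted or converted before `apply_cleanup("dataset")` changes anything.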
Remove Duplicates: Take some time to go through the images and remove any duplicates. This is particularly important for large datasets, as duplicates can occur frequently.
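Part of this check can be automated. The sketch below is my own assumption of one way to do it, with a hypothetical helper name `find_exact_duplicates`: it hashes each file with SHA-256 and reports files whose bytes are identical. It only catches exact copies; resized or re-encoded duplicates still need a manual pass.

```python
import hashlib
from pathlib import Path

def find_exact_duplicates(folder):
    """Return (duplicate, original) path pairs for files with identical bytes."""
    seen = {}        # digest -> first path seen with that content
    duplicates = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:
            duplicates.append((path, seen[digest]))
        else:
            seen[digest] = path
    return duplicates

if __name__ == "__main__":
    # "dataset" is a placeholder folder name.
    if Path("dataset").is_dir():
        for dup, original in find_exact_duplicates("dataset"):
            print(f"{dup.name} duplicates {original.name}")
```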
Upscale Low-Resolution Images: Upscale all low-resolution images to at least 2k resolution. This step is crucial for improving image quality and ensuring better performance during training. Let's consider an example:
We have two copies of the same PNG image, one with dimensions 907x823 pixels and the other with dimensions 3624x3288 pixels, achieved by a simple 4x upscale from SD WebUI.
Now, let's crop them to roughly 1k resolution (960x832 pixels in this case) using SD WebUI's Auto-size Crop:
Notice the difference? The image with 4x upscale has sharper edges compared to the one without any upscale, resulting in a cleaner and less blurry appearance when cropped into 1k resolution.
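To find which images need upscaling, something like the following sketch could help. It assumes the dataset is already all PNG (so the size can be read straight from the file header with the standard library) and that "2k" means a longer side of at least 2048 pixels; both the threshold and the helper names are my own choices.

```python
import math
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_size(path):
    """Read (width, height) from the IHDR chunk at the start of a PNG file."""
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} is not a valid PNG file")
    return struct.unpack(">II", header[16:24])

def upscale_factor(width, height, target_long_side=2048):
    """Smallest integer factor that brings the longer side up to the target (1 if none needed)."""
    long_side = max(width, height)
    if long_side >= target_long_side:
        return 1
    return math.ceil(target_long_side / long_side)

print(upscale_factor(907, 823))    # 3: a 3x upscale is the minimum, 4x also works
print(upscale_factor(3624, 3288))  # 1: already large enough
```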
So, why is this important for LoRA learning?
Inside the LoRA training script, there's a bucketing function that automatically crops your image data to a size close to the training resolution specified in the training setup.
A clean image lets the model learn the patterns inside it quickly. For an image that was somewhat low-resolution to begin with, upscaling helps sharpen the blurry lines so that the object inside is easier to recognize.
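To make the idea concrete, here is a simplified sketch of how a bucket size might be chosen. This is NOT the code of any actual training script: the 64-pixel step, the area limit of `resolution * resolution`, and the function names are all assumptions for illustration, and real trainers may pick different sizes.

```python
import math

def pick_bucket(width, height, resolution=1024, step=64):
    """Pick a (w, h) bucket that roughly keeps the aspect ratio within resolution**2 pixels."""
    aspect = width / height
    max_area = resolution * resolution
    # Ideal sides for this aspect ratio, rounded down to a multiple of `step`.
    bucket_w = int(math.sqrt(max_area * aspect)) // step * step
    bucket_h = int(math.sqrt(max_area / aspect)) // step * step
    return max(bucket_w, step), max(bucket_h, step)

def resize_scale(width, height, bucket):
    """Scale needed for the image to cover the bucket before the excess is cropped."""
    return max(bucket[0] / width, bucket[1] / height)

for size in [(907, 823), (3624, 3288)]:
    bucket = pick_bucket(*size)
    scale = resize_scale(*size, bucket)
    action = "stretched up" if scale > 1 else "scaled down"
    print(size, "->", bucket, f"{action} by {scale:.2f}x")
```

Under these assumptions, the small image has to be stretched to fill its bucket, while the pre-upscaled one is only scaled down, which matches the sharpness difference seen in the crops above.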
To be Continued ...