---
tags:
- text-to-image
- stable-diffusion
---

# Control-LoRA Model Card

## Introduction

By adding low-rank, parameter-efficient fine-tuning to ControlNet, we introduce ***Control-LoRAs***. This approach offers a more efficient and compact way to bring model control to a wider variety of consumer GPUs.

For each model below, you'll find:

- *Rank 256* files (reducing the original `4.7GB` ControlNet models down to `~738MB` Control-LoRA models)
- experimental *Rank 128* files (reducing the model down to `~377MB`)

Each Control-LoRA has been trained on a diverse range of image concepts and aspect ratios.
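
To give a feel for why a lower rank produces a smaller file, here is a minimal sketch of the parameter count of a generic low-rank factorization. It assumes a single weight matrix is replaced by a product `B @ A` of rank `rank`; the layer size and the helper names `dense_params` and `low_rank_params` are hypothetical, and this is not a description of how the Control-LoRA files are actually laid out.

```python
def dense_params(d_in: int, d_out: int) -> int:
    """Number of parameters in a full d_out x d_in weight matrix."""
    return d_in * d_out


def low_rank_params(d_in: int, d_out: int, rank: int) -> int:
    """Number of parameters in a factorization B @ A,
    with B of shape (d_out, rank) and A of shape (rank, d_in)."""
    return rank * (d_in + d_out)


# Hypothetical layer size, chosen only for illustration.
d_in, d_out = 1280, 1280
full = dense_params(d_in, d_out)

for rank in (256, 128):
    lora = low_rank_params(d_in, d_out, rank)
    print(f"rank {rank}: {lora} of {full} parameters ({lora / full:.0%})")
```

In this simplified model, halving the rank halves the number of low-rank parameters, which is in line with the roughly 2x difference between the Rank 256 and Rank 128 file sizes listed above.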

### MiDaS and ClipDrop Depth

data:image/s3,"s3://crabby-images/b050b/b050b1c7d433e5606ffef024a53bf5772c92e874" alt="canny"

Depth estimation is an image processing technique that determines the distance of objects in a scene, providing a depth map that highlights variations in proximity.

This Control-LoRA uses a grayscale depth map to guide generation.

In the example above, we compare the depth results of `MiDaS dpt_beit_large_512` and the `Portrait Depth Estimation` model (available in the [ClipDrop API by Stability AI](https://clipdrop.co/apis/docs/portrait-depth-estimation)).
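
Depth estimators usually return floating-point values rather than an 8-bit image, so some conversion to grayscale is needed before the map can be used as a control image. The sketch below shows one common way to do this (min-max normalization with NumPy). The helper name `depth_to_grayscale` and the synthetic input are assumptions for illustration; which end of the range counts as "near" depends on the estimator, and the convention this Control-LoRA expects is not stated here.

```python
import numpy as np


def depth_to_grayscale(depth: np.ndarray) -> np.ndarray:
    """Min-max normalize a 2-D depth array to uint8 values in [0, 255]."""
    depth = depth.astype(np.float32)
    lo, hi = float(depth.min()), float(depth.max())
    if hi - lo < 1e-8:
        # A flat map carries no depth variation; return an all-black image.
        return np.zeros(depth.shape, dtype=np.uint8)
    scaled = (depth - lo) / (hi - lo)
    return np.round(scaled * 255.0).astype(np.uint8)


# Synthetic stand-in for an estimator's raw output (not real MiDaS data).
raw = np.linspace(0.5, 10.0, num=16, dtype=np.float32).reshape(4, 4)
gray = depth_to_grayscale(raw)
print(gray.dtype, gray.shape, gray.min(), gray.max())
```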

### Canny Edge

data:image/s3,"s3://crabby-images/bab17/bab178292fed067feca728c76c13418a71b08c94" alt="canny"

Canny Edge Detection is an image processing technique that identifies abrupt changes in intensity to highlight edges in an image.

This Control-LoRA uses the edges from an image to generate the final image.
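
To illustrate what "abrupt changes in intensity" means in practice, the sketch below thresholds the gradient magnitude of a grayscale image using NumPy only. This is a simplified stand-in and not the full Canny algorithm: it has no Gaussian smoothing, non-maximum suppression, or hysteresis thresholding. In a real workflow a complete implementation such as OpenCV's `cv2.Canny` would normally be used. The helper name `gradient_edges` and the threshold value are assumptions for illustration.

```python
import numpy as np


def gradient_edges(gray: np.ndarray, threshold: float = 50.0) -> np.ndarray:
    """Return a white-on-black uint8 edge map from a 2-D grayscale image."""
    img = gray.astype(np.float32)
    # np.gradient returns finite differences along rows (y), then columns (x).
    gy, gx = np.gradient(img)
    magnitude = np.hypot(gx, gy)
    return np.where(magnitude >= threshold, 255, 0).astype(np.uint8)


# Synthetic test image: dark left half, bright right half.
image = np.zeros((8, 8), dtype=np.uint8)
image[:, 4:] = 255

edges = gradient_edges(image)
print(edges)
```

Only the columns on either side of the dark-to-bright boundary exceed the threshold, so the edge map is black everywhere except along that boundary.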

### Photograph and Sketch Colorizer

data:image/s3,"s3://crabby-images/0c18d/0c18d185a9d192cfa983147de53868b52e064142" alt="photograph colorizer"

These two Control-LoRAs can be used to colorize images.

*Recolor* is designed to colorize black-and-white photographs.

*Sketch* is designed to color in drawings supplied as a white-on-black image (either hand-drawn or created with a `pidi` edge model).
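
Hand-drawn sketches are often scanned as dark lines on light paper, which is the opposite of the white-on-black input described above. A minimal sketch of the inversion, assuming the drawing is already loaded as an 8-bit grayscale NumPy array (the helper name `to_white_on_black` is hypothetical):

```python
import numpy as np


def to_white_on_black(sketch: np.ndarray) -> np.ndarray:
    """Invert an 8-bit grayscale drawing so dark lines become white."""
    if sketch.dtype != np.uint8:
        raise TypeError("expected an 8-bit grayscale array")
    return 255 - sketch


# Synthetic drawing: white paper with one black horizontal line.
paper = np.full((4, 4), 255, dtype=np.uint8)
paper[1, :] = 0

inverted = to_white_on_black(paper)
print(inverted)
```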

ComfyUI example workflow:

data:image/s3,"s3://crabby-images/f45e0/f45e0fab3e8391e4f56126bbf00746770cd84af6" alt="comfyui recolor"

SwarmUI example:

data:image/s3,"s3://crabby-images/c8ac3/c8ac35c01f61535fb20baa06ad0501003233f935" alt="swarmui recolor"

### Revision

data:image/s3,"s3://crabby-images/0523b/0523bb11d79184f45d20f6df296f760b5cf38311" alt="revision"

Revision is a novel approach to prompting SDXL with images.

It uses pooled CLIP embeddings to produce images conceptually similar to the input. It can be used either in addition to text prompts or in place of them.

Revision also includes a blending function for combining multiple image or text concepts, as either positive or negative prompts.
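
The exact blending function is not specified here. As a rough intuition for how several concepts could be combined once each is represented by a pooled embedding vector, the sketch below takes a weighted average of same-length vectors. This is an assumption for illustration only, not Revision's actual implementation; the helper name `blend_embeddings`, the weights, and the toy 4-dimensional vectors are all hypothetical, and negative prompts are not modeled.

```python
import numpy as np


def blend_embeddings(embeddings, weights) -> np.ndarray:
    """Weighted average of same-length embedding vectors (hypothetical helper)."""
    stacked = np.stack([np.asarray(e, dtype=np.float32) for e in embeddings])
    w = np.asarray(weights, dtype=np.float32)
    if w.shape != (stacked.shape[0],):
        raise ValueError("need exactly one weight per embedding")
    if w.sum() == 0:
        raise ValueError("weights must not sum to zero")
    return (w[:, None] * stacked).sum(axis=0) / w.sum()


# Toy vectors standing in for pooled CLIP embeddings of two concepts.
concept_a = np.array([1.0, 0.0, 0.0, 0.0])
concept_b = np.array([0.0, 1.0, 0.0, 0.0])

blended = blend_embeddings([concept_a, concept_b], weights=[3.0, 1.0])
print(blended)
```

With these weights the result leans three to one toward the first concept.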