---
tags:
- text-to-image
- stable-diffusion
---

# Control-LoRA Model Card


## Introduction
By adding low-rank, parameter-efficient fine-tuning to ControlNet, we introduce ***Control-LoRAs***. This approach offers a more efficient and compact way to bring model control to a wider variety of consumer GPUs.

For each model below, you'll find:

- *Rank 256* files (reducing the original `4.7GB` ControlNet models down to `~738MB` Control-LoRA models)
- Experimental *Rank 128* files (reducing the model down to `~377MB`)

Each Control-LoRA has been trained on a diverse range of image concepts and aspect ratios.
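
As a rough illustration of why a lower rank gives a smaller file, the sketch below counts the parameters of a dense weight matrix against those of a rank-`r` factorization of the same shape. The layer size (`1280 x 1280`) and the helper names are hypothetical, chosen only for the arithmetic; real ControlNet layers vary in shape and not every parameter is factorized, so this does not reproduce the file sizes listed above.

```python
def dense_params(d_in, d_out):
    """Parameters in a full d_out x d_in weight matrix."""
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    """Parameters in a rank-`rank` factorization: (d_out x rank) plus (rank x d_in)."""
    return rank * (d_in + d_out)


# Hypothetical layer size, used only for illustration.
d_in = d_out = 1280

for rank in (256, 128):
    ratio = lora_params(d_in, d_out, rank) / dense_params(d_in, d_out)
    print(f"rank {rank}: {lora_params(d_in, d_out, rank):,} params "
          f"({ratio:.0%} of the dense layer)")
```

The low-rank parameter count scales linearly with the rank, which is consistent with the Rank 128 files being roughly half the size of the Rank 256 files.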

### MiDaS and ClipDrop Depth
![depth](samples/depth-sample.jpeg)

Depth estimation is an image processing technique that determines the distance of objects in a scene, providing a depth map that highlights variations in proximity.

The Control-LoRA utilizes a grayscale depth map for guided generation.

In the example above, we compare the depth results of `MiDaS dpt_beit_large_512` and the `Portrait Depth Estimation` (available in the [ClipDrop API by Stability AI](https://clipdrop.co/apis/docs/portrait-depth-estimation)).
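
A grayscale depth map of the kind described here can be produced by rescaling raw depth values to the 0-255 range. The following is a minimal sketch in plain Python; the function name is hypothetical, and whether larger values should appear brighter or darker depends on the depth estimator's convention, which this card does not specify.

```python
def depth_to_grayscale(depth):
    """Linearly rescale a 2D grid of depth values to integers in 0..255.

    Assumption: larger input values map to brighter pixels. Flip the
    result if the estimator you use follows the opposite convention.
    """
    flat = [v for row in depth for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A constant depth map carries no contrast; return all black.
        return [[0 for _ in row] for row in depth]
    return [[round(255 * (v - lo) / span) for v in row] for row in depth]


print(depth_to_grayscale([[0.0, 1.0], [2.0, 5.0]]))
```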

### Canny Edge
![canny](samples/canny-sample.jpeg)
Canny Edge Detection is an image processing technique that identifies abrupt changes in intensity to highlight edges in an image.

This Control-LoRA uses the edges from an image to generate the final image.
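
The idea of marking abrupt intensity changes can be sketched with a simple gradient threshold. This is a simplified illustration, not full Canny edge detection: it omits the smoothing, non-maximum suppression, and hysteresis steps, and the function name and threshold are hypothetical.

```python
def edge_map(gray, threshold):
    """Mark pixels whose forward-difference gradient magnitude meets `threshold`.

    `gray` is a 2D list of intensities (0..255). The result is a
    white-on-black map: 255 at detected edges, 0 elsewhere.
    """
    h, w = len(gray), len(gray[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Forward differences, clamped at the image border.
            gx = gray[y][min(x + 1, w - 1)] - gray[y][x]
            gy = gray[min(y + 1, h - 1)][x] - gray[y][x]
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 255
    return edges


# A vertical step from dark to bright yields one column of edge pixels.
step = [[0, 0, 255, 255] for _ in range(3)]
print(edge_map(step, threshold=100))
```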

### Photograph and Sketch Colorizer
![photograph colorizer](samples/colorizer-sample.jpeg)
These two Control-LoRAs can be used to colorize images.

*Recolor* is designed to colorize black and white photographs.

*Sketch* is designed to color in drawings input as a white-on-black image (either hand-drawn, or created with a `pidi` edge model).
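
Since *Sketch* expects white lines on a black background, a drawing scanned as dark lines on white paper would need to be inverted first. A minimal sketch of that step, assuming 8-bit grayscale values stored as a 2D list (the function name is hypothetical):

```python
def invert(gray):
    """Invert an 8-bit grayscale image so dark lines become bright lines."""
    return [[255 - v for v in row] for row in gray]


# Black-on-white input (255 = paper, 0 = ink) becomes white-on-black.
print(invert([[255, 0], [200, 55]]))
```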

### Revision
![revision](thumbnails/stability-clora-revision-thumbnail.jpeg)
Revision is a novel approach to prompting SDXL with images.

It uses pooled CLIP embeddings to produce images conceptually similar to the input. It can be used either in addition to text prompts or in place of them.

Revision also includes a blending function for combining multiple image or text concepts, as either positive or negative prompts.
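
This card does not specify how the blending function works. One plausible reading, shown purely as an assumption, is a weighted average of pooled embedding vectors; the function name, the weights, and the toy two-dimensional vectors below are all hypothetical.

```python
def blend_embeddings(embeddings, weights):
    """Weighted average of equal-length embedding vectors.

    Assumption: this is an illustrative blending rule, not necessarily
    the one Revision implements.
    """
    if not embeddings or len(embeddings) != len(weights):
        raise ValueError("need one weight per embedding")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    dim = len(embeddings[0])
    return [
        sum(w * e[i] for e, w in zip(embeddings, weights)) / total
        for i in range(dim)
    ]


# Toy vectors standing in for two pooled concept embeddings.
print(blend_embeddings([[1.0, 0.0], [0.0, 1.0]], [3.0, 1.0]))
```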


## Inference

Control-LoRAs have been implemented in [ComfyUI](https://github.com/comfyanonymous/ComfyUI) and [StableSwarmUI](https://github.com/Stability-AI/StableSwarmUI).

Basic ComfyUI workflows (using the base model only) are available in this repo.

**Recolor example on ComfyUI:** ![comfyui recolor](samples/comfyui-recolor-example.jpeg)

**Canny edge on StableSwarmUI:** ![swarmui canny](samples/swarmui-canny-example.jpeg)