import gradio as gr
import torch
from diffusers import LMSDiscreteScheduler
from mixdiff import StableDiffusionCanvasPipeline, Text2ImageRegion
article = """
## Usage
In this demo you can use Mixture of Diffusers to configure a canvas made up of 3 diffusion regions. Play around with the prompts and guidance values in each region! You can also increase the overlap between regions if seams appear in the image.
In the full version of Mixture of Diffusers you will find further freedom to configure the regions in the canvas. Check the [GitHub repo](https://github.com/albarji/mixture-of-diffusers)!
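As a rough sketch of how the three regions of this demo sit on the canvas: each tile is shifted from the previous one by the tile width minus the overlap, so a larger overlap gives a narrower canvas. The helper `region_columns` below is a hypothetical name used only for illustration (it is not part of the library), and the 640-pixel tile width is the value this demo assumes.

```python
# Illustrative sketch only: `region_columns` is a hypothetical helper,
# not a function provided by Mixture of Diffusers.

def region_columns(tile_width, overlap, n_regions=3):
    # Horizontal shift between the left edges of consecutive tiles.
    stride = tile_width - overlap
    # (col_start, col_end) pixel range covered by each tile.
    columns = [(i * stride, i * stride + tile_width) for i in range(n_regions)]
    # Total canvas width: one full tile plus one stride per extra tile.
    canvas_width = tile_width + stride * (n_regions - 1)
    return canvas_width, columns


canvas_width, columns = region_columns(tile_width=640, overlap=256)
print(canvas_width)  # 1408
print(columns)       # [(0, 640), (384, 1024), (768, 1408)]
```

With a larger overlap, neighbouring regions share more pixels, which is why increasing it can help when seams appear.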
## Motivation
Current image generation methods, such as Stable Diffusion, struggle to position objects at specific locations. While the content of the generated image (somewhat) reflects the objects present in the prompt, it is difficult to frame the prompt in a way that creates a specific composition. For instance, take a prompt expressing a complex composition such as
> A charming house in the countryside on the left,
> in the center a dirt road in the countryside crossing pastures,
> on the right an old and rusty giant robot lying on a dirt road,
> by jakub rozalski,
> sunset lighting on the left and center, dark sunset lighting on the right
> elegant, highly detailed, smooth, sharp focus, artstation, stunning masterpiece
Out of a sample of 20 Stable Diffusion generations with different seeds, the generated images that align best with the prompt are the following:
<table>
<tr>
<td><img src="https://user-images.githubusercontent.com/9654655/195373001-ad23b7c4-f5b1-4e5b-9aa1-294441ed19ed.png" width="300"></td>
<td><img src="https://user-images.githubusercontent.com/9654655/195373174-8d85dd96-310e-48fa-b112-d9902685f22e.png" width="300"></td>
<td><img src="https://user-images.githubusercontent.com/9654655/195373200-59eeec1e-e1b8-464d-b72e-e28a9004d269.png" width="300"></td>
</tr>
</table>
The method proposed here strives to provide a better tool for image composition by using several diffusion processes in parallel, each configured with a specific prompt and settings, and focused on a particular region of the image. You can try it out in the example above! The mixture of diffusion processes is done in a way that harmonizes the generation process, preventing "seam" effects in the generated image.
Using several diffusion processes in parallel also has practical advantages when generating very large images, as the GPU memory requirements are similar to those of generating an image the size of a single tile.
## Responsible use
The same recommendations as in Stable Diffusion apply, so please check the corresponding [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4).
More broadly speaking, always bear this in mind: YOU are responsible for the content you create using this tool. Do not fully blame, credit, or place the responsibility on the software.
## Gallery
Here are some relevant illustrations created using this software (and putting quite a few hours into them!).
### Darkness Dawning
![Darkness Dawning](https://images-wixmp-ed30a86b8c4ca887773594c2.wixmp.com/f/cd1358aa-80d5-4c59-b95b-cdfde5dcc4f5/dfidq8n-6da9a886-9f1c-40ae-8341-d77af9552395.png?token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJ1cm46YXBwOjdlMGQxODg5ODIyNjQzNzNhNWYwZDQxNWVhMGQyNmUwIiwiaXNzIjoidXJuOmFwcDo3ZTBkMTg4OTgyMjY0MzczYTVmMGQ0MTVlYTBkMjZlMCIsIm9iaiI6W1t7InBhdGgiOiJcL2ZcL2NkMTM1OGFhLTgwZDUtNGM1OS1iOTViLWNkZmRlNWRjYzRmNVwvZGZpZHE4bi02ZGE5YTg4Ni05ZjFjLTQwYWUtODM0MS1kNzdhZjk1NTIzOTUucG5nIn1dXSwiYXVkIjpbInVybjpzZXJ2aWNlOmZpbGUuZG93bmxvYWQiXX0.ff6XoVBPdUbcTLcuHUpQMPrD2TaXBM_s6HfRhsARDw0)
### Yog-Sothoth
![Yog-Sothoth](https://images-wixmp-ed30a86b8c4ca887773594c2.wixmp.com/f/cd1358aa-80d5-4c59-b95b-cdfde5dcc4f5/dfidsq4-174dd428-2c5a-48f6-a78f-9441fb3cffea.png?token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJ1cm46YXBwOjdlMGQxODg5ODIyNjQzNzNhNWYwZDQxNWVhMGQyNmUwIiwiaXNzIjoidXJuOmFwcDo3ZTBkMTg4OTgyMjY0MzczYTVmMGQ0MTVlYTBkMjZlMCIsIm9iaiI6W1t7InBhdGgiOiJcL2ZcL2NkMTM1OGFhLTgwZDUtNGM1OS1iOTViLWNkZmRlNWRjYzRmNVwvZGZpZHNxNC0xNzRkZDQyOC0yYzVhLTQ4ZjYtYTc4Zi05NDQxZmIzY2ZmZWEucG5nIn1dXSwiYXVkIjpbInVybjpzZXJ2aWNlOmZpbGUuZG93bmxvYWQiXX0.X42zWgsk3lYnYwuEgkifRFRH2km-npHvrdleDN3m6bA)
### Looking through the eyes of giants
![Looking through the eyes of giants](https://user-images.githubusercontent.com/9654655/218307148-95ce88b6-b2a3-458d-b469-daf5bd56e3a7.jpg)
[Follow me on DeviantArt for more!](https://www.deviantart.com/albarji)
## Acknowledgements
First and foremost, my most sincere appreciation to the [Stable Diffusion team](https://stability.ai/blog/stable-diffusion-public-release) for releasing such an awesome model, and for letting me take part in the closed beta. Kudos also to the Hugging Face community and developers for implementing the [Diffusers library](https://github.com/huggingface/diffusers).
Thanks to Hugging Face for providing support and a GPU Space for running this demo. Thanks also to Instituto de Ingeniería del Conocimiento and Grupo de Aprendizaje Automático (Universidad Autónoma de Madrid) for providing GPU resources for testing and experimenting with this library.
Thanks also to the vibrant communities of the Stable Diffusion discord channel and [Lexica](https://lexica.art/), where I have learned about many amazing artists and styles. And to my friend Abril for sharing many tips on cool artists!
"""
# Create scheduler and model (similar to StableDiffusionPipeline)
scheduler = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear", num_train_timesteps=1000)
pipeline = StableDiffusionCanvasPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", scheduler=scheduler).to("cuda" if torch.cuda.is_available() else "cpu")
def generate(prompt1, prompt2, prompt3, gc1, gc2, gc3, overlap, steps, seed):
    """Mixture of Diffusers generation"""
    tile_width = 640
    tile_height = 640
    return pipeline(
        canvas_height=tile_height,
        # Three tiles side by side, each shifted by (tile_width - overlap)
        canvas_width=tile_width + (tile_width - overlap) * 2,
        regions=[
            # Left region
            Text2ImageRegion(0, tile_height, 0, tile_width, guidance_scale=gc1,
                prompt=prompt1),
            # Center region
            Text2ImageRegion(0, tile_height, tile_width - overlap, tile_width - overlap + tile_width, guidance_scale=gc2,
                prompt=prompt2),
            # Right region
            Text2ImageRegion(0, tile_height, (tile_width - overlap) * 2, (tile_width - overlap) * 2 + tile_width, guidance_scale=gc3,
                prompt=prompt3),
        ],
        num_inference_steps=steps,
        seed=seed,
    )["sample"][0]
with gr.Blocks(title="Mixture of Diffusers") as demo:
    gr.HTML(
        """
        <div style="text-align: center; max-width: 700px; margin: 0 auto;">
            <div
            style="
                display: inline-flex;
                align-items: center;
                gap: 0.8rem;
                font-size: 1.75rem;
            "
            >
            <h1 style="font-weight: 1000; margin-bottom: 7px; line-height: normal;">
                Mixture of Diffusers
            </h1>
            </div>
            <p style="margin-bottom: 10px; font-size: 94%">
            <a href="https://arxiv.org/abs/2302.02412">[Paper]</a> <a href="https://github.com/albarji/mixture-of-diffusers">[Code in GitHub]</a> <a href="https://huggingface.co/spaces/albarji/mixture-of-diffusers?duplicate=true">[Duplicate Space]</a>
            </p>
        </div>
        """
    )
    gr.HTML("""
        <p>For faster inference without waiting in the queue, you may duplicate the space and upgrade to GPU in settings.
        <br/>
        <a href="https://huggingface.co/spaces/albarji/mixture-of-diffusers?duplicate=true">
        <img style="margin-top: 0em; margin-bottom: 0em" src="https://bit.ly/3gLdBN6" alt="Duplicate Space"></a>
        </p>
    """)
    with gr.Row():
        with gr.Column(scale=1):
            gr.Markdown("### Left region")
            left_prompt = gr.Textbox(lines=2, label="Prompt (what do you want to see in the left side of the image?)")
            left_gs = gr.Slider(minimum=0, maximum=15, value=8, step=1, label="Guidance scale")
        with gr.Column(scale=1):
            gr.Markdown("### Center region")
            center_prompt = gr.Textbox(lines=2, label="Prompt (what do you want to see in the center of the image?)")
            center_gs = gr.Slider(minimum=0, maximum=15, value=8, step=1, label="Guidance scale")
        with gr.Column(scale=1):
            gr.Markdown("### Right region")
            right_prompt = gr.Textbox(lines=2, label="Prompt (what do you want to see in the right side of the image?)")
            right_gs = gr.Slider(minimum=0, maximum=15, value=8, step=1, label="Guidance scale")
    gr.Markdown("### General parameters")
    with gr.Row():
        with gr.Column(scale=1):
            overlap = gr.Slider(minimum=128, maximum=320, value=256, step=8, label="Overlap between diffusion regions")
        with gr.Column(scale=1):
            steps = gr.Slider(minimum=1, maximum=50, value=15, step=1, label="Number of diffusion steps")
        with gr.Column(scale=1):
            seed = gr.Number(value=12345, precision=0, label="Random seed")
    with gr.Row():
        button = gr.Button(value="Generate")
    with gr.Row():
        output = gr.Image(label="Generated image")
    with gr.Row():
        gr.Examples(
            examples=[
                [
                    "A charming house in the countryside, by jakub rozalski, sunset lighting, elegant, highly detailed, smooth, sharp focus, artstation, stunning masterpiece",
                    "A dirt road in the countryside crossing pastures, by jakub rozalski, sunset lighting, elegant, highly detailed, smooth, sharp focus, artstation, stunning masterpiece",
                    "An old and rusty giant robot lying on a dirt road, by jakub rozalski, dark sunset lighting, elegant, highly detailed, smooth, sharp focus, artstation, stunning masterpiece",
                    8, 8, 8,
                    256,
                    50,
                    7178915308
                ],
                [
                    "Abstract decorative illustration, by joan miro and gustav klimt and marlina vera and loish, elegant, intricate, highly detailed, smooth, sharp focus, vibrant colors, artstation, stunning masterpiece",
                    "Abstract decorative illustration, by joan miro and gustav klimt and marlina vera and loish, elegant, intricate, highly detailed, smooth, sharp focus, vibrant colors, artstation, stunning masterpiece",
                    "Abstract decorative illustration, by joan miro and gustav klimt and marlina vera and loish, elegant, intricate, highly detailed, smooth, sharp focus, vibrant colors, artstation, stunning masterpiece",
                    8, 8, 8,
                    256,
                    35,
                    21156517
                ],
                [
                    "Magical diagrams and runes written with chalk on a blackboard, elegant, intricate, highly detailed, smooth, sharp focus, artstation, stunning masterpiece",
                    "Magical diagrams and runes written with chalk on a blackboard, elegant, intricate, highly detailed, smooth, sharp focus, artstation, stunning masterpiece",
                    "Magical diagrams and runes written with chalk on a blackboard, elegant, intricate, highly detailed, smooth, sharp focus, artstation, stunning masterpiece",
                    12, 12, 12,
                    256,
                    35,
                    12591765619
                ]
            ],
            inputs=[left_prompt, center_prompt, right_prompt, left_gs, center_gs, right_gs, overlap, steps, seed],
            outputs=output,
            fn=generate,
            cache_examples=True
        )
    button.click(
        generate,
        inputs=[left_prompt, center_prompt, right_prompt, left_gs, center_gs, right_gs, overlap, steps, seed],
        outputs=output
    )
    with gr.Row():
        gr.Markdown(article)
demo.launch(server_name="0.0.0.0")