---
license: creativeml-openrail-m
language:
- en
tags:
- diffusion
pipeline_tag: text-to-image
---

SDXL_v1.0-Dreamviewer

[SDXL](https://arxiv.org/abs/2307.01952) is an [ensemble of experts](https://arxiv.org/abs/2211.01324) pipeline for latent diffusion. In the first step, the base model generates (noisy) latents, which are then processed further by a refinement model specialized for the final denoising steps (available at https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/). Note that the base model can also be used as a standalone module.
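
The base-then-refiner hand-off described above can be sketched with `diffusers` roughly as follows. This is a minimal sketch, not this model's reference implementation: the function names `split_steps` and `generate_with_refiner`, and the parameters `base_id` and `high_noise_frac`, are hypothetical names introduced for illustration. `denoising_end` and `denoising_start` are the `diffusers` arguments that mark where the base stops and the refiner takes over. The step split computed by `split_steps` is only an approximation, since the exact split depends on the scheduler.

```python
def split_steps(num_inference_steps: int, high_noise_frac: float) -> tuple[int, int]:
    """Approximate (base_steps, refiner_steps) for a given hand-off fraction."""
    if not 0.0 < high_noise_frac < 1.0:
        raise ValueError("high_noise_frac must be strictly between 0 and 1")
    base_steps = round(num_inference_steps * high_noise_frac)
    return base_steps, num_inference_steps - base_steps


def generate_with_refiner(prompt, base_id, num_inference_steps=40, high_noise_frac=0.8):
    """Run the base model on the high-noise steps, then the refiner on the rest."""
    # Heavy imports stay local so split_steps can be used without a GPU stack.
    import torch
    from diffusers import StableDiffusionXLImg2ImgPipeline, StableDiffusionXLPipeline

    # base_id is an assumption: any SDXL base checkpoint on the Hub.
    base = StableDiffusionXLPipeline.from_pretrained(
        base_id, torch_dtype=torch.float16
    ).to("cuda")
    refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-refiner-1.0",
        text_encoder_2=base.text_encoder_2,  # share components to save memory
        vae=base.vae,
        torch_dtype=torch.float16,
    ).to("cuda")

    # The base stops early and returns latents that still contain noise.
    latents = base(
        prompt=prompt,
        num_inference_steps=num_inference_steps,
        denoising_end=high_noise_frac,
        output_type="latent",
    ).images
    # The refiner resumes from the same point in the schedule.
    return refiner(
        prompt=prompt,
        num_inference_steps=num_inference_steps,
        denoising_start=high_noise_frac,
        image=latents,
    ).images[0]


print(split_steps(40, 0.8))  # (32, 8)
```

Calling `generate_with_refiner("a photo of a dragon", base_id=...)` needs a CUDA GPU and downloads both checkpoints. Whether the refiner improves results for a fine-tuned base checkpoint is something to verify empirically.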

Alternatively, a two-stage pipeline can be used. First, the base model generates latents of the desired output size. In the second step, a specialized high-resolution model applies a technique called SDEdit (https://arxiv.org/abs/2108.01073, also known as "img2img") to the latents generated in the first step, using the same prompt. This approach is slightly slower than the first, as it requires more function evaluations.
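
To make the cost difference concrete, the sketch below counts denoising steps as a rough proxy for function evaluations. It assumes the img2img stage runs `min(int(num_inference_steps * strength), num_inference_steps)` steps, which is how `diffusers` img2img pipelines derive their step count from `strength` at the time of writing. The function names are hypothetical, and the base and refiner networks may differ in per-step cost, so treat the totals as indicative only.

```python
def ensemble_steps(num_inference_steps: int) -> int:
    """Base and refiner share one schedule, so the total is the schedule length."""
    return num_inference_steps


def sdedit_steps(num_inference_steps: int, strength: float) -> int:
    """Full base pass plus an img2img pass that re-noises by `strength`."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    refine_steps = min(int(num_inference_steps * strength), num_inference_steps)
    return num_inference_steps + refine_steps


for strength in (0.25, 0.5):
    print(strength, ensemble_steps(40), sdedit_steps(40, strength))
# 0.25 40 50
# 0.5 40 60
```

With a 40-step schedule, the ensemble variant stays at 40 steps, while the SDEdit variant adds the img2img steps on top of the full base pass.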

Source code is available at https://github.com/Stability-AI/generative-models.

```
import gc

import torch
from diffusers import AutoencoderKL, StableDiffusionXLPipeline
from IPython.display import display

# SDXL VAE patched to run in fp16 without producing NaNs
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)

pipe = StableDiffusionXLPipeline.from_pretrained(
    "Andyrasika/dreamshaper_sdxl1_diffusion",
    torch_dtype=torch.float16,
    variant="fp16",
    vae=vae,
)
# Optional memory optimization; requires the xformers package
pipe.enable_xformers_memory_efficient_attention()
pipe.to("cuda")

prompt = '8k intricate, highly detailed, digital photography, best quality, masterpiece, a (full body shot) photo of a warrior man that lived with dragons his whole life is now leading them to battle. torn clothes exposing parts of his body, scratch marks, epic, hyperrealistic, hyperrealism, 8k, cinematic lighting, greg rutkowski, wlop'
negative_prompt = '(deformed iris, deformed pupils), text, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, (extra fingers), (mutated hands), poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, (fused fingers), (too many fingers), long neck, camera'

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    guidance_scale=9.0,
    num_inference_steps=50,
).images[0]
display(image)  # show the result when running in a notebook

# Release Python garbage and cached GPU memory after generation
gc.collect()
torch.cuda.empty_cache()
```

![11](Screenshot%202023-08-15%20at%205.56.41%20PM.png)