Stable-fast-xl

Stable-fast is an ultra-lightweight inference optimization framework for HuggingFace Diffusers on NVIDIA GPUs, providing very fast inference through a set of key optimization techniques. This repository contains a compact installation of the stable-fast compiler (https://github.com/chengzeyi/stable-fast) together with inference examples for stable-diffusion-xl-base-1.0 and stable-diffusion-xl-1.0-inpainting-0.1.


Run SDXL inference 30%+ faster!

Differences With Other Acceleration Libraries

Fast:

stable-fast is specially optimized for HuggingFace Diffusers and achieves high performance compared with other acceleration libraries. It also provides very fast compilation, within only a few seconds, significantly faster than torch.compile, TensorRT, and AITemplate.

Minimal:

stable-fast works as a plugin framework for PyTorch. It builds on existing PyTorch functionality and infrastructure and is compatible with other acceleration techniques, as well as popular fine-tuning techniques and deployment solutions; see the sketch below.
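For example, because compilation happens on an ordinary Diffusers pipeline object, fine-tuning artifacts such as LoRA weights can be loaded with the regular Diffusers API before compiling. A minimal sketch, where "your-username/your-sdxl-lora" is a hypothetical placeholder repository:

from diffusers import DiffusionPipeline
import torch

from sfast.compilers.stable_diffusion_pipeline_compiler import (
    compile, CompilationConfig
)

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16"
)
# "your-username/your-sdxl-lora" is a placeholder, not a real repository
pipe.load_lora_weights("your-username/your-sdxl-lora")
pipe.fuse_lora()  # bake the LoRA into the base weights before compilation
pipe.to("cuda")

pipe = compile(pipe, CompilationConfig.Default())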

How to use

Install dependencies

pip install diffusers transformers safetensors accelerate sentencepiece

Download the repository and run the stable-fast installation script

git clone https://huggingface.co/artemtumch/stable-fast-xl
cd stable-fast-xl

Open the install_stable-fast.sh file and replace cp311 with your Python version in this line:

pip install -q https://github.com/chengzeyi/stable-fast/releases/download/v0.0.15/stable_fast-0.0.15+torch210cu118-cp311-cp311-manylinux2014_x86_64.whl

where cp311 corresponds to Python 3.11, cp38 to Python 3.8, and so on.
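If you are unsure which tag matches your interpreter, you can print it directly:

import sys

# prints e.g. "cp311" for Python 3.11; use this tag in the wheel URL
print(f"cp{sys.version_info.major}{sys.version_info.minor}")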

Then run the script:

sh install_stable-fast.sh

Generate image

from diffusers import DiffusionPipeline
import torch

from sfast.compilers.stable_diffusion_pipeline_compiler import (
    compile, CompilationConfig
)

# make sure the optional acceleration backends are importable
import xformers
import triton

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16"
)

# enable to reduce GPU VRAM usage (~30%)
# (requires `from diffusers import AutoencoderTiny`)
# pipe.vae = AutoencoderTiny.from_pretrained("madebyollin/taesdxl", torch_dtype=torch.float16)

pipe.to("cuda")

# if using torch < 2.0
# pipe.enable_xformers_memory_efficient_attention()

config = CompilationConfig.Default()

config.enable_xformers = True    # memory-efficient attention
config.enable_triton = True      # Triton-fused kernels
config.enable_cuda_graph = True  # CUDA graph capture to cut CPU overhead

pipe = compile(pipe, config)

prompt = "An astronaut riding a green horse"

image = pipe(prompt=prompt).images[0]
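The first call triggers tracing and compilation, so it is noticeably slower; subsequent calls run at the optimized speed. A rough way to observe the speedup on your own GPU (timings below are illustrative, not guaranteed):

import time

# warm-up call: includes the one-time compilation/tracing overhead
begin = time.time()
pipe(prompt=prompt)
print(f"warm-up call: {time.time() - begin:.2f}s")

# subsequent calls reflect the actual optimized inference speed
begin = time.time()
image = pipe(prompt=prompt).images[0]
print(f"compiled call: {time.time() - begin:.2f}s")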

Inpainting

from diffusers import StableDiffusionXLInpaintPipeline
from diffusers.utils import load_image
import torch

from sfast.compilers.stable_diffusion_pipeline_compiler import (
    compile, CompilationConfig
)

# make sure the optional acceleration backends are importable
import xformers
import triton

pipe = StableDiffusionXLInpaintPipeline.from_pretrained(
    "diffusers/stable-diffusion-xl-1.0-inpainting-0.1",
    torch_dtype=torch.float16,
    variant="fp16"
)

# enable to reduce GPU VRAM usage (~30%)
# (requires `from diffusers import AutoencoderTiny`)
# pipe.vae = AutoencoderTiny.from_pretrained("madebyollin/taesdxl", torch_dtype=torch.float16)

pipe.to("cuda")

config = CompilationConfig.Default()

config.enable_xformers = True    # memory-efficient attention
config.enable_triton = True      # Triton-fused kernels
config.enable_cuda_graph = True  # CUDA graph capture to cut CPU overhead

pipe = compile(pipe, config)

img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"

image = load_image(img_url).resize((1024, 1024))
mask_image = load_image(mask_url).resize((1024, 1024))

prompt = "a tiger sitting on a park bench"
generator = torch.Generator(device="cuda").manual_seed(0)

image = pipe(
    prompt=prompt,
    image=image,
    mask_image=mask_image,
    guidance_scale=8.0,
    num_inference_steps=20,  # steps between 15 and 30 work well
    strength=0.99,  # make sure to use `strength` below 1.0
    generator=generator,
).images[0]
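The pipeline returns standard PIL images, so the result can be saved or post-processed as usual:

# save the inpainted result to disk
image.save("inpainting_result.png")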

GitHub repository: https://github.com/reznya22/stable-fast-xl
