title: Visual Ai
emoji: 🖼
colorFrom: purple
colorTo: red
sdk: gradio
sdk_version: 5.20.1
app_file: app.py
pinned: false
license: mit
short_description: What you wish to see in the output image.
Stable Diffusion Image Generator
Overview
This project provides a Stable Diffusion image generator powered by the stabilityai/stable-diffusion-2-1 model. It's optimized for GPU execution with CUDA but includes a CPU fallback option, allowing flexibility based on hardware availability. The application uses the diffusers library and a gradio-based UI for interactive image generation.
Features
- Runs on GPU (CUDA) with FP16 precision and memory optimizations or CPU with FP32 precision.
- Customizable parameters: prompt, resolution, seed, inference steps, and guidance scale.
- Toggle between GPU and CPU execution via the UI.
- Built-in performance optimizations for GPU (e.g., memory-efficient attention, tiling).
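Because the resolution and seed reach the app as plain text, it can help to validate them before they are passed to the pipeline. The following is a minimal sketch, not the project's code: `parse_resolution` and `parse_seed` are hypothetical helpers, and the sketch assumes the `WIDTHxHEIGHT` format, the `-1`-for-random seed convention, and that Stable Diffusion expects dimensions divisible by 8.

```python
from typing import Optional, Tuple


def parse_resolution(text: str) -> Tuple[int, int]:
    """Parse 'WIDTHxHEIGHT' (for example '512x512') into two integers."""
    parts = text.lower().strip().split("x")
    if len(parts) != 2:
        raise ValueError(f"Expected WIDTHxHEIGHT, got {text!r}")
    width, height = (int(p) for p in parts)
    # Assumption: the pipeline rejects sizes that are not multiples of 8.
    if width <= 0 or height <= 0 or width % 8 or height % 8:
        raise ValueError("Width and height must be positive multiples of 8")
    return width, height


def parse_seed(text: str) -> Optional[int]:
    """Return None for '-1' (random seed), otherwise the integer seed."""
    seed = int(text)
    return None if seed == -1 else seed
```

Rejecting bad input up front keeps the error message close to the field that caused it, rather than surfacing later from inside the diffusion pipeline.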
Prerequisites
- Python 3.8+
- A CUDA-compatible GPU (optional but recommended for performance).
- A Hugging Face account and API token for model access.
Required Dependencies
- torch (with CUDA support for GPU usage)
- diffusers (for the Stable Diffusion pipeline)
- gradio (for the UI)
- huggingface_hub (for authentication)
- xformers (optional, for GPU memory optimization)
- transformers (transitive dependency of diffusers)
Install Dependencies
For GPU support (adjust PyTorch CUDA version as needed):
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install diffusers[torch] gradio huggingface_hub transformers
pip install xformers # Optional, for GPU memory optimization
For CPU-only:
pip install torch torchvision torchaudio
pip install diffusers[torch] gradio huggingface_hub transformers
Environment Setup
Set your Hugging Face API token as an environment variable:
export HUGGINGFACE_TOKEN=your_huggingface_api_token
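On the Python side, the token can be read through a small helper that fails early with an actionable message. This is only a sketch: `get_hf_token` is a hypothetical name, and the fallback to `HF_TOKEN` (the variable that `huggingface_hub` reads by default) is an assumption that goes beyond what this project does.

```python
import os
from typing import Mapping, Optional


def get_hf_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Hugging Face token from the environment, or raise."""
    env = os.environ if env is None else env
    # HUGGINGFACE_TOKEN is the name this project uses; HF_TOKEN is an assumed fallback.
    for name in ("HUGGINGFACE_TOKEN", "HF_TOKEN"):
        token = env.get(name, "").strip()
        if token:
            return token
    raise ValueError("Hugging Face token not found: set HUGGINGFACE_TOKEN")
```

Passing the environment as a parameter makes the lookup easy to exercise without touching the real process environment.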
Run the Application
python app.py
This launches a Gradio UI where you can input parameters and generate images.
Code Implementation
The pipeline dynamically selects the device (cuda or cpu) based on availability and user preference. Here's a summary of the implementation:
import torch
from diffusers import StableDiffusionPipeline
import gradio as gr
import os
import time
import logging
from huggingface_hub import login
# Logging setup
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
# Load and authenticate with Hugging Face token
hf_token = os.getenv("HUGGINGFACE_TOKEN")
if not hf_token:
    raise ValueError("Error: Hugging Face token not found!")
login(token=hf_token)
# Model setup
model_id = "stabilityai/stable-diffusion-2-1"
device = "cuda" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if device == "cuda" else torch.float32
pipe = StableDiffusionPipeline.from_pretrained(
    model_id,
    torch_dtype=torch_dtype,
    revision="fp16" if device == "cuda" else None,
    use_auth_token=hf_token
)
# GPU optimizations (if applicable)
if device == "cuda":
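    pipe.to("cuda")
    try:
        pipe.enable_xformers_memory_efficient_attention()
    except Exception as exc:  # xformers is optional, so continue without it
        logging.warning(f"xformers unavailable, skipping memory-efficient attention: {exc}")
    pipe.vae.enable_tiling()
    pipe.enable_attention_slicing()
    torch.backends.cuda.matmul.allow_tf32 = True
logging.info(f"Running on: {device.upper()} with {torch_dtype}")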
# Image generation function
def generate_image(prompt, seed, resolution, steps, guidance, use_gpu):
    device = "cuda" if use_gpu and torch.cuda.is_available() else "cpu"
    # Match precision to the device: FP16 on GPU, FP32 on CPU
    pipe.to(device, torch.float16 if device == "cuda" else torch.float32)
    width, height = map(int, resolution.split("x"))
    generator = torch.Generator(device).manual_seed(int(seed)) if seed != "-1" else None
    with torch.autocast("cuda") if device == "cuda" else torch.no_grad():
        image = pipe(prompt, num_inference_steps=int(steps), guidance_scale=float(guidance),
                     generator=generator, width=width, height=height).images[0]
    return image
# Gradio UI setup
with gr.Blocks() as demo:
    gr.Markdown("# **Stable Diffusion Image Generator**")
    with gr.Row():
        with gr.Column():
            prompt_input = gr.Textbox(label="🎨 Prompt")
            resolution_input = gr.Textbox(label="Resolution", value="512x512")
            seed_input = gr.Textbox(label="🔢 Seed (-1 for random)", value="-1")
            steps_input = gr.Slider(10, 50, value=30, label="🛠️ Inference Steps")
            guidance_input = gr.Slider(1.0, 15.0, value=7.5, label="Guidance Scale")
            gpu_toggle = gr.Checkbox(label="⚡ Use GPU (if available)", value=True)
            generate_button = gr.Button("Generate Image")
        with gr.Column():
            image_output = gr.Image(label="🖼️ Generated Image")
    generate_button.click(fn=generate_image, inputs=[prompt_input, seed_input, resolution_input,
                                                     steps_input, guidance_input, gpu_toggle],
                          outputs=image_output)
demo.launch()
Key Notes
- Device Flexibility: The script defaults to GPU if available and falls back to CPU when the GPU toggle is off or no GPU is detected.
- Optimizations: GPU mode uses FP16, memory-efficient attention (via xformers), tiling, and attention slicing.
- Mixed Precision: Uses torch.autocast on GPU; torch.no_grad on CPU.
- Optional xformers: Provides memory-efficient attention on GPU; install it if using CUDA.
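The device and precision rule described in these notes can be written as a small pure function. This is an illustrative sketch only: `select_device` is a hypothetical helper, and it returns the dtype as a string so the example runs without PyTorch installed.

```python
from typing import Tuple


def select_device(use_gpu: bool, cuda_available: bool) -> Tuple[str, str]:
    """Return (device, dtype name) for the requested and available hardware."""
    if use_gpu and cuda_available:
        return "cuda", "float16"  # half precision on GPU
    return "cpu", "float32"       # full precision on CPU
```

In the app, `cuda_available` would come from `torch.cuda.is_available()`, and the dtype name would map to `torch.float16` or `torch.float32`.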
Troubleshooting
Issue: ValueError: Error: Hugging Face token not found!
Solution: Set the HUGGINGFACE_TOKEN environment variable:
export HUGGINGFACE_TOKEN=your_huggingface_api_token
Issue: GPU not detected but expected
Solution:
- Check CUDA installation:
nvidia-smi
- Ensure PyTorch is installed with CUDA support:
pip list | grep torch
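The same check can be made from Python, which also shows whether the installed PyTorch build includes CUDA at all. A minimal sketch, assuming nothing about the project beyond its use of PyTorch; `cuda_report` is a hypothetical helper name.

```python
import importlib.util


def cuda_report() -> dict:
    """Summarize whether PyTorch is installed and can see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}
    import torch  # imported lazily so the check also works without PyTorch

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "built_with_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(cuda_report())
```

If `built_with_cuda` is `None`, the CPU-only wheel is installed, and reinstalling PyTorch from the CUDA index URL shown in the install section is the likely fix.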
Issue: enable_xformers_memory_efficient_attention fails
Solution: Install xformers:
pip install xformers
Conclusion
This project delivers a flexible and efficient Stable Diffusion image generator, balancing GPU performance with CPU compatibility. Enjoy creating AI art with ease!