title: Visual Ai
emoji: 🖼
colorFrom: purple
colorTo: red
sdk: gradio
sdk_version: 5.20.1
app_file: app.py
pinned: false
license: mit
short_description: What you wish to see in the output image.

Stable Diffusion Image Generator

Overview

This project provides a Stable Diffusion image generator powered by the stabilityai/stable-diffusion-2-1 model. It is optimized for GPU execution with CUDA and includes a CPU fallback for machines without a compatible GPU. The application uses the diffusers library for the pipeline and a Gradio UI for interactive image generation.

Features

  • Runs on GPU (CUDA) with FP16 precision and memory optimizations or CPU with FP32 precision.
  • Customizable parameters: prompt, resolution, seed, inference steps, and guidance scale.
  • Toggle between GPU and CPU execution via the UI.
  • Built-in performance optimizations for GPU (e.g., memory-efficient attention, tiling).

Prerequisites

  • Python 3.8+
  • A CUDA-compatible GPU (optional but recommended for performance).
  • A Hugging Face account and API token for model access.

Required Dependencies

  • torch (with CUDA support for GPU usage)
  • diffusers (for the Stable Diffusion pipeline)
  • gradio (for the UI)
  • huggingface_hub (for authentication)
  • xformers (optional, for GPU memory optimization)
  • transformers (provides the text encoder used by the Stable Diffusion pipeline)

Install Dependencies

For GPU support (adjust PyTorch CUDA version as needed):

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install "diffusers[torch]" gradio huggingface_hub transformers
pip install xformers  # Optional, for GPU memory optimization

For CPU-only:

pip install torch torchvision torchaudio
pip install "diffusers[torch]" gradio huggingface_hub transformers

Environment Setup

Set your Hugging Face API token as an environment variable:

export HUGGINGFACE_TOKEN=your_huggingface_api_token
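
As a minimal sketch of how the token lookup could be made slightly more forgiving, the helper below reads the variable and strips stray whitespace before use. The function name `read_hf_token` and its `env` parameter are hypothetical and not part of app.py; only the variable name HUGGINGFACE_TOKEN comes from this README.

```python
import os


def read_hf_token(env=None):
    """Return the Hugging Face token from the environment, or raise ValueError.

    `env` is a hypothetical injection point for testing; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    # Treat a missing variable and a whitespace-only value the same way
    token = (env.get("HUGGINGFACE_TOKEN") or "").strip()
    if not token:
        raise ValueError("Hugging Face token not found: set HUGGINGFACE_TOKEN")
    return token
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.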

Run the Application

python app.py

This launches a Gradio UI where you can input parameters and generate images.

Code Implementation

The pipeline dynamically selects the device (cuda or cpu) based on availability and user preference. Here's a summary of the implementation:

import torch
from diffusers import StableDiffusionPipeline
import gradio as gr
import os
import time
import logging
from huggingface_hub import login

# Logging setup
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")

# Load and authenticate with Hugging Face token
hf_token = os.getenv("HUGGINGFACE_TOKEN")
if not hf_token:
    raise ValueError("❌ Error: Hugging Face token not found!")
login(token=hf_token)

# Model setup
model_id = "stabilityai/stable-diffusion-2-1"
device = "cuda" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if device == "cuda" else torch.float32

pipe = StableDiffusionPipeline.from_pretrained(
    model_id,
    torch_dtype=torch_dtype,
    variant="fp16" if device == "cuda" else None,  # load the FP16 weight files on GPU
    token=hf_token,  # replaces the deprecated use_auth_token argument
)

# GPU optimizations (if applicable)
if device == "cuda":
    pipe.to("cuda")
    pipe.enable_xformers_memory_efficient_attention()
    pipe.vae.enable_tiling()
    pipe.enable_attention_slicing()
    torch.backends.cuda.matmul.allow_tf32 = True

logging.info(f"🚀 Running on: {device.upper()} with {torch_dtype}")

# Image generation function
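def generate_image(prompt, seed, resolution, steps, guidance, use_gpu):
    # Honor the UI toggle, but only select CUDA when it is actually available
    device = "cuda" if use_gpu and torch.cuda.is_available() else "cpu"
    # FP16 is not reliably supported on CPU, so switch to FP32 when moving there
    pipe.to(device, torch.float16 if device == "cuda" else torch.float32)
    width, height = map(int, resolution.split("x"))
    # A seed of -1 requests a random seed, so no generator is passed
    generator = torch.Generator(device).manual_seed(int(seed)) if int(seed) != -1 else None

    with torch.autocast("cuda") if device == "cuda" else torch.no_grad():
        image = pipe(prompt, num_inference_steps=int(steps), guidance_scale=float(guidance),
                     generator=generator, width=width, height=height).images[0]
    return image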

# Gradio UI setup
with gr.Blocks() as demo:
    gr.Markdown("# 🖌️ **Stable Diffusion Image Generator**")
    with gr.Row():
        with gr.Column():
            prompt_input = gr.Textbox(label="🎨 Prompt")
            resolution_input = gr.Textbox(label="📐 Resolution", value="512x512")
            seed_input = gr.Textbox(label="🔢 Seed (-1 for random)", value="-1")
            steps_input = gr.Slider(10, 50, value=30, label="🛠️ Inference Steps")
            guidance_input = gr.Slider(1.0, 15.0, value=7.5, label="🎛️ Guidance Scale")
            gpu_toggle = gr.Checkbox(label="⚡ Use GPU (if available)", value=True)
            generate_button = gr.Button("🚀 Generate Image")
        with gr.Column():
            image_output = gr.Image(label="🖼️ Generated Image")
    generate_button.click(fn=generate_image, inputs=[prompt_input, seed_input, resolution_input,
                                                    steps_input, guidance_input, gpu_toggle],
                          outputs=image_output)

demo.launch()
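
The generation function parses the resolution with a bare split, so malformed input surfaces as a generic error. A hedged sketch of a stricter parser follows. `parse_resolution` is a hypothetical helper, not part of app.py, and it assumes the Stable Diffusion pipeline's requirement that width and height be multiples of 8.

```python
def parse_resolution(text):
    """Parse a 'WIDTHxHEIGHT' string into a (width, height) tuple of integers."""
    parts = text.strip().lower().split("x")
    if len(parts) != 2:
        raise ValueError(f"Expected WIDTHxHEIGHT, got {text!r}")
    try:
        width, height = (int(part) for part in parts)
    except ValueError:
        raise ValueError(f"Width and height must be integers, got {text!r}") from None
    if width <= 0 or height <= 0:
        raise ValueError("Width and height must be positive")
    # Assumption: the pipeline rejects sizes that are not divisible by 8
    if width % 8 or height % 8:
        raise ValueError("Width and height must be multiples of 8")
    return width, height
```

Such a helper could stand in for the `map(int, resolution.split("x"))` line, with the ValueError message shown to the user in the UI.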

Key Notes

  • Device Flexibility: The script defaults to GPU when one is available and falls back to CPU when the GPU toggle is off or no GPU is detected.
  • Optimizations: GPU mode uses FP16, memory-efficient attention (via xformers), tiling, and attention slicing.
  • Mixed Precision: Uses torch.autocast on GPU; torch.no_grad on CPU.
  • Optional xformers: Not needed for CPU execution. Install it when using CUDA to enable memory-efficient attention.
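
The device and precision rules in these notes can be sketched as a small pure function. The name `choose_device_and_dtype` and its boolean arguments are illustrative assumptions; app.py inlines this logic and works with torch dtypes rather than strings.

```python
def choose_device_and_dtype(use_gpu, cuda_available):
    """Return (device, dtype_name) following the rules described in this README.

    GPU runs use FP16; everything else falls back to CPU with FP32.
    """
    if use_gpu and cuda_available:
        return "cuda", "float16"
    return "cpu", "float32"
```

In the application, `cuda_available` would come from `torch.cuda.is_available()` and `use_gpu` from the UI checkbox.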

Troubleshooting

Issue: ValueError: ❌ Error: Hugging Face token not found!

Solution: Set the HUGGINGFACE_TOKEN environment variable:

export HUGGINGFACE_TOKEN=your_huggingface_api_token

Issue: GPU not detected but expected

Solution:

  • Check CUDA installation: nvidia-smi
  • Ensure PyTorch is installed with CUDA support: pip list | grep torch
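
The same checks can be run from Python. The sketch below gathers the relevant facts in one place; `cuda_report` is a hypothetical helper name, and it only relies on `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`.

```python
def cuda_report():
    """Collect basic facts about the local PyTorch/CUDA setup."""
    report = {
        "torch_installed": False,
        "torch_version": None,
        "built_with_cuda": None,
        "cuda_available": False,
    }
    try:
        import torch
    except ImportError:
        # PyTorch is missing entirely, so nothing else can be checked
        return report
    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # torch.version.cuda is None for CPU-only builds
    report["built_with_cuda"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    return report
```

Printing `cuda_report()` helps narrow things down: if `built_with_cuda` is None, the installed PyTorch is a CPU-only build, while a CUDA build with `cuda_available` set to False suggests a driver or runtime problem.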

Issue: enable_xformers_memory_efficient_attention fails

Solution: Install xformers:

pip install xformers
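
If installing xformers is not an option, one possible workaround is to check for the package before enabling the optimization. This is a sketch under that assumption; `xformers_installed` is a hypothetical helper and is not part of app.py.

```python
import importlib.util


def xformers_installed():
    """Return True if the xformers package can be found, without importing it."""
    return importlib.util.find_spec("xformers") is not None
```

The call to `pipe.enable_xformers_memory_efficient_attention()` could then be placed behind `if xformers_installed():`, leaving attention slicing and VAE tiling as the remaining memory optimizations.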

Conclusion

This project delivers a flexible and efficient Stable Diffusion image generator, balancing GPU performance with CPU compatibility. Enjoy creating AI art with ease! 🚀