# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

This is a Gradio application for image generation using the Qwen-Image model with Lightning LoRA acceleration. It's designed to run on Hugging Face Spaces with GPU support, providing fast 8-step image generation with advanced text rendering capabilities.

## Commands

### Run the application locally

```bash
python app.py
```

### Install dependencies

```bash
pip install -r requirements.txt
```

## Architecture

### Core Components

1. **Model Pipeline** (`app.py:130-164`)
   - Uses `Qwen/Qwen-Image` diffusion model with custom FlowMatchEulerDiscreteScheduler
   - Loads Lightning LoRA weights for 8-step acceleration
   - Configured for bfloat16 precision on CUDA

2. **Prompt Enhancement System** (`app.py:41-125`)
   - `polish_prompt()`: Uses Hugging Face InferenceClient with Cerebras provider to enhance prompts
   - `get_caption_language()`: Detects Chinese vs English prompts
   - `rewrite()`: Language-specific prompt enhancement with different system prompts for Chinese/English
   - Requires `HF_TOKEN` environment variable for API access

3. **Style Presets System** (`app.py:16-87`)
   - `load_style_presets()`: Loads style presets from `style_presets.yaml`
   - `apply_style_preset()`: Applies selected style to prompts
   - Supports custom styles and random style selection
   - Each preset includes prefix, suffix, and negative prompt components

4. **Page Layouts System** (`app.py:89-111`)
   - `load_page_layouts()`: Loads multi-image layouts from `page_layouts.yaml`
   - Supports 1-4 images per page with various layout configurations
   - Dynamic layout selection based on number of images

5. **PDF Generation** (`app.py:166-223`)
   - `create_pdf_with_layout()`: Creates PDF with multiple images in selected layout
   - Uses ReportLab for high-quality PDF generation
   - Preserves image quality at 95% JPEG compression
   - A4 page size with flexible positioning system

6. **Multi-Image Generation** (`app.py:225-307`)
   - `infer_multiple()`: Generates multiple images and combines into PDF
   - Progressive generation with status updates
   - Seed management for reproducibility across multiple images
   - Returns PDF file, preview image, and seed information

7. **Gradio Interface** (`app.py:380-500+`)
   - Slider for selecting 1-4 images per page
   - Dynamic layout dropdown that updates based on image count
   - Style preset dropdown with custom style text option
   - PDF download and image preview outputs
   - Advanced settings for all generation parameters

## Key Configuration

- **Scheduler Config** (`app.py:133-148`): Custom configuration for FlowMatchEulerDiscreteScheduler with exponential time shifting
- **Aspect Ratios** (`app.py:170-188`): Predefined aspect ratios optimized for 1024 base resolution
- **Style Presets** (`style_presets.yaml`): Configurable style presets with prompt modifiers and negative prompts
- **Page Layouts** (`page_layouts.yaml`): Flexible layout system for 1-4 images per page
- **Default Settings**: 8 inference steps, guidance scale 1.0, prompt enhancement enabled, 1 image per page

## Environment Variables

- `HF_TOKEN`: Required for prompt enhancement via Hugging Face InferenceClient
  - Used for accessing Cerebras provider for Qwen3-235B model

## Key Features

- **Session-based storage**: Each user session gets a unique temporary directory that persists for 24 hours
- **Multi-page PDF generation**: Users can generate up to 128 pages in a single document
- **Dynamic page addition**: Click "Generate page N" to add the next page to the PDF
- **Flexible layouts**: Different layout options for 1-4 images per page
- **Style presets**: 20+ predefined artistic styles
- **Automatic cleanup**: Old sessions are automatically cleaned after 24 hours

## Model Dependencies

- Main model: `Qwen/Qwen-Image`
- LoRA weights: `lightx2v/Qwen-Image-Lightning` (V1.1 safetensors)
- Prompt enhancement model: `Qwen/Qwen3-235B-A22B-Instruct-2507` via Cerebras
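
To make the Style Presets System described under Core Components more concrete, here is a minimal sketch of how a preset's prefix, suffix, and negative prompt might be combined with a user prompt. It is not the code in `app.py`: the function name `apply_preset_sketch`, its signature, the `"custom"` and `"random"` selector values, and the entries in `EXAMPLE_PRESETS` are all assumptions made for illustration, and the actual schema of `style_presets.yaml` may differ.

```python
import random

# Hypothetical preset data for illustration only; the real entries live in
# style_presets.yaml, whose exact schema is not shown in this file.
EXAMPLE_PRESETS = {
    "watercolor": {
        "prefix": "watercolor painting of",
        "suffix": "soft washes, visible paper texture",
        "negative": "photo, 3d render",
    },
    "none": {"prefix": "", "suffix": "", "negative": ""},
}


def apply_preset_sketch(prompt, style_name, presets, custom_style=""):
    """Return (styled_prompt, negative_prompt) for the chosen style.

    Assumed behavior, not the repository's apply_style_preset().
    """
    if style_name == "custom":
        # Custom style: append the user's free-text style to the prompt.
        parts = [prompt.strip(), custom_style.strip()]
        return ", ".join(p for p in parts if p), ""

    if style_name == "random":
        # Random style: pick any preset name.
        style_name = random.choice(sorted(presets))

    # Unknown names fall back to an empty preset, leaving the prompt as-is.
    preset = presets.get(style_name, {})
    prefix = preset.get("prefix", "").strip()
    suffix = preset.get("suffix", "").strip()

    styled = " ".join(p for p in (prefix, prompt.strip()) if p)
    if suffix:
        styled = f"{styled}, {suffix}"
    return styled, preset.get("negative", "")


if __name__ == "__main__":
    print(apply_preset_sketch("a red fox", "watercolor", EXAMPLE_PRESETS))
```

Returning the negative prompt alongside the styled prompt keeps the two in sync, so a caller can pass both straight to the generation pipeline.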