---
title: ComfyUI-Style IPAdapter Generator
emoji: 🎨
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.39.0
app_file: app.py
pinned: false
license: mit
---
# 🎨 ComfyUI-Style IPAdapter Generator

A Hugging Face Space that replicates core ComfyUI + IPAdapter functionality using Gradio. Generate images from text prompts and reference images with advanced AI models.
## ✨ Features

- **Text-to-Image Generation**: Create images from detailed text descriptions
- **IPAdapter Integration**: Use reference images to guide generation (faces, styles, compositions)
- **Multiple Models**: Support for Stable Diffusion 1.5 and SDXL
- **Advanced Controls**: Fine-tune generation with guidance scale, steps, and resolution
- **Face Enhancement**: Optional CodeFormer/GFPGAN integration for face improvement
- **LoRA Support**: Apply custom style models for unique aesthetics
- **Side-by-Side Comparison**: View reference and generated images together
- **Memory Optimized**: Works on both CPU and GPU with automatic fallbacks
## 🚀 Quick Start

### Local Installation

1. Clone and set up:

   ```bash
   git clone <your-repo-url>
   cd comfyui-ipAdapter-space
   pip install -r requirements.txt
   ```

2. Run the application:

   ```bash
   python app.py
   ```

3. Access the interface: open your browser at `http://localhost:7860`

### Hugging Face Space Deployment

- **Create** a new Space on Hugging Face
- **Upload files**: `app.py`, `requirements.txt`, `README.md`
- **Select hardware**: CPU (free) or GPU (paid), based on your needs
- **Deploy**: the Space will build and launch automatically
## 📖 Usage Guide

### Basic Workflow

- **Select Model**: Choose between Stable Diffusion 1.5 and SDXL
- **Enter Prompt**: Describe the image you want to generate
- **Upload Reference**: Provide a reference image (face, style, or composition guide)
- **Adjust Settings**: Fine-tune generation parameters
- **Generate**: Click the generate button and wait for results (a minimal interface sketch follows below)
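
To make the workflow concrete, here is a minimal sketch of how such an interface might be wired up in Gradio. The handler name and parameter list are illustrative assumptions, not the actual contents of `app.py`:

```python
import gradio as gr

def generate(prompt, reference, model_name, guidance, ip_scale, steps, seed):
    # Hypothetical handler: the real app would load the chosen pipeline,
    # attach the IPAdapter reference, and run inference (sketched under
    # "Technical Details"). Placeholder: echo the reference image back.
    return reference

demo = gr.Interface(
    fn=generate,
    inputs=[
        gr.Textbox(label="Text Prompt"),
        gr.Image(type="pil", label="Reference Image"),
        gr.Dropdown(["SD 1.5", "SDXL"], value="SD 1.5", label="Model"),
        gr.Slider(1, 20, value=7.5, label="Guidance Scale"),
        gr.Slider(0, 2, value=1.0, label="IPAdapter Scale"),
        gr.Slider(10, 50, value=20, step=1, label="Inference Steps"),
        gr.Number(value=0, label="Seed (0 = random)"),
    ],
    outputs=gr.Image(label="Generated Image"),
)

if __name__ == "__main__":
    demo.launch()
```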
### Parameters Explained

#### Core Settings

- **Text Prompt**: Detailed description of the desired image
- **Reference Image**: Guide image for IPAdapter (faces work best)
- **Model**: Base diffusion model (SD 1.5 for speed, SDXL for quality)

#### Generation Controls

- **Guidance Scale** (1-20): How closely the model follows the prompt (7.5 recommended)
- **IPAdapter Scale** (0-2): Strength of the reference image's influence (1.0 recommended)
- **Resolution**: Output image dimensions (512x512 for speed, higher for quality)
- **Inference Steps** (10-50): Quality vs. speed tradeoff (20 recommended)
- **Seed**: Fixes randomness for reproducible results (0 for random); the sketch below shows how these controls map onto a pipeline call
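
A rough sketch of how these controls might translate into a Diffusers pipeline call, assuming `pipe` is a pipeline with an IPAdapter already attached (see the loading sketch under "Technical Details"); `run_generation` and its parameters are illustrative names:

```python
import torch

def run_generation(pipe, prompt, reference_image, guidance, ip_scale, steps, seed):
    # Seed: 0 means "random" in this UI convention
    generator = None
    if seed != 0:
        generator = torch.Generator(device=pipe.device).manual_seed(int(seed))
    # IPAdapter Scale: strength of the reference image's influence
    pipe.set_ip_adapter_scale(ip_scale)
    result = pipe(
        prompt=prompt,                     # Text Prompt
        ip_adapter_image=reference_image,  # Reference Image
        guidance_scale=guidance,           # Guidance Scale (1-20)
        num_inference_steps=int(steps),    # Inference Steps (10-50)
        height=512, width=512,             # Resolution
        generator=generator,
    )
    return result.images[0]
```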
#### Enhancement Options

- **Face Enhancement**: Improve facial details in generated images
- **CodeFormer vs. GFPGAN**: Two alternative face-restoration algorithms
- **LoRA Path**: Local path to custom style models (see the loading sketch below)
- **LoRA Scale**: Strength of the style model's influence
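
For the LoRA options, a minimal sketch of applying a local style model with Diffusers; the path is a placeholder, and the exact API surface depends on the Diffusers version pinned in `requirements.txt`:

```python
def apply_style_lora(pipe, lora_path="path/to/style_lora.safetensors", lora_scale=0.8):
    # LoRA Path: load weights from a local .safetensors file
    pipe.load_lora_weights(lora_path)
    # LoRA Scale: bake the weights in at the requested strength
    pipe.fuse_lora(lora_scale=lora_scale)
    return pipe
```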
### Best Practices

#### For Face Generation

- Use clear, well-lit reference photos
- Keep the IPAdapter scale between 0.8 and 1.2
- Enable face enhancement for better results
- Use descriptive prompts: "professional headshot, studio lighting"

#### For Style Transfer

- Use artistic references (paintings, illustrations)
- Adjust the IPAdapter scale to the desired style strength
- Experiment with different guidance scales
- Consider using LoRA models for consistent styles

#### Performance Optimization

- Use 512x512 resolution for faster generation
- Reduce inference steps to 15-20 for speed
- Enable face enhancement only when needed
- Use CPU mode if GPU memory is limited
## 🛠️ Technical Details

### Architecture

- **Frontend**: Gradio web interface
- **Backend**: Hugging Face Diffusers + IPAdapter (loading sketch below)
- **Models**: Stable Diffusion 1.5/XL with IPAdapter weights
- **Enhancement**: CodeFormer/GFPGAN for face improvement
- **Styling**: LoRA support for custom aesthetics
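
A hedged sketch of what the backend loading path might look like. The repo IDs (`runwayml/stable-diffusion-v1-5`, `h94/IP-Adapter`) are commonly used public checkpoints, assumed here rather than read from `app.py`; `load_ip_adapter` is the standard Diffusers entry point for attaching IPAdapter weights:

```python
import torch
from diffusers import StableDiffusionPipeline

def load_pipeline(device="cuda"):
    # Base SD 1.5 checkpoint (repo ID is an assumption)
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16 if device == "cuda" else torch.float32,
    )
    # Attach IPAdapter weights so a reference image can steer generation
    pipe.load_ip_adapter(
        "h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin"
    )
    return pipe.to(device)
```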
### Memory Management

- Automatic model loading/unloading
- GPU memory optimization with xformers
- CPU fallback for limited hardware
- Efficient attention mechanisms
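
These behaviors correspond to standard Diffusers toggles; whether `app.py` enables all of them is an assumption, but each call below is a real pipeline method:

```python
def optimize_memory(pipe):
    try:
        # Memory-efficient attention via xformers (requires the xformers package)
        pipe.enable_xformers_memory_efficient_attention()
    except Exception:
        # Fall back to attention slicing when xformers is unavailable
        pipe.enable_attention_slicing()
    # Offload idle submodules to CPU between forward passes to cut peak VRAM
    pipe.enable_model_cpu_offload()
    return pipe
```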
### Supported Formats

- **Input Images**: JPG, PNG, WebP
- **Output**: PNG
- **LoRA Models**: `.safetensors`, `.ckpt` files
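
A small sketch of normalizing inputs in any of these formats with Pillow and saving the result as PNG:

```python
from PIL import Image

def load_reference(path):
    # Pillow reads JPG, PNG, and WebP alike; convert to RGB so the
    # pipeline never sees alpha channels or palette images
    return Image.open(path).convert("RGB")

def save_output(image, path="output.png"):
    image.save(path, format="PNG")
```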
## 🔧 Configuration

### Environment Variables

```bash
# Optional: set device preference
CUDA_VISIBLE_DEVICES=0

# Optional: set the model cache directory
HF_HOME=/path/to/cache
```
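
For context, `CUDA_VISIBLE_DEVICES` controls which GPUs PyTorch can see, and `HF_HOME` redirects the model download cache; the CPU fallback they interact with is typically a one-liner like this (the exact logic in `app.py` is an assumption):

```python
import torch

# Use the GPU when one is visible, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
```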
### Hardware Requirements

#### Minimum (CPU)

- 8GB RAM
- 10GB storage
- Generation time: 2-5 minutes

#### Recommended (GPU)

- NVIDIA GPU with 6GB+ VRAM
- 16GB RAM
- 20GB storage
- Generation time: 10-30 seconds
## 🎭 Example Prompts

### Portrait Generation

> "A professional headshot photo of a person, studio lighting, high quality, detailed facial features"

### Artistic Styles

> "An oil painting portrait in the style of Renaissance masters, dramatic lighting, classical composition"

### Fantasy/Sci-Fi

> "A cyberpunk character with neon lighting, futuristic elements, digital art style"

### Anime/Illustration

> "An anime-style character portrait, vibrant colors, detailed eyes, manga illustration"
## 🔍 Troubleshooting

### Common Issues

#### Model Loading Errors

- Check your internet connection (models are downloaded on first use)
- Ensure sufficient disk space (20GB+)
- Try switching to CPU mode if GPU memory is insufficient

#### Generation Failures

- Verify the reference image is a valid JPG, PNG, or WebP file (see the validation sketch below)
- Check the prompt length (keep it under 200 characters)
- Reduce the resolution if memory errors occur
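
A hypothetical helper for the first check, verifying that a reference file is a readable image before generation:

```python
from PIL import Image

def is_valid_reference(path):
    # verify() checks file integrity without decoding the full image;
    # re-open the file before actual use, since verify() consumes it
    try:
        with Image.open(path) as img:
            img.verify()
        return True
    except Exception:
        return False
```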
#### Slow Performance

- Use smaller resolutions (512x512)
- Reduce inference steps
- Disable face enhancement
- Switch to CPU mode if the GPU is overloaded

#### Face Enhancement Issues

- Ensure the face is clearly visible in the reference
- Try a different enhancement algorithm (a GFPGAN sketch follows below)
- Adjust the IPAdapter scale for better face preservation
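
For reference, a hedged sketch of the GFPGAN path using the public `gfpgan` package; the weights path is a placeholder, and the app's actual integration may differ:

```python
import numpy as np
from gfpgan import GFPGANer

def enhance_faces(image_bgr: np.ndarray) -> np.ndarray:
    # Placeholder weights path; download GFPGANv1.4.pth separately
    restorer = GFPGANer(model_path="GFPGANv1.4.pth", upscale=1)
    # enhance() returns (cropped_faces, restored_faces, restored_img);
    # paste_back=True blends the restored faces into the full frame
    _, _, restored = restorer.enhance(
        image_bgr, has_aligned=False, only_center_face=False, paste_back=True
    )
    return restored
```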
## 🤝 Contributing

- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
## 📄 License

This project is licensed under the MIT License. See the `LICENSE` file for details.
## 🙏 Acknowledgments

- **Hugging Face** for the Diffusers library and model hosting
- **IPAdapter** team for the reference-image integration
- **ComfyUI** for inspiration and workflow concepts
- **Gradio** team for the excellent web interface framework
## 📞 Support

- **Issues**: Report bugs via GitHub Issues
- **Discussions**: Join the community discussions
- **Documentation**: Check the Hugging Face Spaces documentation
**Note**: This is an educational project replicating ComfyUI functionality. For production use, consider the original ComfyUI or commercial alternatives.