---
title: ComfyUI-Style IPAdapter Generator
emoji: 🎨
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.39.0
app_file: app.py
pinned: false
license: mit
---
# 🎨 ComfyUI-Style IPAdapter Generator

A Hugging Face Space that replicates core ComfyUI + IPAdapter functionality using Gradio. Generate images from text prompts and reference images with advanced AI models.
## ✨ Features

- **Text-to-Image Generation**: Create images from detailed text descriptions
- **IPAdapter Integration**: Use reference images to guide generation (faces, styles, compositions)
- **Multiple Models**: Support for Stable Diffusion 1.5 and SDXL
- **Advanced Controls**: Fine-tune generation with guidance scale, steps, and resolution
- **Face Enhancement**: Optional CodeFormer/GFPGAN integration for face improvement
- **LoRA Support**: Apply custom style models for unique aesthetics
- **Side-by-Side Comparison**: View reference and generated images together
- **Memory Optimized**: Works on both CPU and GPU with automatic fallbacks
## 🚀 Quick Start

### Local Installation

1. Clone and set up:

   ```bash
   git clone <your-repo-url>
   cd comfyui-ipAdapter-space
   pip install -r requirements.txt
   ```

2. Run the application:

   ```bash
   python app.py
   ```

3. Access the interface: open your browser at `http://localhost:7860`

### Hugging Face Space Deployment

- **Create** a new Space on Hugging Face
- **Upload files**: `app.py`, `requirements.txt`, `README.md`
- **Select hardware**: CPU (free) or GPU (paid), based on your needs
- **Deploy**: the Space will build and launch automatically
## 📖 Usage Guide

### Basic Workflow

- **Select Model**: Choose between Stable Diffusion 1.5 and SDXL
- **Enter Prompt**: Describe the image you want to generate
- **Upload Reference**: Provide a reference image (face, style, or composition guide)
- **Adjust Settings**: Fine-tune generation parameters
- **Generate**: Click the generate button and wait for results (a minimal interface sketch follows below)
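
To make the workflow concrete, here is a minimal sketch of how such an interface might be wired up in Gradio. The handler name and parameter list are illustrative assumptions, not the actual contents of `app.py`:

```python
import gradio as gr

def generate(prompt, reference, model_name, guidance, ip_scale, steps, seed):
    # Hypothetical handler: the real app would load the chosen pipeline,
    # attach the IPAdapter reference, and run inference (sketched under
    # "Technical Details"). Placeholder: echo the reference image back.
    return reference

demo = gr.Interface(
    fn=generate,
    inputs=[
        gr.Textbox(label="Text Prompt"),
        gr.Image(type="pil", label="Reference Image"),
        gr.Dropdown(["SD 1.5", "SDXL"], value="SD 1.5", label="Model"),
        gr.Slider(1, 20, value=7.5, label="Guidance Scale"),
        gr.Slider(0, 2, value=1.0, label="IPAdapter Scale"),
        gr.Slider(10, 50, value=20, step=1, label="Inference Steps"),
        gr.Number(value=0, label="Seed (0 = random)"),
    ],
    outputs=gr.Image(label="Generated Image"),
)

if __name__ == "__main__":
    demo.launch()
```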
### Parameters Explained

#### Core Settings

- **Text Prompt**: Detailed description of the desired image
- **Reference Image**: Guide image for IPAdapter (faces work best)
- **Model**: Base diffusion model (SD 1.5 for speed, SDXL for quality)

#### Generation Controls

- **Guidance Scale** (1-20): How closely the model follows the prompt (7.5 recommended)
- **IPAdapter Scale** (0-2): Strength of the reference image's influence (1.0 recommended)
- **Resolution**: Output image dimensions (512x512 for speed, higher for quality)
- **Inference Steps** (10-50): Quality vs. speed tradeoff (20 recommended)
- **Seed**: Fixes randomness for reproducible results (0 for random); the sketch below shows how these controls map onto a pipeline call
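
A rough sketch of how these controls might translate into a Diffusers pipeline call, assuming `pipe` is a pipeline with an IPAdapter already attached (see the loading sketch under "Technical Details"); `run_generation` and its parameters are illustrative names:

```python
import torch

def run_generation(pipe, prompt, reference_image, guidance, ip_scale, steps, seed):
    # Seed: 0 means "random" in this UI convention
    generator = None
    if seed != 0:
        generator = torch.Generator(device=pipe.device).manual_seed(int(seed))
    # IPAdapter Scale: strength of the reference image's influence
    pipe.set_ip_adapter_scale(ip_scale)
    result = pipe(
        prompt=prompt,                     # Text Prompt
        ip_adapter_image=reference_image,  # Reference Image
        guidance_scale=guidance,           # Guidance Scale (1-20)
        num_inference_steps=int(steps),    # Inference Steps (10-50)
        height=512, width=512,             # Resolution
        generator=generator,
    )
    return result.images[0]
```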
#### Enhancement Options

- **Face Enhancement**: Improve facial details in generated images
- **CodeFormer vs. GFPGAN**: Two alternative face-restoration algorithms
- **LoRA Path**: Local path to custom style models (see the loading sketch below)
- **LoRA Scale**: Strength of the style model's influence
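
For the LoRA options, a minimal sketch of applying a local style model with Diffusers; the path is a placeholder, and the exact API surface depends on the Diffusers version pinned in `requirements.txt`:

```python
def apply_style_lora(pipe, lora_path="path/to/style_lora.safetensors", lora_scale=0.8):
    # LoRA Path: load weights from a local .safetensors file
    pipe.load_lora_weights(lora_path)
    # LoRA Scale: bake the weights in at the requested strength
    pipe.fuse_lora(lora_scale=lora_scale)
    return pipe
```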
### Best Practices

#### For Face Generation

- Use clear, well-lit reference photos
- Keep the IPAdapter scale between 0.8 and 1.2
- Enable face enhancement for better results
- Use descriptive prompts: "professional headshot, studio lighting"

#### For Style Transfer

- Use artistic references (paintings, illustrations)
- Adjust the IPAdapter scale to the desired style strength
- Experiment with different guidance scales
- Consider using LoRA models for consistent styles

#### Performance Optimization

- Use 512x512 resolution for faster generation
- Reduce inference steps to 15-20 for speed
- Enable face enhancement only when needed
- Use CPU mode if GPU memory is limited
## 🛠️ Technical Details

### Architecture

- **Frontend**: Gradio web interface
- **Backend**: Hugging Face Diffusers + IPAdapter (loading sketch below)
- **Models**: Stable Diffusion 1.5/XL with IPAdapter weights
- **Enhancement**: CodeFormer/GFPGAN for face improvement
- **Styling**: LoRA support for custom aesthetics
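
A hedged sketch of what the backend loading path might look like. The repo IDs (`runwayml/stable-diffusion-v1-5`, `h94/IP-Adapter`) are commonly used public checkpoints, assumed here rather than read from `app.py`; `load_ip_adapter` is the standard Diffusers entry point for attaching IPAdapter weights:

```python
import torch
from diffusers import StableDiffusionPipeline

def load_pipeline(device="cuda"):
    # Base SD 1.5 checkpoint (repo ID is an assumption)
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16 if device == "cuda" else torch.float32,
    )
    # Attach IPAdapter weights so a reference image can steer generation
    pipe.load_ip_adapter(
        "h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin"
    )
    return pipe.to(device)
```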
### Memory Management

- Automatic model loading/unloading
- GPU memory optimization with xformers
- CPU fallback for limited hardware
- Efficient attention mechanisms
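
These behaviors correspond to standard Diffusers toggles; whether `app.py` enables all of them is an assumption, but each call below is a real pipeline method:

```python
def optimize_memory(pipe):
    try:
        # Memory-efficient attention via xformers (requires the xformers package)
        pipe.enable_xformers_memory_efficient_attention()
    except Exception:
        # Fall back to attention slicing when xformers is unavailable
        pipe.enable_attention_slicing()
    # Offload idle submodules to CPU between forward passes to cut peak VRAM
    pipe.enable_model_cpu_offload()
    return pipe
```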
### Supported Formats

- **Input Images**: JPG, PNG, WebP
- **Output**: PNG
- **LoRA Models**: `.safetensors`, `.ckpt` files
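
A small sketch of normalizing inputs in any of these formats with Pillow and saving the result as PNG:

```python
from PIL import Image

def load_reference(path):
    # Pillow reads JPG, PNG, and WebP alike; convert to RGB so the
    # pipeline never sees alpha channels or palette images
    return Image.open(path).convert("RGB")

def save_output(image, path="output.png"):
    image.save(path, format="PNG")
```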
## 🔧 Configuration

### Environment Variables

```bash
# Optional: set device preference
CUDA_VISIBLE_DEVICES=0

# Optional: set the model cache directory
HF_HOME=/path/to/cache
```
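
For context, `CUDA_VISIBLE_DEVICES` controls which GPUs PyTorch can see, and `HF_HOME` redirects the model download cache; the CPU fallback they interact with is typically a one-liner like this (the exact logic in `app.py` is an assumption):

```python
import torch

# Use the GPU when one is visible, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
```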
### Hardware Requirements

#### Minimum (CPU)

- 8GB RAM
- 10GB storage
- Generation time: 2-5 minutes

#### Recommended (GPU)

- NVIDIA GPU with 6GB+ VRAM
- 16GB RAM
- 20GB storage
- Generation time: 10-30 seconds
## 🎭 Example Prompts

### Portrait Generation

> "A professional headshot photo of a person, studio lighting, high quality, detailed facial features"

### Artistic Styles

> "An oil painting portrait in the style of Renaissance masters, dramatic lighting, classical composition"

### Fantasy/Sci-Fi

> "A cyberpunk character with neon lighting, futuristic elements, digital art style"

### Anime/Illustration

> "An anime-style character portrait, vibrant colors, detailed eyes, manga illustration"
## 🔍 Troubleshooting

### Common Issues

#### Model Loading Errors

- Check your internet connection (models are downloaded on first use)
- Ensure sufficient disk space (20GB+)
- Try switching to CPU mode if GPU memory is insufficient

#### Generation Failures

- Verify the reference image is a valid JPG, PNG, or WebP file (see the validation sketch below)
- Check the prompt length (keep it under 200 characters)
- Reduce the resolution if memory errors occur
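
A hypothetical helper for the first check, verifying that a reference file is a readable image before generation:

```python
from PIL import Image

def is_valid_reference(path):
    # verify() checks file integrity without decoding the full image;
    # re-open the file before actual use, since verify() consumes it
    try:
        with Image.open(path) as img:
            img.verify()
        return True
    except Exception:
        return False
```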
#### Slow Performance

- Use smaller resolutions (512x512)
- Reduce inference steps
- Disable face enhancement
- Switch to CPU mode if the GPU is overloaded

#### Face Enhancement Issues

- Ensure the face is clearly visible in the reference
- Try a different enhancement algorithm (a GFPGAN sketch follows below)
- Adjust the IPAdapter scale for better face preservation
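
For reference, a hedged sketch of the GFPGAN path using the public `gfpgan` package; the weights path is a placeholder, and the app's actual integration may differ:

```python
import numpy as np
from gfpgan import GFPGANer

def enhance_faces(image_bgr: np.ndarray) -> np.ndarray:
    # Placeholder weights path; download GFPGANv1.4.pth separately
    restorer = GFPGANer(model_path="GFPGANv1.4.pth", upscale=1)
    # enhance() returns (cropped_faces, restored_faces, restored_img);
    # paste_back=True blends the restored faces into the full frame
    _, _, restored = restorer.enhance(
        image_bgr, has_aligned=False, only_center_face=False, paste_back=True
    )
    return restored
```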
## 🤝 Contributing

- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
## 📄 License

This project is licensed under the MIT License. See the `LICENSE` file for details.
## 🙏 Acknowledgments

- **Hugging Face** for the Diffusers library and model hosting
- **IPAdapter** team for the reference-image integration
- **ComfyUI** for inspiration and workflow concepts
- **Gradio** team for the excellent web interface framework
## 📞 Support

- **Issues**: Report bugs via GitHub Issues
- **Discussions**: Join the community discussions
- **Documentation**: Check the Hugging Face Spaces documentation
**Note**: This is an educational project replicating ComfyUI functionality. For production use, consider the original ComfyUI or commercial alternatives.