---
license: mit
sdk: docker
emoji: 🚀
colorFrom: blue
colorTo: green
pinned: false
short_description: PicPilot Production Server
---

# 🚀 PicPilot

![License](https://img.shields.io/badge/license-MIT-blue.svg)
![SDK](https://img.shields.io/badge/sdk-docker-blue.svg)
![Color](https://img.shields.io/badge/color-blue--green-brightgreen.svg)

> PicPilot: Generate Stunning Photography and Craft Visual Narratives in Seconds for Your Brand

## 📖 Overview

PicPilot is a scalable solution that uses state-of-the-art text-to-image models to extend and enhance images. The project has evolved through multiple iterations, addressing challenges and improving output quality at each stage.

### Key Features:

- Segmentation using Segment Anything ViT-Huge and YOLOv8s
- High-quality outpainting with ControlNet + ZoeDepth
- Stable Video Diffusion support
- Batch API support and event-driven queue support
- Logging and telemetry using LogFire

## 🏗 Architecture

![image](https://github.com/user-attachments/assets/2961f39b-f554-4c5e-8b62-3cdc30fff46d)

### Current Pipeline:

1. **Object Detection**: YOLOv8l provides accurate bounding boxes
2. **Segmentation**: Segment Anything ViT-Huge creates precise masks with ROI extension
3. **Outpainting**: ControlNet Zoe Depth + Realistic Vision XL
4. **I2V GenXL**: Image-to-video generation

## 🧠 Models used

- [Stable Diffusion Inpainting](https://huggingface.co/runwayml/stable-diffusion-inpainting)
- [Kandinsky 2.2 Decoder Inpaint](https://huggingface.co/kandinsky-community/kandinsky-2-2-decoder-inpaint)
- [Stable Diffusion XL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)
- [Controlnet-Inpaint Dreamer](https://huggingface.co/destitech/controlnet-inpaint-dreamer-sdxl)
- [Controlnet Zoe Depth](https://huggingface.co/diffusers/controlnet-zoe-depth-sdxl-1.0)
- [Realistic Vision XL](https://huggingface.co/OzzyGT/RealVisXL_V4.0_inpainting)
- [I2V GenXL](https://huggingface.co/ali-vilab/i2vgen-xl)

## 📸 Results

Here are some results from our pipeline: