Collections including paper arxiv:2404.04860

- SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing
  Paper • 2404.05717 • Published • 24
- ByteEdit: Boost, Comply and Accelerate Generative Image Editing
  Paper • 2404.04860 • Published • 24
- SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions
  Paper • 2403.16627 • Published • 20
- Personalized Face Inpainting with Diffusion Models by Parallel Visual Attention
  Paper • 2312.03556 • Published • 1

- EdgeFusion: On-Device Text-to-Image Generation
  Paper • 2404.11925 • Published • 21
- Dynamic Typography: Bringing Words to Life
  Paper • 2404.11614 • Published • 44
- ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback
  Paper • 2404.07987 • Published • 47
- Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models
  Paper • 2404.07724 • Published • 13

- Aligning Diffusion Models by Optimizing Human Utility
  Paper • 2404.04465 • Published • 13
- ByteEdit: Boost, Comply and Accelerate Generative Image Editing
  Paper • 2404.04860 • Published • 24
- TokenCompose: Grounding Diffusion with Token-level Supervision
  Paper • 2312.03626 • Published • 5
- Adding Conditional Control to Text-to-Image Diffusion Models
  Paper • 2302.05543 • Published • 42

- SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing
  Paper • 2404.05717 • Published • 24
- ByteEdit: Boost, Comply and Accelerate Generative Image Editing
  Paper • 2404.04860 • Published • 24
- MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?
  Paper • 2407.04842 • Published • 52

- Improving Text-to-Image Consistency via Automatic Prompt Optimization
  Paper • 2403.17804 • Published • 16
- Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation
  Paper • 2403.16990 • Published • 25
- Getting it Right: Improving Spatial Consistency in Text-to-Image Models
  Paper • 2404.01197 • Published • 30
- Condition-Aware Neural Network for Controlled Image Generation
  Paper • 2404.01143 • Published • 11

- How Far Are We from Intelligent Visual Deductive Reasoning?
  Paper • 2403.04732 • Published • 19
- MoAI: Mixture of All Intelligence for Large Language and Vision Models
  Paper • 2403.07508 • Published • 74
- DragAnything: Motion Control for Anything using Entity Representation
  Paper • 2403.07420 • Published • 13
- Learning and Leveraging World Models in Visual Representation Learning
  Paper • 2403.00504 • Published • 31

- RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization
  Paper • 2403.00483 • Published • 13
- OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
  Paper • 2403.01779 • Published • 28
- Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers
  Paper • 2401.11605 • Published • 22
- FiT: Flexible Vision Transformer for Diffusion Model
  Paper • 2402.12376 • Published • 48

- Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models
  Paper • 2312.09608 • Published • 13
- CodeFusion: A Pre-trained Diffusion Model for Code Generation
  Paper • 2310.17680 • Published • 70
- ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Real Image
  Paper • 2310.17994 • Published • 8
- Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss
  Paper • 2401.02677 • Published • 22

- One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
  Paper • 2306.07967 • Published • 24
- Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
  Paper • 2306.07954 • Published • 112
- TryOnDiffusion: A Tale of Two UNets
  Paper • 2306.08276 • Published • 72
- Seeing the World through Your Eyes
  Paper • 2306.09348 • Published • 33