- ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations
  Paper • 2312.04655 • Published • 19
- FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition
  Paper • 2312.07536 • Published • 15
- Clockwork Diffusion: Efficient Generation With Model-Step Distillation
  Paper • 2312.08128 • Published • 11
- CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor
  Paper • 2312.07661 • Published • 14
Collections including paper arxiv:2401.06105

- StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation
  Paper • 2312.12491 • Published • 66
- Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
  Paper • 2401.11708 • Published • 27
- Training-Free Consistent Text-to-Image Generation
  Paper • 2402.03286 • Published • 62
- PALP: Prompt Aligned Personalization of Text-to-Image Models
  Paper • 2401.06105 • Published • 46

- DeepCache: Accelerating Diffusion Models for Free
  Paper • 2312.00858 • Published • 20
- HiFi Tuner: High-Fidelity Subject-Driven Fine-Tuning for Diffusion Models
  Paper • 2312.00079 • Published • 14
- Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models
  Paper • 2312.04410 • Published • 14
- SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing
  Paper • 2312.11392 • Published • 18

- DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
  Paper • 2208.12242 • Published • 7
- IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models
  Paper • 2308.06721 • Published • 24
- h94/IP-Adapter-FaceID
  Text-to-Image • Updated • 611k • 1.36k
- PALP: Prompt Aligned Personalization of Text-to-Image Models
  Paper • 2401.06105 • Published • 46

- One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
  Paper • 2306.07967 • Published • 23
- Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
  Paper • 2306.07954 • Published • 111
- TryOnDiffusion: A Tale of Two UNets
  Paper • 2306.08276 • Published • 71
- Seeing the World through Your Eyes
  Paper • 2306.09348 • Published • 30

- FreeU: Free Lunch in Diffusion U-Net
  Paper • 2309.11497 • Published • 63
- Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models
  Paper • 2311.12092 • Published • 19
- ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs
  Paper • 2311.13600 • Published • 41
- PALP: Prompt Aligned Personalization of Text-to-Image Models
  Paper • 2401.06105 • Published • 46

- A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation
  Paper • 2310.16656 • Published • 37
- CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images
  Paper • 2310.16825 • Published • 28
- Matryoshka Diffusion Models
  Paper • 2310.15111 • Published • 39
- I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models
  Paper • 2311.04145 • Published • 30

- DreamLLM: Synergistic Multimodal Comprehension and Creation
  Paper • 2309.11499 • Published • 57
- FoleyGen: Visually-Guided Audio Generation
  Paper • 2309.10537 • Published • 6
- Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
  Paper • 2310.11441 • Published • 25
- The Chosen One: Consistent Characters in Text-to-Image Diffusion Models
  Paper • 2311.10093 • Published • 54

- Large Language Models as Optimizers
  Paper • 2309.03409 • Published • 72
- Natural Language Supervision for General-Purpose Audio Representations
  Paper • 2309.05767 • Published • 7
- Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers
  Paper • 2309.08532 • Published • 50
- AudioSR: Versatile Audio Super-resolution at Scale
  Paper • 2309.07314 • Published • 23