A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation
Paper • 2310.16656 • Published • 40
CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images
Paper • 2310.16825 • Published • 32
Matryoshka Diffusion Models
Paper • 2310.15111 • Published • 40
I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models
Paper • 2311.04145 • Published • 32
LCM-LoRA: A Universal Stable-Diffusion Acceleration Module
Paper • 2311.05556 • Published • 81
The Chosen One: Consistent Characters in Text-to-Image Diffusion Models
Paper • 2311.10093 • Published • 57
AutoStory: Generating Diverse Storytelling Images with Minimal Human Effort
Paper • 2311.11243 • Published • 14
NeuroPrompts: An Adaptive Framework to Optimize Prompts for Text-to-Image Generation
Paper • 2311.12229 • Published • 26
MagicDance: Realistic Human Dance Video Generation with Motions & Facial Expressions Transfer
Paper • 2311.12052 • Published • 32
GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning
Paper • 2311.12631 • Published • 13
VideoBooth: Diffusion-based Video Generation with Image Prompts
Paper • 2312.00777 • Published • 21
DreamVideo: Composing Your Dream Videos with Customized Subject and Motion
Paper • 2312.04433 • Published • 9
Clockwork Diffusion: Efficient Generation With Model-Step Distillation
Paper • 2312.08128 • Published • 12
StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation
Paper • 2312.12491 • Published • 69
DreamTuner: Single Image is Enough for Subject-Driven Generation
Paper • 2312.13691 • Published • 26
I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Models
Paper • 2312.16693 • Published • 13
VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM
Paper • 2401.01256 • Published • 19
Improving Diffusion-Based Image Synthesis with Context Prediction
Paper • 2401.02015 • Published • 6
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
Paper • 2401.02954 • Published • 41
Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation
Paper • 2401.05675 • Published • 22
Object-Centric Diffusion for Efficient Video Editing
Paper • 2401.05735 • Published • 7
PALP: Prompt Aligned Personalization of Text-to-Image Models
Paper • 2401.06105 • Published • 47
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers
Paper • 2401.11605 • Published • 22
Learning Continuous 3D Words for Text-to-Image Generation
Paper • 2402.08654 • Published • 10
PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models
Paper • 2402.08714 • Published • 11
FiT: Flexible Vision Transformer for Diffusion Model
Paper • 2402.12376 • Published • 48
Paper • 2402.13144 • Published • 95
DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
Paper • 2402.19481 • Published • 20
StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control
Paper • 2403.09055 • Published • 24
CosmoCLIP: Generalizing Large Vision-Language Models for Astronomical Imaging
Paper • 2407.07315 • Published • 6