neonsign's Collections
Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models
Paper
•
2312.09608
•
Published
•
13
CodeFusion: A Pre-trained Diffusion Model for Code Generation
Paper
•
2310.17680
•
Published
•
69
ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Real Image
Paper
•
2310.17994
•
Published
•
8
Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss
Paper
•
2401.02677
•
Published
•
21
PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models
Paper
•
2401.05252
•
Published
•
45
InstantID: Zero-shot Identity-Preserving Generation in Seconds
Paper
•
2401.07519
•
Published
•
51
Towards A Better Metric for Text-to-Video Generation
Paper
•
2401.07781
•
Published
•
14
Quantum Denoising Diffusion Models
Paper
•
2401.07049
•
Published
•
12
SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers
Paper
•
2401.08740
•
Published
•
12
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects
Paper
•
2401.09962
•
Published
•
7
DiffusionGPT: LLM-Driven Text-to-Image Generation System
Paper
•
2401.10061
•
Published
•
27
ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation
Paper
•
2312.02201
•
Published
•
31
Clockwork Diffusion: Efficient Generation With Model-Step Distillation
Paper
•
2312.08128
•
Published
•
12
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
Paper
•
2401.11708
•
Published
•
29
Lumiere: A Space-Time Diffusion Model for Video Generation
Paper
•
2401.12945
•
Published
•
86
Large-scale Reinforcement Learning for Diffusion Models
Paper
•
2401.12244
•
Published
•
28
Diffuse to Choose: Enriching Image Conditioned Inpainting in Latent Diffusion Models for Virtual Try-All
Paper
•
2401.13795
•
Published
•
65
Deconstructing Denoising Diffusion Models for Self-Supervised Learning
Paper
•
2401.14404
•
Published
•
16
BootPIG: Bootstrapping Zero-shot Personalized Image Generation Capabilities in Pretrained Diffusion Models
Paper
•
2401.13974
•
Published
•
12
Transfer Learning for Text Diffusion Models
Paper
•
2401.17181
•
Published
•
14
Training-Free Consistent Text-to-Image Generation
Paper
•
2402.03286
•
Published
•
64
Paper
•
2402.03570
•
Published
•
7
λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space
Paper
•
2402.05195
•
Published
•
18
Implicit Diffusion: Efficient Optimization through Stochastic Sampling
Paper
•
2402.05468
•
Published
•
5
Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation
Paper
•
2402.10210
•
Published
•
29
Paper
•
2402.09470
•
Published
•
9
DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization
Paper
•
2402.09812
•
Published
•
12
Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation
Paper
•
2402.10491
•
Published
•
16
FiT: Flexible Vision Transformer for Diffusion Model
Paper
•
2402.12376
•
Published
•
48
DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation
Paper
•
2402.11929
•
Published
•
9
Paper
•
2402.13144
•
Published
•
94
MVDiffusion++: A Dense High-resolution Multi-view Diffusion Model for Single or Sparse-view 3D Object Reconstruction
Paper
•
2402.12712
•
Published
•
15
SDXL-Lightning: Progressive Adversarial Diffusion Distillation
Paper
•
2402.13929
•
Published
•
27
T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching
Paper
•
2402.14167
•
Published
•
10
Playground v2.5: Three Insights towards Enhancing Aesthetic Quality in Text-to-Image Generation
Paper
•
2402.17245
•
Published
•
10
Trajectory Consistency Distillation
Paper
•
2402.19159
•
Published
•
14
DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
Paper
•
2402.19481
•
Published
•
20
RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization
Paper
•
2403.00483
•
Published
•
12
StableDrag: Stable Dragging for Point-based Image Editing
Paper
•
2403.04437
•
Published
•
25
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
Paper
•
2403.04692
•
Published
•
40
Pix2Gif: Motion-Guided Diffusion for GIF Generation
Paper
•
2403.04634
•
Published
•
14
Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation
Paper
•
2403.12015
•
Published
•
64
AnimateDiff-Lightning: Cross-Model Diffusion Distillation
Paper
•
2403.12706
•
Published
•
17
Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation
Paper
•
2403.16990
•
Published
•
25
SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions
Paper
•
2403.16627
•
Published
•
20
FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image Editing
Paper
•
2403.18605
•
Published
•
7
Bigger is not Always Better: Scaling Properties of Latent Diffusion Models
Paper
•
2404.01367
•
Published
•
20
On the Scalability of Diffusion-based Text-to-Image Generation
Paper
•
2404.02883
•
Published
•
17
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation
Paper
•
2404.02733
•
Published
•
20
Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models
Paper
•
2404.02747
•
Published
•
11
Freditor: High-Fidelity and Transferable NeRF Editing by Frequency Decomposition
Paper
•
2404.02514
•
Published
•
9
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
Paper
•
2404.02905
•
Published
•
64
CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
Paper
•
2404.03653
•
Published
•
33
ByteEdit: Boost, Comply and Accelerate Generative Image Editing
Paper
•
2404.04860
•
Published
•
24
UniFL: Improve Stable Diffusion via Unified Feedback Learning
Paper
•
2404.05595
•
Published
•
23
BeyondScene: Higher-Resolution Human-Centric Scene Generation With Pretrained Diffusion
Paper
•
2404.04544
•
Published
•
20
Aligning Diffusion Models by Optimizing Human Utility
Paper
•
2404.04465
•
Published
•
13
Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models
Paper
•
2404.04478
•
Published
•
12
SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing
Paper
•
2404.05717
•
Published
•
24
RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion
Paper
•
2404.07199
•
Published
•
25
Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model
Paper
•
2404.09967
•
Published
•
20
Long-form music generation with latent diffusion
Paper
•
2404.10301
•
Published
•
24
EdgeFusion: On-Device Text-to-Image Generation
Paper
•
2404.11925
•
Published
•
21
Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis
Paper
•
2404.13686
•
Published
•
27
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models
Paper
•
2404.14507
•
Published
•
21
Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings
Paper
•
2404.16820
•
Published
•
15
Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation
Paper
•
2404.19752
•
Published
•
22
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation
Paper
•
2405.01434
•
Published
•
51
Customizing Text-to-Image Models with a Single Image Pair
Paper
•
2405.01536
•
Published
•
18
Diffusion for World Modeling: Visual Details Matter in Atari
Paper
•
2405.12399
•
Published
•
27
EM Distillation for One-step Diffusion Models
Paper
•
2405.16852
•
Published
•
10
Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling
Paper
•
2405.21048
•
Published
•
12
Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step
Paper
•
2406.04314
•
Published
•
26
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model
Paper
•
2406.04333
•
Published
•
36
MLCM: Multistep Consistency Distillation of Latent Diffusion Model
Paper
•
2406.05768
•
Published
•
8
AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising
Paper
•
2406.06911
•
Published
•
10
Interpreting the Weight Space of Customized Diffusion Models
Paper
•
2406.09413
•
Published
•
18
Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models
Paper
•
2406.09416
•
Published
•
28
Make It Count: Text-to-Image Generation with an Accurate Number of Objects
Paper
•
2406.10210
•
Published
•
76
Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models
Paper
•
2406.11831
•
Published
•
19
Not All Prompts Are Made Equal: Prompt-based Pruning of Text-to-Image Diffusion Models
Paper
•
2406.12042
•
Published
•
8
Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment
Paper
•
2406.12303
•
Published
•
4
Invertible Consistency Distillation for Text-Guided Image Editing in Around 7 Steps
Paper
•
2406.14539
•
Published
•
26
Repulsive Score Distillation for Diverse Sampling of Diffusion Models
Paper
•
2406.16683
•
Published
•
4
Aligning Diffusion Models with Noise-Conditioned Perception
Paper
•
2406.17636
•
Published
•
26
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion
Paper
•
2407.01392
•
Published
•
39
RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models
Paper
•
2407.06938
•
Published
•
21
Video Diffusion Alignment via Reward Gradients
Paper
•
2407.08737
•
Published
•
47
MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Paper
•
2407.08083
•
Published
•
27
Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models
Paper
•
2407.08701
•
Published
•
10
DistilDIRE: A Small, Fast, Cheap and Lightweight Diffusion Synthesized Deepfake Detection
Paper
•
2406.00856
•
Published
•
9
Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model
Paper
•
2407.16982
•
Published
•
40
BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular Depth Estimation
Paper
•
2407.17952
•
Published
•
29
Diffusion Feedback Helps CLIP See Better
Paper
•
2407.20171
•
Published
•
34
Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning
Paper
•
2407.20798
•
Published
•
23
Tora: Trajectory-oriented Diffusion Transformer for Video Generation
Paper
•
2407.21705
•
Published
•
25
TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models
Paper
•
2408.00735
•
Published
•
15
Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention
Paper
•
2408.00760
•
Published
•
5
ProCreate, Dont Reproduce! Propulsive Energy Diffusion for Creative Generation
Paper
•
2408.02226
•
Published
•
10
An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion
Paper
•
2408.03178
•
Published
•
36
Diffusion Models as Data Mining Tools
Paper
•
2408.02752
•
Published
•
13
Transformer Explainer: Interactive Learning of Text-Generative Models
Paper
•
2408.04619
•
Published
•
154
Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models
Paper
•
2408.04594
•
Published
•
14
Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion
Paper
•
2407.10973
•
Published
•
9
Visual Text Generation in the Wild
Paper
•
2407.14138
•
Published
•
8
Paper
•
2408.07009
•
Published
•
61
DC3DO: Diffusion Classifier for 3D Objects
Paper
•
2408.06693
•
Published
•
10
Paper
•
2408.07116
•
Published
•
19