Gen AI Diffusion - a Stalin16 Collection

Gen AI Diffusion

updated 1 day ago

Animate-X: Universal Character Image Animation with Enhanced Motion Representation

Paper • 2410.10306 • Published Oct 14 • 53
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning

Paper • 2411.05003 • Published Nov 7 • 70
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation

Paper • 2411.04709 • Published Nov 5 • 25
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation

Paper • 2410.07171 • Published Oct 9 • 41
Story-Adapter: A Training-free Iterative Framework for Long Story Visualization

Paper • 2410.06244 • Published Oct 8 • 19
How Far is Video Generation from World Model: A Physical Law Perspective

Paper • 2411.02385 • Published Nov 4 • 33
Training-free Regional Prompting for Diffusion Transformers

Paper • 2411.02395 • Published Nov 4 • 25
AutoVFX: Physically Realistic Video Editing from Natural Language Instructions

Paper • 2411.02394 • Published Nov 4 • 17
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders

Paper • 2410.22366 • Published Oct 28 • 75
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer

Paper • 2410.10812 • Published Oct 14 • 15
DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control

Paper • 2410.13830 • Published Oct 17 • 23
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

Paper • 2411.05007 • Published Nov 7 • 16
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models

Paper • 2411.07232 • Published Nov 11 • 62
OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision

Paper • 2411.07199 • Published Nov 11 • 44
MagicQuill: An Intelligent Interactive Image Editing System

Paper • 2411.09703 • Published about 1 month ago • 57
AnimateAnything: Consistent and Controllable Animation for Video Generation

Paper • 2411.10836 • Published 29 days ago • 23
Stylecodes: Encoding Stylistic Information For Image Generation

Paper • 2411.12811 • Published 26 days ago • 11
VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement

Paper • 2411.15115 • Published 23 days ago • 9
Style-Friendly SNR Sampler for Style-Driven Generation

Paper • 2411.14793 • Published 23 days ago • 36
OminiControl: Minimal and Universal Control for Diffusion Transformer

Paper • 2411.15098 • Published 23 days ago • 51
OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows

Paper • 2412.01169 • Published 13 days ago • 10
SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance

Paper • 2412.02687 • Published 12 days ago • 108
NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training

Paper • 2412.02030 • Published 13 days ago • 17
MotionShop: Zero-Shot Motion Transfer in Video Diffusion Models with Mixture of Score Guidance

Paper • 2412.05355 • Published 9 days ago • 7
Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis

Paper • 2412.04431 • Published 10 days ago • 16
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

Paper • 2412.07589 • Published 5 days ago • 41
FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion Models

Paper • 2412.07674 • Published 5 days ago • 20
UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics

Paper • 2412.07774 • Published 5 days ago • 21
LoRA.rar: Learning to Merge LoRAs via Hypernetworks for Subject-Style Conditioned Image Generation

Paper • 2412.05148 • Published 9 days ago • 11
ObjCtrl-2.5D: Training-free Object Control with Camera Poses

Paper • 2412.07721 • Published 5 days ago • 8
StyleMaster: Stylize Your Video with Artistic Generation and Translation

Paper • 2412.07744 • Published 5 days ago • 16
FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models

Paper • 2412.08629 • Published 4 days ago • 10
DisPose: Disentangling Pose Guidance for Controllable Human Image Animation

Paper • 2412.09349 • Published 3 days ago • 5
LAION-SG: An Enhanced Large-Scale Dataset for Training Complex Image-Text Models with Structural Annotations

Paper • 2412.08580 • Published 4 days ago • 37
EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM

Paper • 2412.09618 • Published 3 days ago • 20
LoRACLR: Contrastive Adaptation for Customization of Diffusion Models

Paper • 2412.09622 • Published 3 days ago • 7