Diffusion for World Modeling: Visual Details Matter in Atari • Paper 2405.12399 • Published 19 days ago
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation • Paper 2405.01434 • Published May 2
MoE papers to read • Collection (copied from MoE using https://huggingface.co/spaces/librarian-bots/collection_cloner) • 82 items • Updated 10 days ago
SliceGPT: Compress Large Language Models by Deleting Rows and Columns • Paper 2401.15024 • Published Jan 26
Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation • Paper 2403.16990 • Published Mar 25
TextCraftor: Your Text Encoder Can be Image Quality Controller • Paper 2403.18978 • Published Mar 27
Sora Generates Videos with Stunning Geometrical Consistency • Paper 2402.17403 • Published Feb 27
Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation • Paper 2401.15688 • Published Jan 28
Agile But Safe: Learning Collision-Free High-Speed Legged Locomotion • Paper 2401.17583 • Published Jan 31
Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization • Paper 2402.03161 • Published Feb 5
CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations • Paper 2402.04236 • Published Feb 6
Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models • Paper 2312.04410 • Published Dec 7, 2023
TokenFlow: Consistent Diffusion Features for Consistent Video Editing • Paper 2307.10373 • Published Jul 19, 2023