GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation Paper • 2411.08033 • Published 16 days ago • 21
MagicQuill: An Intelligent Interactive Image Editing System Paper • 2411.09703 • Published 14 days ago • 56
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models Paper • 2411.04996 • Published 21 days ago • 48
FrugalNeRF: Fast Convergence for Few-shot Novel View Synthesis without Learned Priors Paper • 2410.16271 • Published Oct 21 • 80
DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control Paper • 2410.13830 • Published Oct 17 • 23
Animate-X: Universal Character Image Animation with Enhanced Motion Representation Paper • 2410.10306 • Published Oct 14 • 53
MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling Paper • 2409.16160 • Published Sep 24 • 32
Jamba-1.5 Collection The AI21 Jamba family of models are state-of-the-art, hybrid SSM-Transformer instruction-following foundation models • 2 items • Updated Aug 22 • 82
Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing Paper • 2409.01322 • Published Sep 2 • 94
VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters Paper • 2408.17253 • Published Aug 30 • 36
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling Paper • 2408.16532 • Published Aug 29 • 47
TrackGo: A Flexible and Efficient Method for Controllable Video Generation Paper • 2408.11475 • Published Aug 21 • 17
PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation Paper • 2408.07547 • Published Aug 14 • 7
Tora: Trajectory-oriented Diffusion Transformer for Video Generation Paper • 2407.21705 • Published Jul 31 • 25