DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation Paper • 2503.06053 • Published Mar 8 • 138
Scaling Synthetic Data Creation with 1,000,000,000 Personas Paper • 2406.20094 • Published Jun 28, 2024 • 102
CoSTAast: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing Paper • 2503.10613 • Published Mar 13 • 79
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published Mar 6 • 94
AudioX: Diffusion Transformer for Anything-to-Audio Generation Paper • 2503.10522 • Published Mar 13 • 24
Frac-Connections: Fractional Extension of Hyper-Connections Paper • 2503.14125 • Published Mar 18 • 20