DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation Paper • 2503.06053 • Published Mar 8 • 136
nvidia/Llama-Nemotron-Post-Training-Dataset Viewer • Updated about 22 hours ago • 3.91M • 5.24k • 405
Scaling Synthetic Data Creation with 1,000,000,000 Personas Paper • 2406.20094 • Published Jun 28, 2024 • 102
CoSTA*: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing Paper • 2503.10613 • Published Mar 13 • 77
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published Mar 6 • 93
AudioX: Diffusion Transformer for Anything-to-Audio Generation Paper • 2503.10522 • Published Mar 13 • 22
Aligning Multimodal LLM with Human Preference: A Survey Paper • 2503.14504 • Published 30 days ago • 22
Frac-Connections: Fractional Extension of Hyper-Connections Paper • 2503.14125 • Published about 1 month ago • 19