Trajectory Attention for Fine-grained Video Motion Control • Paper • 2411.19324 • Published 7 days ago • 12
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model • Paper • 2411.19108 • Published 7 days ago • 15
On Domain-Specific Post-Training for Multimodal Large Language Models • Paper • 2411.19930 • Published 6 days ago • 23
Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS • Paper • 2411.18478 • Published 8 days ago • 26
OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding • Paper • 2406.19389 • Published Jun 27 • 52