OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts • Paper • 2503.22952 • Published 12 days ago • 18
Understanding R1-Zero-Like Training: A Critical Perspective • Paper • 2503.20783 • Published 14 days ago • 35
When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoning • Paper • 2504.01005 • Published 8 days ago • 15
Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models • Paper • 2503.22165 • Published 13 days ago • 26
Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems? • Paper • 2504.00509 • Published 8 days ago • 21
MixerMDM: Learnable Composition of Human Motion Diffusion Models • Paper • 2504.01019 • Published 8 days ago • 17
Expanding RL with Verifiable Rewards Across Diverse Domains • Paper • 2503.23829 • Published 9 days ago • 18
TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization • Paper • 2503.19901 • Published 15 days ago • 34
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization • Paper • 2504.00999 • Published 8 days ago • 77