RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints Paper • 2503.16408 • Published 14 days ago • 39
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning Paper • 2503.15558 • Published 16 days ago • 44
DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning Paper • 2503.15265 • Published 16 days ago • 44
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published 16 days ago • 112
RWKV-7 "Goose" with Expressive Dynamic State Evolution Paper • 2503.14456 • Published 16 days ago • 130
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video Paper • 2503.11647 • Published 20 days ago • 125
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published 22 days ago • 67
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia Paper • 2503.07920 • Published 24 days ago • 95
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Paper • 2503.03601 • Published 29 days ago • 222
BEHAVIOR Robot Suite: Streamlining Real-World Whole-Body Manipulation for Everyday Household Activities Paper • 2503.05652 • Published 27 days ago • 10
R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning Paper • 2503.05379 • Published 28 days ago • 33
R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model Paper • 2503.05132 • Published 28 days ago • 52
Unified Reward Model for Multimodal Understanding and Generation Paper • 2503.05236 • Published 28 days ago • 112
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published 29 days ago • 89
Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers Paper • 2503.00865 • Published Mar 2 • 61