Being-H0.5: Scaling Human-Centric Robot Learning for Cross-Embodiment Generalization • Paper • 2601.12993 • Published 3 days ago • 68
METIS: Mentoring Engine for Thoughtful Inquiry & Solutions • Paper • 2601.13075 • Published 3 days ago • 1
FantasyVLN: Unified Multimodal Chain-of-Thought Reasoning for Vision-Language Navigation • Paper • 2601.13976 • Published 2 days ago • 10
LightOnOCR: A 1B End-to-End Multilingual Vision-Language Model for State-of-the-Art OCR • Paper • 2601.14251 • Published 1 day ago • 8
OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer • Paper • 2601.14250 • Published 1 day ago • 34
Toward Efficient Agents: Memory, Tool learning, and Planning • Paper • 2601.14192 • Published 1 day ago • 33
FutureOmni: Evaluating Future Forecasting from Omni-Modal Context for Multimodal LLMs • Paper • 2601.13836 • Published 2 days ago • 28
ToolPRMBench: Evaluating and Advancing Process Reward Models for Tool-using Agents • Paper • 2601.12294 • Published 4 days ago • 14
MemoryRewardBench: Benchmarking Reward Models for Long-Term Memory Management in Large Language Models • Paper • 2601.11969 • Published 5 days ago • 26
Aligning Agentic World Models via Knowledgeable Experience Learning • Paper • 2601.13247 • Published 3 days ago • 13
UniX: Unifying Autoregression and Diffusion for Chest X-Ray Understanding and Generation • Paper • 2601.11522 • Published 6 days ago • 17
Language of Thought Shapes Output Diversity in Large Language Models • Paper • 2601.11227 • Published 6 days ago • 4
More Images, More Problems? A Controlled Analysis of VLM Failure Modes • Paper • 2601.07812 • Published 10 days ago • 5
PersonalAlign: Hierarchical Implicit Intent Alignment for Personalized GUI Agent with Long-Term User-Centric Records • Paper • 2601.09636 • Published 8 days ago • 6
BAPO: Boundary-Aware Policy Optimization for Reliable Agentic Search • Paper • 2601.11037 • Published 6 days ago • 15