World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning • Paper • 2503.10480 • Published 5 days ago • 44
GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training • Paper • 2503.08525 • Published 7 days ago • 14
Query of CC: Unearthing Large Scale Domain-Specific Knowledge from Public Corpora • Paper • 2401.14624 • Published Jan 26, 2024 • 1