Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources Paper • 2504.00595 • Published 5 days ago • 33
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 Paper • 2503.24376 • Published 6 days ago • 35
Any2Caption:Interpreting Any Condition to Caption for Controllable Video Generation Paper • 2503.24379 • Published 6 days ago • 68
Think Before Recommend: Unleashing the Latent Reasoning Power for Sequential Recommendation Paper • 2503.22675 • Published 9 days ago • 32
MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search Paper • 2503.20757 • Published 11 days ago • 9
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation Paper • 2503.20672 • Published 11 days ago • 13
Open Deep Search: Democratizing Search with Open-source Reasoning Agents Paper • 2503.20201 • Published 11 days ago • 42
ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation Paper • 2503.21729 • Published 10 days ago • 26
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness Paper • 2503.21755 • Published 10 days ago • 30
UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement Learning Paper • 2503.21620 • Published 10 days ago • 55