Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources Paper • 2504.00595 • Published 6 days ago • 33
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 Paper • 2503.24376 • Published 7 days ago • 35
Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation Paper • 2503.24379 • Published 7 days ago • 68
Think Before Recommend: Unleashing the Latent Reasoning Power for Sequential Recommendation Paper • 2503.22675 • Published 10 days ago • 34
MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search Paper • 2503.20757 • Published 12 days ago • 9
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation Paper • 2503.20672 • Published 12 days ago • 13
Open Deep Search: Democratizing Search with Open-source Reasoning Agents Paper • 2503.20201 • Published 13 days ago • 42
ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation Paper • 2503.21729 • Published 11 days ago • 27
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness Paper • 2503.21755 • Published 11 days ago • 31
UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement Learning Paper • 2503.21620 • Published 11 days ago • 56
φ-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation Paper • 2503.13288 • Published 21 days ago • 49
Using Mechanistic Interpretability to Craft Adversarial Attacks against Large Language Models Paper • 2503.06269 • Published about 1 month ago • 4
From TOWER to SPIRE: Adding the Speech Modality to a Text-Only LLM Paper • 2503.10620 • Published 25 days ago • 6
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video Paper • 2503.11647 • Published 24 days ago • 131
Quantization for OpenAI's Whisper Models: A Comparative Analysis Paper • 2503.09905 • Published 26 days ago • 6
Do I look like a `cat.n.01` to you? A Taxonomy Image Generation Benchmark Paper • 2503.10357 • Published 25 days ago • 11