2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published 12 days ago • 93
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs Paper • 2412.21187 • Published 14 days ago • 34
INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning Paper • 2401.06532 • Published Jan 12, 2024 • 12
Progressive Multimodal Reasoning via Active Retrieval Paper • 2412.14835 • Published 25 days ago • 72
Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation Paper • 2406.18676 • Published Jun 26, 2024 • 6
Smaller Language Models Are Better Instruction Evolvers Paper • 2412.11231 • Published 29 days ago • 27
IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations Paper • 2412.12083 • Published 28 days ago • 12
RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation Paper • 2412.11919 • Published 28 days ago • 33
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 113
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems Paper • 2411.02959 • Published Nov 5, 2024 • 66
CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation Paper • 2410.23090 • Published Oct 30, 2024 • 54
InternVL2.0 Collection Expanding Performance Boundaries of Open-Source MLLM • 15 items • Updated 4 days ago • 88
Toward General Instruction-Following Alignment for Retrieval-Augmented Generation Paper • 2410.09584 • Published Oct 12, 2024 • 47