Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs Paper • 2503.02846 • Published 3 days ago • 18
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning Paper • 2502.06781 • Published 25 days ago • 60
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training Paper • 2501.11425 • Published Jan 20 • 92
CIBench: Evaluating Your LLMs with a Code Interpreter Plugin Paper • 2407.10499 • Published Jul 15, 2024
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher Paper • 2407.20183 • Published Jul 29, 2024 • 42
MultiModal-GPT: A Vision and Language Model for Dialogue with Humans Paper • 2305.04790 • Published May 8, 2023 • 1
T-Eval: Evaluating the Tool Utilization Capability Step by Step Paper • 2312.14033 • Published Dec 21, 2023 • 2
InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning Paper • 2402.06332 • Published Feb 9, 2024 • 20
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models Paper • 2403.12881 • Published Mar 19, 2024 • 17
AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data Paper • 2405.19265 • Published May 29, 2024
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions Paper • 2406.04325 • Published Jun 6, 2024 • 73