Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme Paper • 2504.02587 • Published 13 days ago • 30
PaperBench: Evaluating AI's Ability to Replicate AI Research Paper • 2504.01848 • Published 14 days ago • 34
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 Paper • 2503.24376 • Published 16 days ago • 37
Expanding RL with Verifiable Rewards Across Diverse Domains Paper • 2503.23829 • Published 16 days ago • 18
What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models Paper • 2503.24235 • Published 16 days ago • 52
What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models Paper • 2503.24235 • Published 16 days ago • 52
CLS-RL: Image Classification with Rule-Based Reinforcement Learning Paper • 2503.16188 • Published 27 days ago • 9
Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning Paper • 2503.16252 • Published 27 days ago • 27
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published 29 days ago • 117
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization Paper • 2503.12937 • Published about 1 month ago • 27
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning Paper • 2503.10291 • Published Mar 13 • 34
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning Paper • 2503.09516 • Published Mar 12 • 27
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond Paper • 2503.10460 • Published Mar 13 • 27
Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning Paper • 2503.07572 • Published Mar 10 • 41
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL Paper • 2503.07536 • Published Mar 10 • 84
MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning Paper • 2503.07365 • Published Mar 10 • 56
Big-Math Collection This collection contains assets associated with the Big-Math dataset, a high-quality collection of over 250,000 math questions with verifiable answers • 4 items • Updated about 15 hours ago • 4