MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14 • 274
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 335
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published Jan 1 • 100
The Lessons of Developing Process Reward Models in Mathematical Reasoning Paper • 2501.07301 • Published Jan 13 • 92
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models Paper • 2501.03262 • Published Jan 4 • 90
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training Paper • 2501.11425 • Published Jan 20 • 92
Search-o1: Agentic Search-Enhanced Large Reasoning Models Paper • 2501.05366 • Published Jan 9 • 95
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis Paper • 2412.19723 • Published Dec 27, 2024 • 82
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published Jan 22 • 83