DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published 14 days ago • 110
Towards Unified Latent Space for 3D Molecular Latent Diffusion Modeling Paper • 2503.15567 • Published 13 days ago • 6
RWKV-7 "Goose" with Expressive Dynamic State Evolution Paper • 2503.14456 • Published 14 days ago • 131
LLM4SR: A Survey on Large Language Models for Scientific Research Paper • 2501.04306 • Published Jan 8 • 36
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 272
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published Jan 8 • 95
Large Action Models: From Inception to Implementation Paper • 2412.10047 • Published Dec 13, 2024 • 34
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? Paper • 2411.16489 • Published Nov 25, 2024 • 48
Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization Paper • 2405.15071 • Published May 23, 2024 • 40
SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents Paper • 2403.08715 • Published Mar 13, 2024 • 21