DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published 23 days ago • 116 • 5
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Paper • 2502.18449 • Published Feb 25 • 73 • 5
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 379 • 5
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 275 • 42
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 97 • 8
Training Large Language Models to Reason in a Continuous Latent Space Paper • 2412.06769 • Published Dec 9, 2024 • 82 • 7
Hymba: A Hybrid-head Architecture for Small Language Models Paper • 2411.13676 • Published Nov 20, 2024 • 45 • 3
LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models Paper • 2411.09595 • Published Nov 14, 2024 • 76 • 4
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters Paper • 2410.23168 • Published Oct 30, 2024 • 24 • 5
Writing in the Margins: Better Inference Pattern for Long Context Retrieval Paper • 2408.14906 • Published Aug 27, 2024 • 143
Writing in the Margins: Better Inference Pattern for Long Context Retrieval Paper • 2408.14906 • Published Aug 27, 2024 • 143 • 11
Mixture of Nested Experts: Adaptive Processing of Visual Tokens Paper • 2407.19985 • Published Jul 29, 2024 • 37 • 4
Mixture-of-Agents Enhances Large Language Model Capabilities Paper • 2406.04692 • Published Jun 7, 2024 • 60
Mixture-of-Agents Enhances Large Language Model Capabilities Paper • 2406.04692 • Published Jun 7, 2024 • 60 • 3