SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering? Paper • 2502.12115 • Published 13 days ago • 42
TransMLA: Multi-head Latent Attention Is All You Need Paper • 2502.07864 • Published 19 days ago • 45
Ignore the KL Penalty! Boosting Exploration on Critical Tokens to Enhance RL Fine-Tuning Paper • 2502.06533 • Published 20 days ago • 18