DeepSeek LLM: Scaling Open-Source Language Models with Longtermism • Paper • 2401.02954 • Published Jan 5 • 39
Perspectives on the State and Future of Deep Learning -- 2023 • Paper • 2312.09323 • Published Dec 7, 2023 • 5
Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization • Paper • 2405.15071 • Published May 23 • 33
Sibyl: Simple yet Effective Agent Framework for Complex Real-world Reasoning • Paper • 2407.10718 • Published 7 days ago • 12
LAB-Bench: Measuring Capabilities of Language Models for Biology Research • Paper • 2407.10362 • Published 8 days ago • 4