DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 116
Executable Code Actions Elicit Better LLM Agents Paper • 2402.01030 • Published Feb 1, 2024 • 125
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published Feb 20 • 174
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2 Paper • 2502.03544 • Published Feb 5 • 44
Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models Paper • 2501.13629 • Published Jan 23 • 48
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents Paper • 2407.16741 • Published Jul 23, 2024 • 72
MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers Paper • 2305.07185 • Published May 12, 2023 • 9