Training Software Engineering Agents and Verifiers with SWE-Gym • Paper • 2412.21139 • Published 19 days ago • 21
Law of the Weakest Link: Cross Capabilities of Large Language Models • Paper • 2409.19951 • Published Sep 30, 2024 • 54
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents • Paper • 2407.16741 • Published Jul 23, 2024 • 70
Advancing LLM Reasoning Generalists with Preference Trees • Paper • 2404.02078 • Published Apr 2, 2024 • 44
MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback • Paper • 2309.10691 • Published Sep 19, 2023 • 4