SciCode: A Research Coding Benchmark Curated by Scientists Paper • 2407.13168 • Published Jul 18, 2024 • 14
SWE-bench: Can Language Models Resolve Real-World GitHub Issues? Paper • 2310.06770 • Published Oct 10, 2023 • 4