Spaces:

SWE-Gym
/

README

Running

File size: 3,433 Bytes

---
title: README
emoji: 💻
colorFrom: indigo
colorTo: purple
sdk: static
pinned: false
---
Model/Data associated with Paper:

<h1 align="center"> Training Software Engineering Agents and Verifiers with SWE-Gym </h1>

<p align="center">
  <a href="https://www.jiayipan.com/" style="text-decoration: none;">Jiayi Pan<sup>*,1</sup></a>, 
  <a href="https://xwang.dev/" style="text-decoration: none;">Xingyao Wang<sup>*,2</sup></a>,
  <a href="https://www.phontron.com/" style="text-decoration: none;">Graham Neubig<sup>3</sup></a>,
  <a href="https://www.cs.toronto.edu/~ndjaitly/" style="text-decoration: none;">Navdeep Jaitly<sup>4</sup></a>,
  <a href="https://blender.cs.illinois.edu/hengji.html" style="text-decoration: none;">Heng Ji<sup>2</sup></a>,
  <a href="https://www.alanesuhr.com/" style="text-decoration: none;">Alane Suhr<sup>^,1</sup></a>,
  <a href="https://dreasysnail.github.io/" style="text-decoration: none;">Yizhe Zhang<sup>^,4</sup></a>
</p>

<p align="center">
  <sup>1</sup>UC Berkeley, <sup>2</sup>UIUC, <sup>3</sup>CMU, <sup>4</sup>Apple </br>
  <sub><sup>*</sup>Equal contribution, <sup>^</sup>Equal supervision</sub>
</p>

<p align="center">
<a href="https://github.com/SWE-Gym/SWE-Gym">💻 Code </a>
•
<a href="https://arxiv.org/abs/2412.21139">📃 Paper</a>
•
<a href="https://huggingface.co/SWE-Gym" >🤗 Data & Models</a>
</p>

We present **SWE-Gym**, the first environment for training real-world software engineering agents.
We use it to train strong LM agents that achieve state-of-the-art open results on SWE-Bench, with early, promising scaling characteristics as we increase training and inference-time compute.

<p align="center">
  <img src="https://github.com/SWE-Gym/SWE-Gym/raw/main/assets/images/teaser.jpg" width="100%" alt="teaser">
</p>


---

Progress in agents for software engineering has been limited by the lack of training environments that both include rigorous verification for reinforcement learning and cover the expansive tasks encountered in real-world repository-level engineering.

We introduce SWE-Gym: An Open Environment for Training Software Engineering Agents & Verifiers.
Our baselines achieve new open SOTA - 32%/26% on SWE-Bench Verified/Lite, with promising scaling trends.

![SWE-Gym Scaling](https://github.com/SWE-Gym/SWE-Gym/raw/main/assets/images/scaling.jpg)
*SWE-Gym enables scalable improvements for software engineering agents at both training and inference time. Our current results is primarity bottlenecked by training and inference compute, rather than the size of our environment.*

## Reproducing Results

Please refer to our [Github Repo](https://github.com/SWE-Gym/SWE-Gym) for more details: See [docs/OpenHands.md](https://github.com/SWE-Gym/SWE-Gym/tree/main/docs/OpenHands.md) and [docs/MoatlessTools.md](https://github.com/SWE-Gym/SWE-Gym/tree/main/docs/MoatlessTools.md) for instructions on reproducing results with our training and inference-time results for OpenHands and MoatlessTools agents.

## 📚 Citation

```bibtex
@misc{pan2024trainingsoftwareengineeringagents,
      title={Training Software Engineering Agents and Verifiers with SWE-Gym}, 
      author={Jiayi Pan and Xingyao Wang and Graham Neubig and Navdeep Jaitly and Heng Ji and Alane Suhr and Yizhe Zhang},
      year={2024},
      eprint={2412.21139},
      archivePrefix={arXiv},
      primaryClass={cs.SE},
      url={https://arxiv.org/abs/2412.21139}, 
}
```