Commit 62970d7 · 1 Parent(s): 24cf163
Add organization card
README.md CHANGED
@@ -1,10 +1,21 @@
 ---
 title: README
-emoji:
-colorFrom:
-colorTo:
+emoji: 🦊
+colorFrom: yellow
+colorTo: red
 sdk: static
 pinned: false
 ---
 
-
+# ProgramBench
+
+**Can Language Models Rebuild Programs From Scratch?**
+
+Given only a compiled binary and its documentation, AI agents must architect and implement a complete codebase that reproduces the original program's behavior. ProgramBench evaluates this capability across 200 real-world open-source projects spanning Rust, Go, C, C++, Haskell, and Java.
+
+## Links
+
+- [Website & Leaderboard](https://programbench.com)
+- [Paper](https://programbench.com/static/paper.pdf)
+- [GitHub](https://github.com/facebookresearch/ProgramBench)
+- [Documentation](https://programbench.com/more)