Update README.md
README.md CHANGED
@@ -7,11 +7,10 @@ sdk: static
 pinned: false
 ---
 
-# ProgramBench
+# ProgramBench: Can Language Models Rebuild Programs From Scratch?
 
-
-
-Given only a compiled binary and its documentation, AI agents must architect and implement a complete codebase that reproduces the original program's behavior. ProgramBench evaluates this capability across 200 real-world open-source projects spanning Rust, Go, C, C++, Haskell, and Java.
+Given only a compiled binary and its documentation, AI agents must architect and implement a complete codebase that reproduces the original program's behavior.
+ProgramBench evaluates this capability across 200 real-world open-source projects spanning Rust, Go, C, C++, Haskell, and Java.
 
 ## Links
 