Update README.md
README.md
CHANGED
@@ -1,4 +1,4 @@
-<h1 align="center"> PyBench: Evaluate LLM Agent on Real World Tasks </h1>
+<h1 align="center"> PyBench: Evaluate LLM Agent on Real World Coding Tasks </h1>
 
 <p align="center">
 <a href="https://arxiv.org/abs/2407.16732">π Paper</a>
@@ -7,10 +7,11 @@
 •
 <a href="https://huggingface.co/Mercury7353/PyLlama3" >🤗 Model (PyLlama3)</a>
 •
-<a href=" https://github.com/Mercury7353/PyBench" > Code </a>
+<a href=" https://github.com/Mercury7353/PyBench" > πCode </a>
 •
 </p>
 
+This is the PyLlama3 model, fine-tuned for <a href=" https://github.com/Mercury7353/PyBench" > PyBench </a>.
 
 PyBench is a comprehensive benchmark evaluating LLM on real-world coding tasks including **chart analysis**, **text analysis**, **image/ audio editing**, **complex math** and **software/website development**.
 We collect files from Kaggle, arXiv, and other sources and automatically generate queries according to the type and content of each file.