Add description to card metadata
This metric implements the evaluation harness for the HumanEval problem solving dataset
described in the paper "Evaluating Large Language Models Trained on Code"
(https://arxiv.org/abs/2107.03374).
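For reference, the headline score this harness reports is pass@k. The cited paper estimates it without bias from n sampled candidates per problem, c of which pass the unit tests:

$$
\text{pass@}k \;=\; \mathbb{E}_{\text{problems}}\!\left[\,1 - \frac{\binom{n-c}{k}}{\binom{n}{k}}\,\right]
$$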
README.md CHANGED
@@ -1,6 +1,6 @@
 ---
 title: Code Eval
-emoji: 🤗
+emoji: 🤗
 colorFrom: blue
 colorTo: red
 sdk: gradio
@@ -8,10 +8,16 @@ sdk_version: 3.0.2
 app_file: app.py
 pinned: false
 tags:
-- evaluate
-- metric
----
+- evaluate
+- metric
+description: >-
+  This metric implements the evaluation harness for the HumanEval problem
+  solving dataset
+
+  described in the paper "Evaluating Large Language Models Trained on Code"
 
+  (https://arxiv.org/abs/2107.03374).
+---
 # Metric Card for Code Eval
 
 ## Metric description
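As context for the card above, here is a minimal sketch of using the metric through the `evaluate` library. The toy `add` problem and the printed value are illustrative, not taken from the HumanEval dataset; the `HF_ALLOW_CODE_EVAL` opt-in is required because the metric executes model-generated code.

```python
import os

import evaluate

# code_eval runs untrusted generated code, so it must be enabled explicitly.
os.environ["HF_ALLOW_CODE_EVAL"] = "1"

code_eval = evaluate.load("code_eval")

# One problem: the reference is a test string, the prediction is a list of
# candidate solutions for that problem.
test_cases = ["assert add(2, 3) == 5"]
candidates = [["def add(a, b):\n    return a + b"]]

pass_at_k, results = code_eval.compute(
    references=test_cases,
    predictions=candidates,
    k=[1],
)
print(pass_at_k)  # e.g. {'pass@1': 1.0}
```

With more candidates per problem (and larger `k`), the same call returns the unbiased pass@k estimate described above.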