README.md
The Code Eval metric calculates how good predictions are given a set of references.
`references`: a list with a **function call** for each prediction. Each **function call** should print a string to stdout.

`output`: a list of the expected output for each prediction.

`k`: the number of code candidates to consider in the evaluation. The default value is `[1, 10, 100]`.
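To make the `references` convention concrete, here is a minimal sketch of how an output-based check can work: each candidate is concatenated with its reference **function call**, executed in a subprocess, and its stdout is compared against the expected output. The function name `check_candidate` is hypothetical, for illustration only; the metric's actual internals may differ.

```python
import subprocess
import sys


def check_candidate(candidate: str, reference: str, expected_output: str) -> bool:
    """Run candidate code followed by the reference call and compare stdout.

    Hypothetical helper illustrating the references/output convention;
    not the metric's actual implementation.
    """
    program = candidate + "\n" + reference + "\n"
    # Execute in a fresh interpreter; `-c` runs the code as __main__.
    result = subprocess.run(
        [sys.executable, "-c", program],
        capture_output=True,
        text=True,
        timeout=10,  # guard against candidates that hang
    )
    return result.stdout.strip() == expected_output


candidate = "def add(a, b):\n return a+b"
reference = "if __name__ == \"__main__\":\n print(add(2, 3))"
print(check_candidate(candidate, reference, "5"))  # True
```

A candidate that prints the wrong value (e.g. `a*b` instead of `a+b`) fails the same check, which is exactly the distinction the example below exercises.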
```python
from evaluate import load

code_eval_outputs = load("giulio98/code_eval_outputs")
references = ["if __name__ == \"__main__\":\n print(add(2, 3))"]
expected_outputs = ["5"]
candidates = [["def add(a,b):\n return a*b", "def add(a, b):\n return a+b"]]
pass_at_k, results = code_eval_outputs.compute(references=references, predictions=candidates, output=expected_outputs, k=[1, 2])
print(pass_at_k)
print(results)
```
Output:

```python
{'pass@1': 0.5, 'pass@2': 1.0}
defaultdict(list,
            {0: [(0,
                  {'task_id': 0,
                   'passed': False,
                   'result': 'not passed',
                   'completion_id': 0}),
                 (1,
                  {'task_id': 0,
                   'passed': True,
                   'result': 'passed',
                   'completion_id': 1})]})
```
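The pass@k values above can be reproduced with the unbiased estimator used by the original `code_eval` metric, `1 - C(n-c, k) / C(n, k)`, where `n` is the number of candidates and `c` the number that pass (assuming this fork keeps the same estimator; its internals are not shown here):

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        # Fewer failing candidates than draws: at least one must pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# In the example above: n=2 candidates, c=1 passing.
print(pass_at_k(2, 1, 1))  # 0.5
print(pass_at_k(2, 1, 2))  # 1.0
```

With two candidates of which one passes, a single draw succeeds half the time (pass@1 = 0.5), while drawing both guarantees the passing one is included (pass@2 = 1.0), matching the output shown.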
N.B.