Spaces: cpllab/syntaxgym

jgauthier committed on Jul 8, 2022

Commit

ea234b8

1 Parent(s): e00b8f2

document accuracy property

Files changed (1)

README.md CHANGED

@@ -105,9 +105,10 @@ overall_accuracy = np.mean(list(suite_accuracies.values()))
 ### Output Values
-The metric returns a dict of `SyntaxGymMetricSuiteResult` tuples, mapping test suite names to test suite performance. Each inner dict has two entries:
-- **prediction_results** (`List[List[bool]]`): For each item in the test suite, a list of booleans indicating whether each corresponding prediction came out `True`. Typically these are combined to yield an accuracy score (see example usage above).
 - **region_totals** (`List[Dict[Tuple[str, int], float]]`): For each item, a mapping from individual region (keys `(<condition_name>, <region_number>)`) to the float-valued total surprisal for tokens in this region. This is useful for visualization, or if you'd like to use the aggregate surprisal data for other tasks (e.g. reading time prediction or neural activity prediction).
 ```python

 ### Output Values
+The metric returns a dict of `SyntaxGymMetricSuiteResult` objects, mapping test suite names to test suite performance. Each inner object has three properties:
+- **accuracy** (`float`): Model accuracy on this suite, i.e. the proportion of items in the suite for which all boolean predictions came out `True` (the per-item conjunction of predictions).
+- **prediction_results** (`List[List[bool]]`): For each item in the test suite, a list of booleans indicating whether each corresponding prediction came out `True`. Typically these are combined to yield an accuracy score (but you can simply use the `accuracy` property).
 - **region_totals** (`List[Dict[Tuple[str, int], float]]`): For each item, a mapping from individual region (keys `(<condition_name>, <region_number>)`) to the float-valued total surprisal for tokens in this region. This is useful for visualization, or if you'd like to use the aggregate surprisal data for other tasks (e.g. reading time prediction or neural activity prediction).
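
The relationship between `prediction_results` and `accuracy` described above can be sketched as follows. This is a minimal illustration, not the metric's actual implementation: `SuiteResult` is a hypothetical stand-in that only mirrors the three documented properties, `conjunction_accuracy` is a helper name introduced here, and the suite name, condition names, and surprisal values are made-up toy data.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SuiteResult:
    """Hypothetical stand-in mirroring the three documented properties."""
    accuracy: float
    prediction_results: List[List[bool]]
    region_totals: List[Dict[Tuple[str, int], float]]


def conjunction_accuracy(prediction_results: List[List[bool]]) -> float:
    """Fraction of items for which every prediction came out True.

    Assumes a non-empty list of items.
    """
    item_correct = [all(item) for item in prediction_results]
    return sum(item_correct) / len(item_correct)


# Toy data (assumed): three items with two predictions each.
predictions = [[True, True], [True, False], [True, True]]

# Toy per-item surprisal totals, keyed by (condition_name, region_number).
region_totals = [
    {("grammatical", 1): 4.2, ("ungrammatical", 1): 6.8},
    {("grammatical", 1): 5.1, ("ungrammatical", 1): 4.9},
    {("grammatical", 1): 3.7, ("ungrammatical", 1): 7.3},
]

result = SuiteResult(
    accuracy=conjunction_accuracy(predictions),
    prediction_results=predictions,
    region_totals=region_totals,
)

# A dict keyed by suite name, shaped like the documented return value.
results = {"toy_suite": result}

# Per-suite accuracies can be read directly from the `accuracy` property.
suite_accuracies = {name: r.accuracy for name, r in results.items()}
overall_accuracy = sum(suite_accuracies.values()) / len(suite_accuracies)
```

Here the second item fails one of its two predictions, so only two of the three items count as correct for the suite.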
 ```python