Spaces: g8a9/ferret
elianap committed on Aug 1, 2022

Commit bb4d707

1 Parent(s): ca954b1

Update corpus.py (#2)

- Update corpus.py (ea3c6860db669ae4c5b2ea183561c25ea2730492)

Co-authored-by: Eliana Pastor <elianap@users.noreply.huggingface.co>

Files changed (1)

corpus.py CHANGED

@@ -118,12 +118,20 @@ def body():
             """
             **Legend**
             - **AOPC Comprehensiveness** (aopc_compr) measures *comprehensiveness*, i.e., if the explanation captures all the tokens needed to make the prediction. Higher is better.
             - **AOPC Sufficiency** (aopc_suff) measures *sufficiency*, i.e., if the relevant tokens in the explanation are sufficient to make the prediction. Lower is better.
             - **Leave-On-Out TAU Correlation** (taucorr_loo) measures the Kendall rank correlation coefficient τ between the explanation and leave-one-out importances. Closer to 1 is better.
             See the paper for details.
             """
         )

             """
             **Legend**
+             **Faithfulness**
             - **AOPC Comprehensiveness** (aopc_compr) measures *comprehensiveness*, i.e., if the explanation captures all the tokens needed to make the prediction. Higher is better.
             - **AOPC Sufficiency** (aopc_suff) measures *sufficiency*, i.e., if the relevant tokens in the explanation are sufficient to make the prediction. Lower is better.
             - **Leave-On-Out TAU Correlation** (taucorr_loo) measures the Kendall rank correlation coefficient τ between the explanation and leave-one-out importances. Closer to 1 is better.
+            **Plausibility**
+             - **AUPRC plausibility** (auprc_plau) is the area under the precision-recall curve (AUPRC) of the explanation and the rationale as ground truth. Higher is better.
+             - **Intersection-Over-Union (IOU)** (token_iou_plau) is the size of the overlap of the most relevant tokens of the explanation and the human rationale divided by the size of their union. Higher is better.
+             - **Token-level F1 score** (token_f1_plau) measures the F1 score among the most relevant tokens and the human rationale. Higher is better.
             See the paper for details.
             """
         )
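
The two token-level plausibility metrics added to the legend can be sketched in a few lines. This is a minimal illustration, not the code used by this Space: the helper names (`top_k_tokens`, `token_iou`, `token_f1`) are hypothetical, and selecting the "most relevant tokens" as the top-k by score is an assumption, since the diff does not say how that selection is made.

```python
def top_k_tokens(scores, k):
    """Return the indices of the k highest-scoring tokens (assumed selection rule)."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def token_iou(selected, rationale):
    """Size of the overlap divided by the size of the union of the two token sets."""
    union = selected | rationale
    return len(selected & rationale) / len(union) if union else 0.0


def token_f1(selected, rationale):
    """Harmonic mean of precision and recall of selected tokens against the rationale."""
    overlap = len(selected & rationale)
    if overlap == 0:
        return 0.0
    precision = overlap / len(selected)
    recall = overlap / len(rationale)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: five tokens, humans marked tokens 1 and 2 as the rationale.
scores = [0.1, 0.7, 0.05, 0.6, 0.2]
rationale = {1, 2}

selected = top_k_tokens(scores, k=2)   # tokens 1 and 3 have the highest scores
iou = token_iou(selected, rationale)   # overlap {1}, union {1, 2, 3}
f1 = token_f1(selected, rationale)     # precision 1/2, recall 1/2
```

In this example one of the two selected tokens matches the rationale, so the IOU is 1/3 and the F1 score is 0.5. Both metrics reach 1 only when the selected tokens and the human rationale coincide, which is why the legend marks them as "Higher is better".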