John Graham Reynolds committed on
Commit dbc5be0
1 Parent(s): 217c111

update conditional

Files changed (1)
  1. app.py +20 -10
app.py CHANGED
@@ -15,26 +15,36 @@ Check out the original, longstanding issue [here](https://github.com/huggingface
  evaluation. I encountered this issue specifically while training [RoBERTa-base-DReiFT](https://huggingface.co/MarioBarbeque/RoBERTa-base-DReiFT) for multilabel \
  text classification of 805 labeled medical conditions based on drug reviews. \n

- This Space shows how one can instantiate these custom metrics each with their own unique methodology for averaging across labels, combine them into a single
- HF `evaluate.EvaluationModule` (or `Metric`), and compute them.</p>
+ This Space shows how one can instantiate these custom `evaluate.Metric`s, each with their own unique methodology for averaging across labels, before `combine`-ing them into a
+ HF `evaluate.CombinedEvaluations` object. From here, we can easily compute each of the metrics simultaneously using `compute`.</p>
  """

- article = "<p style='text-align: center'>Check out the [original repo](https://github.com/johngrahamreynolds/FixedMetricsForHF) housing this code, and a quickly \
- trained [multilabel text classification model](https://github.com/johngrahamreynolds/RoBERTa-base-DReiFT/tree/main) that makes use of it during evaluation.</p>"
+ article = """<p style='text-align: center'>Check out the [original repo](https://github.com/johngrahamreynolds/FixedMetricsForHF) housing this code, and a quickly \
+ trained [multilabel text classification model](https://github.com/johngrahamreynolds/RoBERTa-base-DReiFT/tree/main) that makes use of it during evaluation.</p>"""

- def evaluation(predictions, metrics) -> str:
+ def evaluation(predictions, metrics) -> str:

-     f1 = FixedF1(average=metrics.loc[metrics["Metric"] == "f1"]["Averaging Type"][0])
-     precision = FixedPrecision(average=metrics.loc[metrics["Metric"] == "precision"]["Averaging Type"][0])
-     recall = FixedRecall(average=metrics.loc[metrics["Metric"] == "recall"]["Averaging Type"][0])
-     combined = evaluate.combine([f1, recall, precision])
+     metric_set = set(metrics["Metric"].to_list())
+     combined_list = []
+
+     if "f1" in metric_set:
+         f1 = FixedF1(average=metrics.loc[metrics["Metric"] == "f1"]["Averaging Type"][0])
+         combined_list.append(f1)
+     if "precision" in metric_set:
+         precision = FixedPrecision(average=metrics.loc[metrics["Metric"] == "precision"]["Averaging Type"][0])
+         combined_list.append(precision)
+     if "recall" in metric_set:
+         recall = FixedRecall(average=metrics.loc[metrics["Metric"] == "recall"]["Averaging Type"][0])
+         combined_list.append(recall)
+
+     combined = evaluate.combine(combined_list)

      df = predictions.get_dataframe()
      predicted = df["Predicted Label"].to_list()
      references = df["Actual Label"].to_list()

      combined.add_batch(prediction=predicted, reference=references)
-     outputs = combined.compute()
+     outputs = combined.compute()

      return "Your metrics are as follows: \n" + outputs
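
For readers skimming the diff, the sketch below replays what the updated conditional does outside the Space: only the metrics a user actually selects are instantiated, each frozen to its own averaging strategy, and the survivors are handed to `evaluate.combine`. This is a minimal illustration rather than the Space's exact code; the Fixed* classes are assumed to come from the linked FixedMetricsForHF repo (the import path here is hypothetical), and the `add_batch` call uses the plural `predictions=` / `references=` keyword names of the stock `evaluate` API.

```python
# Minimal sketch of the new conditional flow (not the Space's exact code).
# FixedF1 / FixedPrecision / FixedRecall come from the FixedMetricsForHF repo;
# the import path below is hypothetical -- adjust it to wherever the classes live.
import evaluate
from fixed_metrics import FixedF1, FixedPrecision, FixedRecall  # hypothetical module

# Example user selection: metric name -> averaging strategy over the 805 labels.
selection = {"f1": "weighted", "recall": "macro"}  # illustrative values only

builders = {"f1": FixedF1, "precision": FixedPrecision, "recall": FixedRecall}

# Instantiate only the requested metrics, each with its own `average`,
# then combine them into a single CombinedEvaluations object.
chosen = [builders[name](average=avg) for name, avg in selection.items()]
combined = evaluate.combine(chosen)

# Toy batch; the stock `evaluate` API spells these keywords in the plural.
combined.add_batch(predictions=[0, 2, 1, 1], references=[0, 1, 1, 2])
print(combined.compute())  # e.g. {"f1": ..., "recall": ...}
```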
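
A small aside on the `metrics.loc[...]["Averaging Type"][0]` pattern in the new body: if the `metrics` table arrives with a default RangeIndex, the trailing `[0]` is a label lookup rather than a positional one, so it only resolves when the selected metric happens to sit in row 0. The one-liner below shows the position-based spelling; whether it is needed depends on how the Space actually builds the table.

```python
# Position-based lookup of the chosen averaging strategy for one metric,
# independent of which row of the `metrics` DataFrame the metric occupies.
avg = metrics.loc[metrics["Metric"] == "precision", "Averaging Type"].iloc[0]
```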
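
Finally, `compute()` on a combined module returns a plain Python dict keyed by metric name. Since `evaluation` is annotated to return a string, a formatting step along these lines could turn that dict into display text; this is just one possible rendering, not necessarily what the Space does.

```python
# One way to render the combined results as text; `outputs` is a dict such as
# {"f1": 0.71, "recall": 0.64}, so it needs explicit formatting before being
# joined onto a string.
def format_outputs(outputs: dict) -> str:
    lines = [f"{name}: {value}" for name, value in outputs.items()]
    return "Your metrics are as follows: \n" + "\n".join(lines)
```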