John Graham Reynolds committed
Commit 41d497e
Parent(s): bcbab79

update predictions and add examples
app.py CHANGED

@@ -14,7 +14,7 @@ Check out the original, longstanding issue [here](https://github.com/huggingface
 `evaluate.combine()` multiple metrics related to multilabel text classification. Particularly, one cannot `combine` the `f1`, `precision`, and `recall` scores for \
 evaluation. I encountered this issue specifically while training [RoBERTa-base-DReiFT](https://huggingface.co/MarioBarbeque/RoBERTa-base-DReiFT) for multilabel \
 text classification of 805 labeled medical conditions based on drug reviews. The [following workaround](https://github.com/johngrahamreynolds/FixedMetricsForHF) was
-
+created to address this. \n

 This Space shows how one can instantiate these custom `evaluate.Metric`s, each with their own unique methodology for averaging across labels, before `combine`-ing them into a
 HF `evaluate.CombinedEvaluations` object. From here, we can easily compute each of the metrics simultaneously using `compute`.</p>
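Aside, for readers of this commit: the description edited above hinges on the fact that each of `f1`, `precision`, and `recall` wants its own `average` argument in the multiclass setting, and a module built with plain `evaluate.combine(["f1", "precision", "recall"])` cannot hand a different averaging method to each member. The custom `evaluate.Metric` subclasses in the linked workaround fix those averaging choices up front so the metrics can be `combine`d. A minimal sketch of the per-metric averaging with only the stock `evaluate` metrics (toy labels, illustrative only, not drawn from the Space):

```python
import evaluate

# Toy multiclass labels, illustrative only.
predictions = [0, 1, 2, 1, 2]
references  = [0, 1, 2, 2, 2]

# Each stock metric takes its own `average` kwarg at compute time; a combined
# module has no way to route a different value to each member, which is the
# limitation the Space's description refers to.
f1 = evaluate.load("f1")
precision = evaluate.load("precision")
recall = evaluate.load("recall")

scores = {
    "f1 (weighted)": f1.compute(predictions=predictions, references=references, average="weighted")["f1"],
    "precision (micro)": precision.compute(predictions=predictions, references=references, average="micro")["precision"],
    "recall (weighted)": recall.compute(predictions=predictions, references=references, average="weighted")["recall"],
}
print(scores)
```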
@@ -43,7 +43,7 @@ def evaluation(predictions, metrics) -> str:
     predicted = [int(num) for num in predictions["Predicted Class Label"].to_list()]
     references = [int(num) for num in predictions["Actual Class Label"].to_list()]

-    combined.
+    combined.add_batch(predictions=predicted, references=references)
     outputs = combined.compute()

     return "Your metrics are as follows: \n" + outputs
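Aside: the `add_batch`/`compute` pattern completed by the fixed line above is the standard incremental-evaluation flow in `evaluate`. A self-contained sketch of that flow on a single stock metric (`accuracy` is used here only so the snippet runs without the custom combined module); note that `compute()` returns a `dict`, so the sketch converts it with `str()` before building the message string:

```python
import evaluate

# Incremental evaluation: accumulate (prediction, reference) batches, then
# compute once at the end. The Space applies this same flow to its combined
# f1/precision/recall module.
metric = evaluate.load("accuracy")

batches = [
    ([1, 0, 2], [1, 1, 2]),   # (predicted, reference) label batches, toy values
    ([1, 2],    [1, 2]),
]
for predicted, references in batches:
    metric.add_batch(predictions=predicted, references=references)

outputs = metric.compute()   # e.g. {"accuracy": 0.8}
print("Your metrics are as follows: \n" + str(outputs))   # compute() returns a dict
```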
@@ -96,8 +96,10 @@ space = gr.Interface(
     description=description,
     article=article,
     examples=[
-        [
-
+        [
+            [[1,1], [1,0], [2,0], [1,2], [2,2]],
+            [["f1", "weighted"], ["precision", "micro"], ["recall", "weighted"]]
+        ]
     ]
     cache_examples=False
 ).launch()
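Aside: the `examples` entry added in this hunk supplies one value per Gradio input component, in order. The sketch below shows how such an entry would line up with two `gr.Dataframe` inputs; the component choices, the metrics-table headers, and the stub `evaluation` body are assumptions for illustration, not taken from app.py (only the "Predicted Class Label" / "Actual Class Label" column names appear in the diff above).

```python
import gradio as gr

def evaluation(predictions, metrics) -> str:
    # Stub standing in for the Space's real metric computation.
    return f"Received {len(predictions)} label rows and {len(metrics)} metric rows."

gr.Interface(
    fn=evaluation,
    inputs=[
        gr.Dataframe(headers=["Predicted Class Label", "Actual Class Label"], label="Class labels"),
        gr.Dataframe(headers=["Metric", "Averaging"], label="Metrics"),  # headers assumed
    ],
    outputs="text",
    # One example row; its two entries populate the two Dataframe inputs above.
    examples=[
        [
            [[1, 1], [1, 0], [2, 0], [1, 2], [2, 2]],
            [["f1", "weighted"], ["precision", "micro"], ["recall", "weighted"]],
        ]
    ],
    cache_examples=False,
).launch()
```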