John Graham Reynolds committed on
Commit
41d497e
1 Parent(s): bcbab79

update predictions and add examples

Files changed (1)
  1. app.py +6 -4
app.py CHANGED
@@ -14,7 +14,7 @@ Check out the original, longstanding issue [here](https://github.com/huggingface
 `evaluate.combine()` multiple metrics related to multilabel text classification. Particularly, one cannot `combine` the `f1`, `precision`, and `recall` scores for \
 evaluation. I encountered this issue specifically while training [RoBERTa-base-DReiFT](https://huggingface.co/MarioBarbeque/RoBERTa-base-DReiFT) for multilabel \
 text classification of 805 labeled medical conditions based on drug reviews. The [following workaround](https://github.com/johngrahamreynolds/FixedMetricsForHF) was
-congifured. \n
+created to address this. \n
 
 This Space shows how one can instantiate these custom `evaluate.Metric`s, each with their own unique methodology for averaging across labels, before `combine`-ing them into a
 HF `evaluate.CombinedEvaluations` object. From here, we can easily compute each of the metrics simultaneously using `compute`.</p>
@@ -43,7 +43,7 @@ def evaluation(predictions, metrics) -> str:
     predicted = [int(num) for num in predictions["Predicted Class Label"].to_list()]
     references = [int(num) for num in predictions["Actual Class Label"].to_list()]
 
-    combined.add(prediction=predicted, reference=references)
+    combined.add_batch(predictions=predicted, references=references)
     outputs = combined.compute()
 
     return "Your metrics are as follows: \n" + outputs
@@ -96,8 +96,10 @@ space = gr.Interface(
     description=description,
     article=article,
     examples=[
-        [[[1,1],[1,0],[2,0],[1,2],[2,2]], [["f1", "weighted"], ["precision", "micro"], ["recall", "weighted"]]],
-        # [[["precision", "micro"], ["recall", "weighted"], ["f1", "macro"]]],
+        [
+            [[1,1], [1,0], [2,0], [1,2], [2,2]],
+            [["f1", "weighted"], ["precision", "micro"], ["recall", "weighted"]]
+        ]
     ]
     cache_examples=False
 ).launch()
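
For context, here is a minimal sketch of the `evaluate` pattern this commit touches: combine several metrics, stage a whole batch of predictions and references with `add_batch` (the call the commit switches to from per-example `add`), and score everything with one `compute`. The sketch uses the stock `f1`, `precision`, and `recall` modules rather than the custom `evaluate.Metric` subclasses from the linked FixedMetricsForHF repo, and it reads the Gradio example rows `[[1,1],[1,0],[2,0],[1,2],[2,2]]` as `[Predicted, Actual]` pairs; both are assumptions, not the Space's exact code.

```python
import evaluate

# Stock metric modules; the Space instead instantiates custom subclasses from
# the FixedMetricsForHF repo, each of which bakes in its own averaging method.
f1 = evaluate.load("f1")
precision = evaluate.load("precision")
recall = evaluate.load("recall")

# Assumed reading of the Gradio example rows [[1,1],[1,0],[2,0],[1,2],[2,2]]
# as [Predicted Class Label, Actual Class Label] pairs.
predicted = [1, 1, 2, 1, 2]
references = [1, 0, 0, 2, 2]

# Combine the three metrics into one CombinedEvaluations object, stage the
# whole batch at once with add_batch, then score everything in a single call.
combined = evaluate.combine([f1, precision, recall])
combined.add_batch(predictions=predicted, references=references)

# With stock metrics only one shared `average` can be forwarded to all three;
# per-metric averaging (e.g. weighted f1 but micro precision, as in the
# example above) is what the custom metrics provide.
print(combined.compute(average="weighted"))
```

Staging data with `add_batch` rather than per-example `add` also matches how the Space's `evaluation()` function works: it pulls entire "Predicted Class Label" and "Actual Class Label" columns into parallel lists before computing.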