lvwerra (HF staff) committed
Commit d1d9d67
1 Parent(s): e7ef37f

Update Space (evaluate main: 828c6327)

Files changed (4)
  1. README.md +108 -5
  2. app.py +6 -0
  3. pearsonr.py +107 -0
  4. requirements.txt +4 -0
README.md CHANGED
@@ -1,12 +1,115 @@
  ---
- title: Pearsonr
- emoji: 💩
- colorFrom: green
- colorTo: yellow
  sdk: gradio
  sdk_version: 3.0.2
  app_file: app.py
  pinned: false
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces#reference
  ---
+ title: Pearson Correlation Coefficient
+ emoji: 🤗
+ colorFrom: blue
+ colorTo: red
  sdk: gradio
  sdk_version: 3.0.2
  app_file: app.py
  pinned: false
+ tags:
+ - evaluate
+ - metric
  ---

+ # Metric Card for Pearson Correlation Coefficient (pearsonr)
+
+
+ ## Metric Description
+
+ Pearson correlation coefficient and p-value for testing non-correlation.
+ The Pearson correlation coefficient measures the linear relationship between two datasets. The calculation of the p-value relies on the assumption that each dataset is normally distributed. Like other correlation coefficients, this one varies between -1 and +1 with 0 implying no correlation. Correlations of -1 or +1 imply an exact linear relationship. Positive correlations imply that as x increases, so does y. Negative correlations imply that as x increases, y decreases.
+ The p-value roughly indicates the probability of an uncorrelated system producing datasets that have a Pearson correlation at least as extreme as the one computed from these datasets.
+
+
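For reference, the coefficient described above is the standard sample Pearson coefficient; the formula below is added context rather than part of the original card:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```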
+ ## How to Use
+
+ This metric takes a list of predictions and a list of references as input:
+
+ ```python
+ >>> pearsonr_metric = evaluate.load("pearsonr")
+ >>> results = pearsonr_metric.compute(predictions=[10, 9, 2.5, 6, 4], references=[1, 2, 3, 4, 5])
+ >>> print(round(results['pearsonr'], 2))
+ -0.74
+ ```
+
+
+ ### Inputs
+ - **predictions** (`list` of `float`): Predicted values, as returned by a model.
+ - **references** (`list` of `float`): Ground truth values.
+ - **return_pvalue** (`boolean`): If `True`, returns the p-value along with the correlation coefficient. If `False`, returns only the correlation coefficient. Defaults to `False`.
+
+
+ ### Output Values
+ - **pearsonr** (`float`): Pearson correlation coefficient. Minimum possible value is -1. Maximum possible value is 1. Values of 1 and -1 indicate exact linear positive and negative relationships, respectively. A value of 0 implies no correlation.
+ - **p-value** (`float`): P-value, which roughly indicates the probability of an uncorrelated system producing datasets that have a Pearson correlation at least as extreme as the one computed from these datasets. Minimum possible value is 0. Maximum possible value is 1. Higher values indicate higher probabilities.
+
+ Like other correlation coefficients, this one varies between -1 and +1 with 0 implying no correlation. Correlations of -1 or +1 imply an exact linear relationship. Positive correlations imply that as x increases, so does y. Negative correlations imply that as x increases, y decreases.
+
+ Output Example(s):
+ ```python
+ {'pearsonr': -0.7}
+ ```
+ ```python
+ {'p-value': 0.15}
+ ```
+
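Note that when `return_pvalue=True`, `compute` returns both keys in a single dictionary; the values here are taken from Example 2 below:

```python
{'pearsonr': -0.74, 'p-value': 0.15}
```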
+ #### Values from Popular Papers
+
+ ### Examples
+
+ Example 1: A simple example using only predictions and references.
+ ```python
+ >>> pearsonr_metric = evaluate.load("pearsonr")
+ >>> results = pearsonr_metric.compute(predictions=[10, 9, 2.5, 6, 4], references=[1, 2, 3, 4, 5])
+ >>> print(round(results['pearsonr'], 2))
+ -0.74
+ ```
+
+ Example 2: The same as Example 1, but also returning the `p-value`.
+ ```python
+ >>> pearsonr_metric = evaluate.load("pearsonr")
+ >>> results = pearsonr_metric.compute(predictions=[10, 9, 2.5, 6, 4], references=[1, 2, 3, 4, 5], return_pvalue=True)
+ >>> print(sorted(list(results.keys())))
+ ['p-value', 'pearsonr']
+ >>> print(round(results['pearsonr'], 2))
+ -0.74
+ >>> print(round(results['p-value'], 2))
+ 0.15
+ ```
+
+
+ ## Limitations and Bias
+
+ As stated above, the calculation of the p-value relies on the assumption that each dataset is normally distributed. This is not always the case, so verifying the true distribution of each dataset is recommended.
+
+
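One way to probe that assumption before trusting the p-value is a normality test. The sketch below uses `scipy.stats.shapiro` (suitable for small samples); the choice of test is illustrative, not something the metric prescribes:

```python
from scipy.stats import shapiro

predictions = [10, 9, 2.5, 6, 4]
references = [1, 2, 3, 4, 5]

# Shapiro-Wilk test: a small p-value suggests the sample is unlikely
# to come from a normal distribution, in which case the pearsonr
# p-value should be interpreted with caution.
for name, data in [("predictions", predictions), ("references", references)]:
    stat, p = shapiro(data)
    print(f"{name}: W={stat:.3f}, p={p:.3f}")
```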
+ ## Citation(s)
+ ```bibtex
+ @article{2020SciPy-NMeth,
+   author  = {Virtanen, Pauli and Gommers, Ralf and Oliphant, Travis E. and
+              Haberland, Matt and Reddy, Tyler and Cournapeau, David and
+              Burovski, Evgeni and Peterson, Pearu and Weckesser, Warren and
+              Bright, Jonathan and {van der Walt}, St{\'e}fan J. and
+              Brett, Matthew and Wilson, Joshua and Millman, K. Jarrod and
+              Mayorov, Nikolay and Nelson, Andrew R. J. and Jones, Eric and
+              Kern, Robert and Larson, Eric and Carey, C J and
+              Polat, {\.I}lhan and Feng, Yu and Moore, Eric W. and
+              {VanderPlas}, Jake and Laxalde, Denis and Perktold, Josef and
+              Cimrman, Robert and Henriksen, Ian and Quintero, E. A. and
+              Harris, Charles R. and Archibald, Anne M. and
+              Ribeiro, Ant{\^o}nio H. and Pedregosa, Fabian and
+              {van Mulbregt}, Paul and {SciPy 1.0 Contributors}},
+   title   = {{{SciPy} 1.0: Fundamental Algorithms for Scientific
+              Computing in Python}},
+   journal = {Nature Methods},
+   year    = {2020},
+   volume  = {17},
+   pages   = {261--272},
+   adsurl  = {https://rdcu.be/b08Wh},
+   doi     = {10.1038/s41592-019-0686-2},
+ }
+ ```
+
+
+ ## Further References
app.py ADDED
@@ -0,0 +1,6 @@
+ import evaluate
+ from evaluate.utils import launch_gradio_widget
+
+
+ module = evaluate.load("pearsonr")
+ launch_gradio_widget(module)
pearsonr.py ADDED
@@ -0,0 +1,107 @@
+ # Copyright 2021 The HuggingFace Datasets Authors and the current dataset script contributor.
+ #
+ # Licensed under the Apache License, Version 2.0 (the "License");
+ # you may not use this file except in compliance with the License.
+ # You may obtain a copy of the License at
+ #
+ #     http://www.apache.org/licenses/LICENSE-2.0
+ #
+ # Unless required by applicable law or agreed to in writing, software
+ # distributed under the License is distributed on an "AS IS" BASIS,
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ # See the License for the specific language governing permissions and
+ # limitations under the License.
+ """Pearson correlation coefficient metric."""
+
+ import datasets
+ from scipy.stats import pearsonr
+
+ import evaluate
+
+
+ _DESCRIPTION = """
+ Pearson correlation coefficient and p-value for testing non-correlation.
+ The Pearson correlation coefficient measures the linear relationship between two datasets. The calculation of the p-value relies on the assumption that each dataset is normally distributed. Like other correlation coefficients, this one varies between -1 and +1 with 0 implying no correlation. Correlations of -1 or +1 imply an exact linear relationship. Positive correlations imply that as x increases, so does y. Negative correlations imply that as x increases, y decreases.
+ The p-value roughly indicates the probability of an uncorrelated system producing datasets that have a Pearson correlation at least as extreme as the one computed from these datasets.
+ """
+
+
+ _KWARGS_DESCRIPTION = """
+ Args:
+     predictions (`list` of `float`): Predicted values, as returned by a model.
+     references (`list` of `float`): Ground truth values.
+     return_pvalue (`boolean`): If `True`, returns the p-value along with the correlation coefficient. If `False`, returns only the correlation coefficient. Defaults to `False`.
+
+ Returns:
+     pearsonr (`float`): Pearson correlation coefficient. Minimum possible value is -1. Maximum possible value is 1. Values of 1 and -1 indicate exact linear positive and negative relationships, respectively. A value of 0 implies no correlation.
+     p-value (`float`): P-value, which roughly indicates the probability of an uncorrelated system producing datasets that have a Pearson correlation at least as extreme as the one computed from these datasets. Minimum possible value is 0. Maximum possible value is 1. Higher values indicate higher probabilities.
+
+ Examples:
+
+     Example 1: A simple example using only predictions and references.
+         >>> pearsonr_metric = evaluate.load("pearsonr")
+         >>> results = pearsonr_metric.compute(predictions=[10, 9, 2.5, 6, 4], references=[1, 2, 3, 4, 5])
+         >>> print(round(results['pearsonr'], 2))
+         -0.74
+
+     Example 2: The same as Example 1, but also returning the `p-value`.
+         >>> pearsonr_metric = evaluate.load("pearsonr")
+         >>> results = pearsonr_metric.compute(predictions=[10, 9, 2.5, 6, 4], references=[1, 2, 3, 4, 5], return_pvalue=True)
+         >>> print(sorted(list(results.keys())))
+         ['p-value', 'pearsonr']
+         >>> print(round(results['pearsonr'], 2))
+         -0.74
+         >>> print(round(results['p-value'], 2))
+         0.15
+ """
+
+
+ _CITATION = """
+ @article{2020SciPy-NMeth,
+   author  = {Virtanen, Pauli and Gommers, Ralf and Oliphant, Travis E. and
+              Haberland, Matt and Reddy, Tyler and Cournapeau, David and
+              Burovski, Evgeni and Peterson, Pearu and Weckesser, Warren and
+              Bright, Jonathan and {van der Walt}, St{\'e}fan J. and
+              Brett, Matthew and Wilson, Joshua and Millman, K. Jarrod and
+              Mayorov, Nikolay and Nelson, Andrew R. J. and Jones, Eric and
+              Kern, Robert and Larson, Eric and Carey, C J and
+              Polat, Ilhan and Feng, Yu and Moore, Eric W. and
+              {VanderPlas}, Jake and Laxalde, Denis and Perktold, Josef and
+              Cimrman, Robert and Henriksen, Ian and Quintero, E. A. and
+              Harris, Charles R. and Archibald, Anne M. and
+              Ribeiro, Antonio H. and Pedregosa, Fabian and
+              {van Mulbregt}, Paul and {SciPy 1.0 Contributors}},
+   title   = {{{SciPy} 1.0: Fundamental Algorithms for Scientific
+              Computing in Python}},
+   journal = {Nature Methods},
+   year    = {2020},
+   volume  = {17},
+   pages   = {261--272},
+   adsurl  = {https://rdcu.be/b08Wh},
+   doi     = {10.1038/s41592-019-0686-2},
+ }
+ """
+
+
+ @evaluate.utils.file_utils.add_start_docstrings(_DESCRIPTION, _KWARGS_DESCRIPTION)
+ class Pearsonr(evaluate.EvaluationModule):
+     def _info(self):
+         return evaluate.EvaluationModuleInfo(
+             description=_DESCRIPTION,
+             citation=_CITATION,
+             inputs_description=_KWARGS_DESCRIPTION,
+             features=datasets.Features(
+                 {
+                     "predictions": datasets.Value("float"),
+                     "references": datasets.Value("float"),
+                 }
+             ),
+             reference_urls=["https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.pearsonr.html"],
+         )
+
+     def _compute(self, predictions, references, return_pvalue=False):
+         # scipy.stats.pearsonr returns (correlation, p-value); Pearson
+         # correlation is symmetric, so the argument order does not matter.
+         if return_pvalue:
+             results = pearsonr(references, predictions)
+             return {"pearsonr": results[0], "p-value": results[1]}
+         else:
+             return {"pearsonr": float(pearsonr(references, predictions)[0])}
requirements.txt ADDED
@@ -0,0 +1,4 @@
+ # TODO: fix github to release
+ git+https://github.com/huggingface/evaluate.git@b6e6ed7f3e6844b297bff1b43a1b4be0709b9671
+ datasets~=2.0
+ scipy