accuracy / README.md

lvwerra HF staff

Update Space (evaluate main: 828c6327)

7332944 almost 2 years ago

	---
	title: Accuracy
	emoji: 🤗
	colorFrom: blue
	colorTo: red
	sdk: gradio
	sdk_version: 3.0.2
	app_file: app.py
	pinned: false
	tags:
	- evaluate
	- metric
	---

	# Metric Card for Accuracy


	## Metric Description

	Accuracy is the proportion of correct predictions among the total number of cases processed. For binary classification, it can be computed with:

	Accuracy = (TP + TN) / (TP + TN + FP + FN)

	Where:
	- TP: True positives
	- TN: True negatives
	- FP: False positives
	- FN: False negatives
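
To make the formula concrete, here is a minimal sketch in plain Python that counts the four confusion-matrix cells for a binary task and applies the formula. The helper name `binary_accuracy` and the choice of `1` as the positive label are illustrative assumptions, not part of the `evaluate` API.

```python
def binary_accuracy(references, predictions, positive=1):
    """Accuracy from confusion-matrix counts (hypothetical helper, binary labels only)."""
    pairs = list(zip(references, predictions))
    tp = sum(1 for r, p in pairs if r == positive and p == positive)  # true positives
    tn = sum(1 for r, p in pairs if r != positive and p != positive)  # true negatives
    fp = sum(1 for r, p in pairs if r != positive and p == positive)  # false positives
    fn = sum(1 for r, p in pairs if r == positive and p != positive)  # false negatives
    return (tp + tn) / (tp + tn + fp + fn)


score = binary_accuracy(references=[0, 1, 1, 0], predictions=[0, 1, 0, 0])
print(score)  # 0.75 (1 TP + 2 TN out of 4 examples)
```

For multiclass labels the same idea reduces to counting exact matches between predictions and references and dividing by the number of examples.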


	## How to Use

	At minimum, this metric requires predictions and references as inputs.

	```python
	>>> import evaluate
	>>> accuracy_metric = evaluate.load("accuracy")
	>>> results = accuracy_metric.compute(references=[0, 1], predictions=[0, 1])
	>>> print(results)
	{'accuracy': 1.0}
	```


	### Inputs
	- predictions (`list` of `int`): Predicted labels.
	- references (`list` of `int`): Ground truth labels.
	- normalize (`bool`): If set to `False`, returns the number of correctly classified samples. Otherwise, returns the fraction of correctly classified samples. Defaults to `True`.
	- sample_weight (`list` of `float`): Sample weights. Defaults to `None`.
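
The interaction between `normalize` and `sample_weight` can be sketched in plain Python. This is an illustrative re-implementation, assuming the metric follows the semantics of scikit-learn's `accuracy_score` (the library this card cites). `weighted_accuracy` is a hypothetical name, not a function of the `evaluate` library.

```python
def weighted_accuracy(references, predictions, normalize=True, sample_weight=None):
    """Hypothetical sketch of accuracy with optional weights and normalization."""
    if len(references) != len(predictions):
        raise ValueError("references and predictions must have the same length")
    if sample_weight is None:
        # Unweighted case: every example counts equally.
        sample_weight = [1.0] * len(references)

    # Sum the weights of the correctly classified examples.
    correct = sum(
        w for r, p, w in zip(references, predictions, sample_weight) if r == p
    )
    if not normalize:
        return float(correct)  # (weighted) count of correct predictions
    return correct / sum(sample_weight)  # (weighted) fraction of correct predictions


refs = [0, 1, 2, 0, 1, 2]
preds = [0, 1, 1, 2, 1, 0]
print(weighted_accuracy(refs, preds))                   # 0.5
print(weighted_accuracy(refs, preds, normalize=False))  # 3.0
weighted = weighted_accuracy(refs, preds, sample_weight=[0.5, 2, 0.7, 0.5, 9, 0.4])
print(round(weighted, 4))                               # 0.8779
```

In the weighted call, the three correct predictions carry weights 0.5, 2, and 9, so the score is 11.5 divided by the total weight of 13.1.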


	### Output Values
	- accuracy (`float` or `int`): Accuracy score. Minimum possible value is 0. Maximum possible value is 1.0 if `normalize` is set to `True`, or the number of input examples if `normalize` is set to `False`. A higher score means higher accuracy.

	Output Example(s):
	```python
	{'accuracy': 1.0}
	```

	This metric outputs a dictionary containing the accuracy score under the key `accuracy`.


	#### Values from Popular Papers

	Top-1 or top-5 accuracy is often used to report performance on supervised classification tasks such as image classification (e.g. on [ImageNet](https://paperswithcode.com/sota/image-classification-on-imagenet)) or sentiment analysis (e.g. on [IMDB](https://paperswithcode.com/sota/text-classification-on-imdb)).
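
Top-k accuracy counts a prediction as correct when the true label is among the k highest-scoring classes. The sketch below is plain Python for illustration only: `top_k_accuracy` and the per-class score lists are assumptions made here, and they are not inputs accepted by this metric, which takes a single predicted label per example.

```python
def top_k_accuracy(references, scores, k=1):
    """Hypothetical helper: fraction of examples whose true label is in the top k scores."""
    hits = 0
    for label, row in zip(references, scores):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(references)


# Made-up scores for 4 examples over 3 classes.
example_scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.1, 0.3],
]
example_refs = [1, 1, 0, 0]

print(top_k_accuracy(example_refs, example_scores, k=1))  # 0.5
print(top_k_accuracy(example_refs, example_scores, k=2))  # 0.75
```

With k=1 this reduces to ordinary accuracy over the highest-scoring class.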


	### Examples

	Example 1: A simple example.
	```python
	>>> import evaluate
	>>> accuracy_metric = evaluate.load("accuracy")
	>>> results = accuracy_metric.compute(references=[0, 1, 2, 0, 1, 2], predictions=[0, 1, 1, 2, 1, 0])
	>>> print(results)
	{'accuracy': 0.5}
	```

	Example 2: The same as Example 1, except with `normalize` set to `False`.
	```python
	>>> import evaluate
	>>> accuracy_metric = evaluate.load("accuracy")
	>>> results = accuracy_metric.compute(references=[0, 1, 2, 0, 1, 2], predictions=[0, 1, 1, 2, 1, 0], normalize=False)
	>>> print(results)
	{'accuracy': 3.0}
	```

	Example 3: The same as Example 1, except with `sample_weight` set.
	```python
	>>> import evaluate
	>>> accuracy_metric = evaluate.load("accuracy")
	>>> results = accuracy_metric.compute(references=[0, 1, 2, 0, 1, 2], predictions=[0, 1, 1, 2, 1, 0], sample_weight=[0.5, 2, 0.7, 0.5, 9, 0.4])
	>>> print(results)
	{'accuracy': 0.8778625954198473}
	```


	## Limitations and Bias
	This metric can easily be misleading, especially when classes are imbalanced. High accuracy may mean that a model performs well, but on imbalanced data it may also mean that the model labels only the high-frequency class correctly. In such cases, a more detailed analysis of the model's behavior, or a different metric entirely, is necessary to determine how well the model is actually performing.
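
The following plain-Python sketch illustrates the point with made-up data: a classifier that always predicts the majority class reaches high accuracy while never identifying a single minority example. The 95/5 class split is an assumption chosen for illustration.

```python
# Made-up imbalanced dataset: 95 examples of class 0, 5 of class 1.
imbalanced_refs = [0] * 95 + [1] * 5
# A degenerate "model" that always predicts the majority class.
majority_preds = [0] * 100

matches = sum(r == p for r, p in zip(imbalanced_refs, majority_preds))
majority_accuracy = matches / len(imbalanced_refs)

# Recall on the minority class: fraction of class-1 examples predicted as class 1.
minority_hits = sum(
    1 for r, p in zip(imbalanced_refs, majority_preds) if r == 1 and p == 1
)
minority_recall = minority_hits / imbalanced_refs.count(1)

print(majority_accuracy)  # 0.95
print(minority_recall)    # 0.0
```

Reporting a per-class measure such as recall alongside accuracy exposes this failure mode.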


	## Citation(s)
	```bibtex
	@article{scikit-learn,
	title={Scikit-learn: Machine Learning in {P}ython},
	author={Pedregosa, F. and Varoquaux, G. and Gramfort, A. and Michel, V.
	and Thirion, B. and Grisel, O. and Blondel, M. and Prettenhofer, P.
	and Weiss, R. and Dubourg, V. and Vanderplas, J. and Passos, A. and
	Cournapeau, D. and Brucher, M. and Perrot, M. and Duchesnay, E.},
	journal={Journal of Machine Learning Research},
	volume={12},
	pages={2825--2830},
	year={2011}
	}
	```


	## Further References