---
title: nDCG
emoji: 👁
colorFrom: red
colorTo: blue
sdk: gradio
sdk_version: 3.9.1
app_file: app.py
pinned: false
license: mit
tags:
- evaluate
- metric
- ranking
description: >-
  The Discounted Cumulative Gain (DCG) is a measure of ranking quality.
  It is used to evaluate information retrieval systems under the following 2 assumptions:
  1. Highly relevant documents/labels are more useful when they appear earlier in the results.
  2. Documents/labels are relevant to different degrees.
  DCG is defined as the sum of the relevances of the retrieved documents, discounted
  logarithmically by the position at which they were retrieved.
  The Normalized DCG (nDCG) divides this value by the best possible value, yielding a score between
  0 and 1 such that a perfect retrieval achieves an nDCG of 1.
---
# Metric Card for nDCG
## Metric Description
The Discounted Cumulative Gain (DCG) is a measure of ranking quality.
It is used to evaluate information retrieval systems under 2 assumptions:
1. Highly relevant documents/labels are more useful when they appear earlier in the results.
2. Documents/labels are relevant to different degrees.

DCG is defined as the sum of the relevances of the retrieved documents, discounted logarithmically by the position at which they were retrieved.
The Normalized DCG (nDCG) divides this value by the best value that could have been achieved, yielding a score between
0 and 1 such that a perfect retrieval achieves an nDCG of 1.0.
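For reference, a common formulation of these quantities (the one used by default in scikit-learn's `ndcg_score`, cited at the bottom of this card) is, with `rel_i` denoting the true relevance of the item ranked at position `i`:
```latex
\mathrm{DCG@}k = \sum_{i=1}^{k} \frac{rel_i}{\log_2(i + 1)}
\qquad\qquad
\mathrm{nDCG@}k = \frac{\mathrm{DCG@}k}{\mathrm{IDCG@}k}
```
where `IDCG@k` is the `DCG@k` of the ideal ranking, i.e. the documents sorted by their true relevance.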
## How to Use
At minimum, this metric takes as input two `list`s of `list`s of `float`s: predictions and references.
```python
import evaluate
nDCG_metric = evaluate.load('JP-SystemsX/nDCG')
results = nDCG_metric.compute(references=[[0, 1]], predictions=[[0, 1]])
print(results)
["{'nDCG@2': 1.0}"]
```
### Inputs:
- **references** (`list` of `float`): True relevance scores.
- **predictions** (`list` of `float`): Predicted relevance scores, probability estimates, or confidence values.
- **k** (`int`): If set, only the `k` highest scores in the ranking are considered; otherwise all outputs are considered. Defaults to `None`.
- **sample_weight** (`list` of `float`): Sample weights. Defaults to `None`.
- **ignore_ties** (`boolean`): If set to `True`, assumes that there are no ties (this is likely if predictions are continuous) for efficiency gains. Defaults to `False`.
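As a sketch of how these parameters fit together (the relevance values, weights, and the output key shown in the comment are illustrative assumptions, not guaranteed outputs):
```python
import evaluate

# Load the metric from the Hugging Face Hub.
nDCG_metric = evaluate.load("JP-SystemsX/nDCG")

# Two samples; each inner list gives the relevance/score of every candidate.
references = [[10, 0, 0, 1, 5],
              [3, 2, 0, 0, 1]]
predictions = [[.1, .2, .3, 4, 70],
               [60, 5, .3, .2, .1]]

results = nDCG_metric.compute(
    references=references,
    predictions=predictions,
    k=3,                       # score only the top-3 ranked items per sample
    sample_weight=[0.5, 1.0],  # weight the per-sample scores before averaging
    ignore_ties=True,          # assume no ties for a small efficiency gain
)
print(results)  # e.g. {'nDCG@3': ...}
```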
### Output:
- **normalized_discounted_cumulative_gain** (`float`): The nDCG score averaged over all samples. The minimum possible value is 0.0 and the maximum possible value is 1.0.

This metric outputs a dictionary containing the nDCG score; the key reflects whether `k` was set.

Output example(s):
```python
{'nDCG@5': 1.0}
{'nDCG': 0.876}
```
### Examples:
Example 1: A simple example.
```python
>>> nDCG_metric = evaluate.load("JP-SystemsX/nDCG")
>>> results = nDCG_metric.compute(references=[[10, 0, 0, 1, 5]], predictions=[[.1, .2, .3, 4, 70]])
>>> print(results)
{'nDCG': 0.6956940443813076}
```
Example 2: The same as Example 1, except with `k` set to 3.
```python
>>> nDCG_metric = evaluate.load("JP-SystemsX/nDCG")
>>> results = nDCG_metric.compute(references=[[10, 0, 0, 1, 5]], predictions=[[.1, .2, .3, 4, 70]], k=3)
>>> print(results)
{'nDCG@3': 0.4123818817534531}
```
Example 3: There is only one relevant label, but the predictions contain a tie and the model cannot decide which candidate it is.
```python
>>> nDCG_metric = evaluate.load("JP-SystemsX/nDCG")
>>> results = nDCG_metric.compute(references=[[1, 0, 0, 0, 0]], predictions=[[1, 1, 0, 0, 0]], k=1)
>>> print(results)
{'nDCG@1': 0.5}
>>> # That is, the score is computed for both tied candidates and their average is returned.
```
Example 4: The same as Example 3, except `ignore_ties` is set to `True`.
```python
>>> nDCG_metric = evaluate.load("JP-SystemsX/nDCG")
>>> results = nDCG_metric.compute(references=[[1, 0, 0, 0, 0]], predictions=[[1, 1, 0, 0, 0]], k=1, ignore_ties=True)
>>> print(results)
{'nDCG@1': 0.0}
>>> # Alternative result: {'nDCG@1': 1.0}
>>> # That is, one of the 2 tied candidates is chosen and the score is computed for it alone,
>>> # so the result may vary depending on which candidate was chosen.
```
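Since the citation below points to scikit-learn, the tie-handling behaviour above can presumably be reproduced directly with `sklearn.metrics.ndcg_score`; a minimal sketch under that assumption:
```python
from sklearn.metrics import ndcg_score

references = [[1, 0, 0, 0, 0]]
predictions = [[1, 1, 0, 0, 0]]

# Default tie handling averages over the tied candidates -> 0.5 (cf. Example 3).
print(ndcg_score(references, predictions, k=1))

# ignore_ties=True picks one candidate arbitrarily -> 0.0 or 1.0 (cf. Example 4).
print(ndcg_score(references, predictions, k=1, ignore_ties=True))
```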
## Citation(s)
```bibtex
@article{scikit-learn,
title={Scikit-learn: Machine Learning in {P}ython},
author={Pedregosa, F. and Varoquaux, G. and Gramfort, A. and Michel, V.
and Thirion, B. and Grisel, O. and Blondel, M. and Prettenhofer, P.
and Weiss, R. and Dubourg, V. and Vanderplas, J. and Passos, A. and
Cournapeau, D. and Brucher, M. and Perrot, M. and Duchesnay, E.},
journal={Journal of Machine Learning Research},
volume={12},
pages={2825--2830},
year={2011}
}
```