---
title: TREC Eval
emoji: 🤗 
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 3.19.1
app_file: app.py
pinned: false
tags:
- evaluate
- metric
description: >-
  The TREC Eval metric combines a number of information retrieval metrics such as precision and nDCG. It is used to score rankings of retrieved documents with reference values.
---

# Metric Card for TREC Eval

## Metric Description

The TREC Eval metric combines a number of information retrieval metrics such as precision and normalized Discounted Cumulative Gain (nDCG). It scores rankings of retrieved documents (runs) against reference relevance judgments (qrels).

## How to Use
```python
from evaluate import load
trec_eval = load("trec_eval")
# `run` and `qrel` are dicts in the formats described under Inputs below.
results = trec_eval.compute(predictions=[run], references=[qrel])
```

### Inputs
- **predictions** *(dict): a single retrieval run.*
    - **query** *(int): Query ID.*
    - **q0** *(str): Literal `"q0"`.*
    - **docid** *(str): Document ID.*
    - **rank** *(int): Rank of document.*
    - **score** *(float): Score of document.*
    - **system** *(str): Tag for current run.*
- **references** *(dict): a single qrel.*
    - **query** *(int): Query ID.*
    - **q0** *(str): Literal `"q0"`.*
    - **docid** *(str): Document ID.*
    - **rel** *(int): Relevance of document.*
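
The usage examples on this card pass each field as a list with one entry per row (a column-wise layout). As a minimal sketch of how such a run dict could be assembled from raw retrieval scores, the hypothetical helper `to_run_dict` below groups `(query, docid, score)` tuples by query and ranks them by descending score. The helper name, the default `system` tag, and the choice of 0-based ranks (taken from the minimal example on this card) are assumptions for illustration, not part of the metric's API.

```python
from collections import defaultdict


def to_run_dict(scored_docs, system="my_run"):
    """Hypothetical helper: turn (query_id, doc_id, score) tuples into a column-wise run dict."""
    by_query = defaultdict(list)
    for query_id, doc_id, score in scored_docs:
        by_query[query_id].append((doc_id, score))

    run = {"query": [], "q0": [], "docid": [], "rank": [], "score": [], "system": []}
    for query_id in sorted(by_query):
        # Highest score first; ranks start at 0 (an assumption based on this card's example).
        ranked = sorted(by_query[query_id], key=lambda pair: pair[1], reverse=True)
        for rank, (doc_id, score) in enumerate(ranked):
            run["query"].append(query_id)
            run["q0"].append("q0")
            run["docid"].append(doc_id)
            run["rank"].append(rank)
            run["score"].append(float(score))
            run["system"].append(system)
    return run


run = to_run_dict([(0, "doc_1", 1.2), (0, "doc_2", 1.5)], system="test")
```

Every list in the resulting dict has the same length, one entry per retrieved document.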

### Output Values
- **runid** *(str): Run name.*  
- **num_ret** *(int): Number of retrieved documents.*  
- **num_rel** *(int): Number of relevant documents.*  
- **num_rel_ret** *(int): Number of retrieved relevant documents.*  
- **num_q** *(int): Number of queries.*  
- **map** *(float): Mean average precision.*  
- **gm_map** *(float): Geometric mean average precision.*  
- **bpref** *(float): Binary preference score.*  
- **Rprec** *(float): Precision@R, where R is the number of relevant documents.*  
- **recip_rank** *(float): Reciprocal rank.*  
- **P@k** *(float): Precision@k (k in [5, 10, 15, 20, 30, 100, 200, 500, 1000]).*  
- **NDCG@k** *(float): nDCG@k (k in [5, 10, 15, 20, 30, 100, 200, 500, 1000]).*  
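
To make two of these values concrete, here is a minimal pure-Python sketch of precision@k and reciprocal rank for a single query. It is an illustration under stated assumptions, not the code the metric runs: the function names are hypothetical, any document listed as relevant counts as a hit, and precision@k divides by k even when fewer than k documents were retrieved.

```python
def precision_at_k(ranked_docids, relevant_docids, k):
    """Fraction of the top-k positions that hold a relevant document (divides by k)."""
    hits = sum(1 for docid in ranked_docids[:k] if docid in relevant_docids)
    return hits / k


def reciprocal_rank(ranked_docids, relevant_docids):
    """1 / position of the first relevant document (1-based), or 0.0 if none is retrieved."""
    for position, docid in enumerate(ranked_docids, start=1):
        if docid in relevant_docids:
            return 1.0 / position
    return 0.0


ranked = ["doc_2", "doc_1"]  # retrieved documents, best first
relevant = {"doc_1"}         # documents judged relevant for this query

p5 = precision_at_k(ranked, relevant, 5)  # 1 hit in 5 positions -> 0.2
rr = reciprocal_rank(ranked, relevant)    # first hit at position 2 -> 0.5
```

The reported values aggregate such per-query scores over all queries in the run.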

### Examples

A minimal example looks as follows:
```python
import evaluate

qrel = {
    "query": [0],
    "q0": ["q0"],
    "docid": ["doc_1"],
    "rel": [2]
}
run = {
    "query": [0, 0],
    "q0": ["q0", "q0"],
    "docid": ["doc_2", "doc_1"],
    "rank": [0, 1],
    "score": [1.5, 1.2],
    "system": ["test", "test"]
}

trec_eval = evaluate.load("trec_eval")
results = trec_eval.compute(references=[qrel], predictions=[run])
print(results["P@5"])
# 0.2
```

A more realistic use case with an example from [`trectools`](https://github.com/joaopalotti/trectools):

```python
import evaluate
import pandas as pd

# Load the qrels; the q0 column is cast to str to match the expected input type.
qrel = pd.read_csv("robust03_qrels.txt", sep=r"\s+", names=["query", "q0", "docid", "rel"])
qrel["q0"] = qrel["q0"].astype(str)
qrel = qrel.to_dict(orient="list")

# Load the run file in the six-column TREC run format.
run = pd.read_csv("input.InexpC2", sep=r"\s+", names=["query", "q0", "docid", "rank", "score", "system"])
run = run.to_dict(orient="list")

trec_eval = evaluate.load("trec_eval")
result = trec_eval.compute(predictions=[run], references=[qrel])
```

```python
result

{'runid': 'InexpC2',
 'num_ret': 100000,
 'num_rel': 6074,
 'num_rel_ret': 3198,
 'num_q': 100,
 'map': 0.22485930431817494,
 'gm_map': 0.10411523825735523,
 'bpref': 0.217511695914079,
 'Rprec': 0.2502547201167236,
 'recip_rank': 0.6646545943335417,
 'P@5': 0.44,
 'P@10': 0.37,
 'P@15': 0.34600000000000003,
 'P@20': 0.30999999999999994,
 'P@30': 0.2563333333333333,
 'P@100': 0.1428,
 'P@200': 0.09510000000000002,
 'P@500': 0.05242,
 'P@1000': 0.03198,
 'NDCG@5': 0.4101480395089769,
 'NDCG@10': 0.3806761417784469,
 'NDCG@15': 0.37819463408955706,
 'NDCG@20': 0.3686080836061317,
 'NDCG@30': 0.352474353427451,
 'NDCG@100': 0.3778329431025776,
 'NDCG@200': 0.4119129817248979,
 'NDCG@500': 0.4585354576461375,
 'NDCG@1000': 0.49092149290805653}
```

## Limitations and Bias
The `trec_eval` metric requires the inputs to be in the TREC run and qrel formats for predictions and references.
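
Because of this format requirement, it can help to check a dict before passing it to `compute`. The sketch below is a hypothetical pre-check, not validation performed by the metric itself: `check_columns`, `RUN_COLUMNS`, and `QREL_COLUMNS` are illustrative names, and the column lists are taken from the Inputs section of this card.

```python
RUN_COLUMNS = ("query", "q0", "docid", "rank", "score", "system")
QREL_COLUMNS = ("query", "q0", "docid", "rel")


def check_columns(table, expected_columns):
    """Hypothetical helper: return a list of problems found (empty if none)."""
    problems = []
    missing = [name for name in expected_columns if name not in table]
    if missing:
        problems.append(f"missing columns: {missing}")
    # All columns that are present should have one entry per row.
    lengths = {name: len(table[name]) for name in expected_columns if name in table}
    if len(set(lengths.values())) > 1:
        problems.append(f"columns differ in length: {lengths}")
    return problems


good_qrel = {"query": [0], "q0": ["q0"], "docid": ["doc_1"], "rel": [2]}
bad_run = {"query": [0, 0], "q0": ["q0"], "docid": ["doc_2", "doc_1"]}

qrel_problems = check_columns(good_qrel, QREL_COLUMNS)  # no problems
run_problems = check_columns(bad_run, RUN_COLUMNS)      # missing columns and a length mismatch
```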


## Citation

```bibtex
@inproceedings{palotti2019,
 author = {Palotti, Joao and Scells, Harrisen and Zuccon, Guido},
 title = {TrecTools: an open-source Python library for Information Retrieval practitioners involved in TREC-like campaigns},
 series = {SIGIR'19},
 year = {2019},
 location = {Paris, France},
 publisher = {ACM}
} 
```

## Further References

- Homepage: https://github.com/joaopalotti/trectools