4.35 GB
17,662 files
Updated 1 day ago
| Name | Size | Xet hash |
|---|---|---|
| README.md | 1.62 kB | 31812dbb |
| run_all_ilps.sh | 4.25 kB | 8044195b |
| run_all_lumi.sh | 4.87 kB | 7fd4c6ab |
| run_crux-mds-duc04.sh | 6.62 kB | 83dcb098 |
| run_neuclir.sh | 6.25 kB | a634045f |
| run_trec-dl-2019.sh | 4.5 kB | aecdadaa |
| run_trec-dl-2020.sh | 4.05 kB | 8e0ef871 |
A Python toy example for using AutoReranker
Relevance-based IR Data
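To match the standard input format of AutoReranker, the example data is organized as three dictionaries: run, queries, and corpus. A fourth dictionary, qrel, holds the relevance judgments used for evaluation. Document IDs ending in * mark the relevant documents, matching the entries in qrel.
Below is an example of how to structure these dictionaries.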
# run: query ID -> {document ID: score} for the candidate documents to rerank
run = {
    "q1": {"d2": 0.95, "d1*": 0.70, "d6": 0.25},
    "q2": {"d4*": 0.88, "d3": 0.73, "d7": 0.20},
    "q3": {"d5*": 0.91, "d8": 0.60, "d9*": 0.40}
}
# queries: query ID -> query text
queries = {
    "q1": "What city is the capital of France?",
    "q2": "Who painted the Mona Lisa?",
    "q3": "√144 equals?"
}
# corpus: document ID -> document text
corpus = {
    "d1*": "Paris is the capital of France.",
    "d2": "London is the capital of the UK.",
    "d3": "Vincent van Gogh painted 'The Starry Night'.",
    "d4*": "The painter of the Mona Lisa was Leonardo da Vinci.",
    "d5*": "The square root of 144 is 12.",
    "d6": "Berlin is the capital of Germany.",
    "d7": "Pablo Picasso painted 'Guernica'.",
    "d8": "The cube root of 27 is 3.",
    "d9*": "12 is the positive solution to √144."
}
# qrel: query ID -> {document ID: relevance label}
qrel = {
    "q1": {"d1*": 1},
    "q2": {"d4*": 1},
    "q3": {"d5*": 1, "d9*": 1}
}
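Because the dictionaries reference each other by ID, it can help to confirm they are consistent before reranking. The following is a minimal sketch in plain Python; `check_inputs` is a hypothetical helper written for this example, not part of AutoReranker, and it assumes only the dictionary layout shown above (demonstrated here on the q1 entries).

```python
def check_inputs(run, queries, corpus, qrel=None):
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    # Every query in the run needs query text, and every candidate needs document text.
    for qid, scored_docs in run.items():
        if qid not in queries:
            problems.append(f"run query {qid!r} is missing from queries")
        for doc_id in scored_docs:
            if doc_id not in corpus:
                problems.append(f"run document {doc_id!r} (query {qid!r}) is missing from corpus")
    # Judged queries and documents should also be known.
    for qid, judged in (qrel or {}).items():
        if qid not in queries:
            problems.append(f"qrel query {qid!r} is missing from queries")
        for doc_id in judged:
            if doc_id not in corpus:
                problems.append(f"qrel document {doc_id!r} (query {qid!r}) is missing from corpus")
    return problems


# The q1 entries from the example data above.
run = {"q1": {"d2": 0.95, "d1*": 0.70, "d6": 0.25}}
queries = {"q1": "What city is the capital of France?"}
corpus = {
    "d1*": "Paris is the capital of France.",
    "d2": "London is the capital of the UK.",
    "d6": "Berlin is the capital of Germany.",
}
qrel = {"q1": {"d1*": 1}}

print(check_inputs(run, queries, corpus, qrel))  # prints []
```

Returning a list of messages rather than raising on the first mismatch makes it easier to see every missing ID at once.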
Initialize a reranker and rerank
Once the data is structured, initialize a ModularReranker with the from_prebuilt method and use it to rerank the documents for each query.
We use the ir_measures library to evaluate the reranked results; RR@5 is the reciprocal rank of the first relevant document within the top 5.
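import ir_measures
from ir_measures import RR
# ModularReranker comes from AutoReranker; its import is not shown in this example.
reranker = ModularReranker.from_prebuilt('rankgpt', 'Qwen/Qwen2.5-7B-Instruct')
reranked_result = reranker.rerank(run=run, queries=queries, corpus=corpus)
print(ir_measures.calc_aggregate([RR@5], qrel, reranked_result))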
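As a reference point, the initial run can be scored with the same metric without loading any model. The sketch below is plain Python and assumes the usual definition of reciprocal rank (1/rank of the first relevant document within the top k, 0 if none appears), averaged over the judged queries. `mean_rr_at_k` is a hypothetical helper for illustration, not part of AutoReranker or ir_measures.

```python
def mean_rr_at_k(run, qrel, k=5):
    """Mean reciprocal rank at cutoff k over the queries that have judgments."""
    total = 0.0
    for qid, relevant in qrel.items():
        scores = run.get(qid, {})
        # Rank documents by descending score and keep the top k.
        ranked = sorted(scores, key=scores.get, reverse=True)[:k]
        for rank, doc_id in enumerate(ranked, start=1):
            if relevant.get(doc_id, 0) > 0:
                total += 1.0 / rank  # only the first relevant hit counts
                break
    return total / len(qrel) if qrel else 0.0


# The initial run and judgments from the example data above.
run = {
    "q1": {"d2": 0.95, "d1*": 0.70, "d6": 0.25},
    "q2": {"d4*": 0.88, "d3": 0.73, "d7": 0.20},
    "q3": {"d5*": 0.91, "d8": 0.60, "d9*": 0.40},
}
qrel = {
    "q1": {"d1*": 1},
    "q2": {"d4*": 1},
    "q3": {"d5*": 1, "d9*": 1},
}

baseline = mean_rr_at_k(run, qrel, k=5)
print(f"RR@5 of the initial run: {baseline:.4f}")  # RR@5 of the initial run: 0.8333
```

In the initial run, q1 ranks its relevant document second (reciprocal rank 0.5), while q2 and q3 rank one first, giving (0.5 + 1 + 1) / 3. A reranker that moves d1* to the top for q1 would raise the score to 1.0.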
- Last updated
- Jun 15