You can load the model with the transformers library as a conditional generation model and read the tokens listed below from its output, or use the MonoT5 implementation from BEIR.

prompt = `Query: {query} Document: {document} Relevant:`

The model returns one of two tokens, Polish for "false" and "true", indicating whether the document is relevant to the query:
``` token_false='▁fałsz', token_true='▁prawda'```
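
As a rough illustration of the transformers route, the sketch below builds the prompt, runs a single decoder step, and compares the logits of the two tokens. It assumes the checkpoint loads with `AutoModelForSeq2SeqLM` and that both tokens exist in its vocabulary; `model_path`, the helper names, and the 512-token truncation limit are placeholders chosen for this example, not values taken from the model card.

```python
import math


def build_prompt(query: str, document: str) -> str:
    """Format a query-document pair using the prompt template above."""
    return f"Query: {query} Document: {document} Relevant:"


def relevance_probability(logit_false: float, logit_true: float) -> float:
    """Two-way softmax: probability mass assigned to the 'true' token."""
    m = max(logit_false, logit_true)  # subtract the max for numerical stability
    e_false = math.exp(logit_false - m)
    e_true = math.exp(logit_true - m)
    return e_true / (e_false + e_true)


def score_pair(model_path: str, query: str, document: str,
               token_false: str = "▁fałsz", token_true: str = "▁prawda") -> float:
    """Score one pair with the model (hypothetical helper; needs torch and transformers)."""
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_path)
    model.eval()

    false_id = tokenizer.convert_tokens_to_ids(token_false)
    true_id = tokenizer.convert_tokens_to_ids(token_true)

    # max_length=512 is an assumed limit, not a documented one.
    inputs = tokenizer(build_prompt(query, document), return_tensors="pt",
                       truncation=True, max_length=512)
    # Only the first generated token matters, so feed just the decoder start token.
    decoder_input_ids = torch.full((1, 1), model.config.decoder_start_token_id,
                                   dtype=torch.long)
    with torch.no_grad():
        logits = model(**inputs, decoder_input_ids=decoder_input_ids).logits[0, 0]
    return relevance_probability(logits[false_id].item(), logits[true_id].item())
```

Calling `score_pair(model_path, query, document)` would return a value between 0 and 1, where higher means more relevant. Reloading the model on every call is wasteful; in practice you would load it once and score pairs in batches.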


A MonoT5 implementation is included in the BEIR benchmark (https://github.com/beir-cellar/beir):
```
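from beir.reranking.models import MonoT5
from beir.reranking import Rerank

# Replace the placeholders with your own data: lists of dicts with
# 'id' and 'text' keys (documents also need a 'title').
queries = YOUR_QUERIES
corpus = YOUR_CORPUS
# Convert to the dict formats BEIR expects.
queries = {query['id']: query['text'] for query in queries}
corpus = {doc['id']: {'title': doc['title'], 'text': doc['text']} for doc in corpus}

# model_path: name or local path of this model.
# results: first-stage retrieval output, {query_id: {doc_id: score}}.
cross_encoder_model = MonoT5(model_path, use_amp=False, token_false='▁fałsz', token_true='▁prawda')
reranker = Rerank(cross_encoder_model, batch_size=100)

# Rerank the top 100 first-stage candidates for each query.
rerank_results = reranker.rerank(corpus, queries, results, top_k=100)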
```
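
`rerank_results` has the usual BEIR shape, a dict mapping each query id to a `{doc_id: score}` dict. A minimal sketch of reading the best documents for one query from it follows; `top_documents` and the toy scores are made up for illustration.

```python
from typing import Dict, List, Tuple


def top_documents(rerank_results: Dict[str, Dict[str, float]],
                  query_id: str, k: int = 10) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (doc_id, score) pairs for one query."""
    scores = rerank_results.get(query_id, {})
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Toy scores standing in for real reranker output.
example_results = {"q1": {"d1": 0.12, "d2": 0.93, "d3": 0.48}}
print(top_documents(example_results, "q1", k=2))  # [('d2', 0.93), ('d3', 0.48)]
```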