liuqi6777 committed on
Commit c7ba4e1
1 Parent(s): 059e114

Update README.md

Files changed (1): README.md (+55 -4)
README.md CHANGED
@@ -110,6 +110,30 @@ print(query_vectors)

  Complete working Colab Notebook is [here](https://colab.research.google.com/drive/1-5WGEYPSBNBg-Z0bGFysyvckFuM8imrg)

  ## Evaluation Results

  **TL;DR:** Our Jina-ColBERT achieves retrieval performance competitive with [ColBERTv2](https://huggingface.co/colbert-ir/colbertv2.0) on all benchmarks, and outperforms ColBERTv2 on datasets where documents have longer context.
@@ -157,14 +181,41 @@ We also evaluate the zero-shot performance on datasets where documents have long

  | Jina-ColBERT-v1 | 8192 | 8192 | 83.7 |
  | Jina-embeddings-v2-base-en | 8192 | 8192 | **85.4** |

- \* denotes that we truncate the context length to the length of 512 for document but the query length is still 512.

- **To summarize, Jina-ColBERT achieves the comparable performance with ColBERTv2 on all benchmarks, and outperforms ColBERTv2 on datasets in where documents have longer context length.**

  ## Plans

- - We will evaluate the performance of Jina-ColBERT as a reranker in a retrieval pipeline, and add the usage examples.
- - We are planning to improve the performance of Jina-ColBERT by fine-tuning on more datasets in the future.

  ## Other Models

  Complete working Colab Notebook is [here](https://colab.research.google.com/drive/1-5WGEYPSBNBg-Z0bGFysyvckFuM8imrg)

+ ### Reranking Using ColBERT
+
+ ```python
+ import numpy
+ import torch
+
+ from colbert.infra import ColBERTConfig
+ from colbert.modeling.checkpoint import Checkpoint
+ from colbert.modeling.colbert import colbert_score
+
+ query = ["How to use ColBERT for indexing long documents?"]
+ documents = [
+     "ColBERT is an efficient and effective passage retrieval model.",
+     "Jina-ColBERT is a ColBERT-style model but based on JinaBERT so it can support an 8k context length.",
+     "JinaBERT is a BERT architecture that supports the symmetric bidirectional variant of ALiBi to allow longer sequence length.",
+     "Jina-ColBERT model is trained on the MSMARCO passage ranking dataset, following a training procedure very similar to ColBERTv2's.",
+ ]
+
+ config = ColBERTConfig(query_maxlen=32, doc_maxlen=512)
+ ckpt = Checkpoint("jinaai/jina-colbert-v1-en", colbert_config=config)  # or a local checkpoint path
+ Q = ckpt.queryFromText(query)
+ D = ckpt.docFromText(documents, bsize=32)[0]
+ D_mask = torch.ones(D.shape[:2], dtype=torch.long)
+ scores = colbert_score(Q, D, D_mask).flatten().cpu().numpy().tolist()
+ ranking = numpy.argsort(scores)[::-1]  # document indices, best first
+ print(ranking)
+ ```
+
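For intuition, the MaxSim late-interaction score that `colbert_score` computes can be sketched in plain NumPy (a toy sketch with random embeddings and hypothetical shapes, not the library's implementation):

```python
import numpy as np

def maxsim_score(Q, D, D_mask):
    """ColBERT-style late interaction: for every query token take the best
    matching document token, then sum those maxima over the query tokens."""
    # Q: (q_len, dim); D: (n_docs, d_len, dim); D_mask: (n_docs, d_len)
    sim = np.einsum("qh,ndh->nqd", Q, D)                # token-pair similarities
    sim = np.where(D_mask[:, None, :] == 0, -1e9, sim)  # ignore padded tokens
    return sim.max(axis=-1).sum(axis=-1)                # one score per document

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))        # 4 query token embeddings
D = rng.normal(size=(3, 6, 8))     # 3 documents, 6 tokens each
mask = np.ones((3, 6), dtype=int)
print(maxsim_score(Q, D, mask))    # higher score = better match
```

Because each query token is matched independently against all document tokens, the score degrades gracefully as documents grow longer, which is what makes the long-context comparison above meaningful.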
  ## Evaluation Results

  **TL;DR:** Our Jina-ColBERT achieves retrieval performance competitive with [ColBERTv2](https://huggingface.co/colbert-ir/colbertv2.0) on all benchmarks, and outperforms ColBERTv2 on datasets where documents have longer context.
 
  | Jina-ColBERT-v1 | 8192 | 8192 | 83.7 |
  | Jina-embeddings-v2-base-en | 8192 | 8192 | **85.4** |

+ \* denotes that we truncate the document context length to 512; the query length remains 512.
+
+ **To summarize, Jina-ColBERT achieves retrieval performance comparable to ColBERTv2 on all benchmarks, and outperforms ColBERTv2 on datasets where documents have longer context.**
+
+ ### Reranking Performance
+
+ We evaluate the reranking performance of ColBERTv2 and Jina-ColBERT on BEIR, using BM25 as the first-stage retrieval model. The full evaluation code can be found in [this repo](https://github.com/liuqi6777/eval_reranker).
+
+ In summary, Jina-ColBERT outperforms ColBERTv2 and even achieves performance comparable to some cross-encoders.
+
+ The best model, jina-reranker, will be open-sourced soon!
+
+ | Dataset | BM25 | ColBERTv2 | Jina-ColBERT | MiniLM-L-6-v2 | BGE-reranker-base-v1 | BGE-reranker-large-v1 | Jina-reranker-base-v1 |
+ | --- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
+ | Arguana | 29.99 | 33.42 | 33.95 | 30.67 | 23.26 | 25.42 | 42.59 |
+ | Climate-Fever | 16.51 | 20.66 | 21.87 | 24.70 | 31.60 | 31.98 | 25.49 |
+ | DBPedia | 31.80 | 42.16 | 41.43 | 43.90 | 41.56 | 43.79 | 43.68 |
+ | FEVER | 65.13 | 81.07 | 83.49 | 80.77 | 87.07 | 89.11 | 86.10 |
+ | FiQA | 23.61 | 35.60 | 36.68 | 34.87 | 33.17 | 37.70 | 41.38 |
+ | HotpotQA | 63.30 | 68.84 | 68.62 | 72.65 | 79.04 | 79.98 | 75.61 |
+ | NFCorpus | 33.75 | 36.69 | 36.38 | 36.48 | 32.71 | 36.57 | 37.73 |
+ | NQ | 30.55 | 51.27 | 51.01 | 52.01 | 53.55 | 56.81 | 56.82 |
+ | Quora | 78.86 | 85.18 | 82.75 | 82.45 | 78.44 | 81.06 | 87.31 |
+ | SCIDOCS | 14.90 | 15.39 | 16.67 | 16.28 | 15.06 | 16.84 | 19.56 |
+ | SciFact | 67.89 | 70.23 | 70.95 | 69.53 | 70.62 | 74.14 | 75.01 |
+ | TREC-COVID | 59.47 | 75.00 | 76.89 | 74.45 | 67.46 | 74.32 | 82.09 |
+ | Webis-touche2020 | 44.22 | 32.12 | 32.56 | 28.40 | 34.37 | 35.66 | 31.62 |
+ | Average | 43.08 | 49.82 | 50.25 | 49.78 | 49.84 | 52.57 | **54.23** |
+
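BM25 serves as the first-stage retriever in this two-stage setup. A toy pure-Python version of its scoring function makes the pipeline concrete (an illustrative sketch with made-up documents, not the implementation used in the evaluation):

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    N = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / N
    df = Counter(t for d in docs_tokens for t in set(d))  # document frequency
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        s = 0.0
        for t in query_tokens:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

docs = [
    "colbert is an efficient passage retrieval model".split(),
    "jina colbert supports long context length".split(),
]
print(bm25_scores("long context retrieval".split(), docs))  # one score per document
```

In the evaluated pipeline, the top BM25 candidates would then be rescored by the ColBERT-style reranker.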
 
 
  ## Plans

+ We are planning to improve the performance of Jina-ColBERT by fine-tuning on more datasets in the future.

  ## Other Models