Commit 4ecfcd3 by prithivida
1 Parent(s): 6fdd93b

Update README.md

README.md CHANGED
@@ -44,9 +44,9 @@ pipeline_tag: sentence-similarity
  - [With Sentence Transformers:](#with-sentence-transformers)
  - [With Huggingface Transformers:](#with-huggingface-transformers)
  - [FAQs](#faqs)
- - [How can
- - [How do I
- - [How do I offer hybrid search to
+ - [How can I reduce overall inference cost ?](#how-can-i-reduce-overall-inference-cost)
+ - [How do I reduce vector storage cost?](#how-do-i-reduce-vector-storage-cost)
+ - [How do I offer hybrid search to improve accuracy?](#how-do-i-offer-hybrid-search-to-improve-accuracy)
  - [Why not run MTEB?](#why-not-run-mteb)
  - [Roadmap](#roadmap)
  - [Notes on Reproducing:](#notes-on-reproducing)
@@ -144,14 +144,14 @@ for query, query_embedding in zip(queries, query_embeddings):

  # FAQs:

- #### How can
- - You
+ #### How can I reduce overall inference cost ?
+ - You host these models without heavy torch dependency using the ONNX flavours of these models via [FlashRetrieve](https://github.com/PrithivirajDamodaran/FlashRetrieve) library.


- #### How do I
+ #### How do I reduce vector storage cost ?
  [Use Binary and Scalar Quantisation](https://huggingface.co/blog/embedding-quantization)

-
+ #### How do I offer hybrid search to improve accuracy ?
  MIRACL paper shows simply combining BM25 is a good starting point for a Hybrid option: The below numbers are with mDPR model, but miniMiracle_hi_v1 should give a even better hybrid performance.

  | Language | ISO | nDCG@10 BM25 | nDCG@10 mDPR | nDCG@10 Hybrid |