prithivida committed
Commit 0bd3af2
Parent: 835ba8f

Update README.md

Files changed (1): README.md (+4 −4)
README.md CHANGED

@@ -51,7 +51,7 @@ pipeline_tag: sentence-similarity
 - [With Sentence Transformers:](#with-sentence-transformers)
 - [With Huggingface Transformers:](#with-huggingface-transformers)
 - [FAQs](#faqs)
-- [How can I reduce overall inference cost ?](#how-can-i-reduce-overall-inference-cost)
+- [How can I reduce overall inference cost?](#how-can-i-reduce-overall-inference-cost)
 - [How do I reduce vector storage cost?](#how-do-i-reduce-vector-storage-cost)
 - [How do I offer hybrid search to improve accuracy?](#how-do-i-offer-hybrid-search-to-improve-accuracy)
 - [MTEB numbers](#mteb-numbers)
@@ -161,13 +161,13 @@ for query, query_embedding in zip(queries, query_embeddings):
 
 # FAQs:
 
-#### How can I reduce overall inference cost ?
+#### How can I reduce overall inference cost?
 - You can host these models without heavy torch dependency using the ONNX flavours of these models via [FlashEmbed](https://github.com/PrithivirajDamodaran/flashembed) library.
 
-#### How do I reduce vector storage cost ?
+#### How do I reduce vector storage cost?
 [Use Binary and Scalar Quantisation](https://huggingface.co/blog/embedding-quantization)
 
-#### How do I offer hybrid search to improve accuracy ?
+#### How do I offer hybrid search to improve accuracy?
 MIRACL paper shows simply combining BM25 is a good starting point for a Hybrid option:
 The below numbers are with mDPR model, but miniDense_arabic_v1 should give a even better hybrid performance.
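The vector-storage FAQ in the diff points at binary and scalar quantisation. A minimal sketch of the binary variant with NumPy, using hypothetical toy embeddings (not output from this model): threshold each float32 dimension at zero, pack the sign bits, and compare vectors by Hamming distance, which cuts storage by 32x.

```python
import numpy as np

def binary_quantize(embeddings: np.ndarray) -> np.ndarray:
    """Threshold each dimension at 0 and pack 8 dims per byte (32x smaller than float32)."""
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=-1)

def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Number of differing bits between two packed binary vectors."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

# Toy 8-dimensional embeddings (illustrative only; real models use hundreds of dims).
e1 = np.array([0.3, -0.1, 0.7, 0.2, -0.5, 0.9, -0.2, 0.1], dtype=np.float32)
e2 = np.array([0.4, -0.2, 0.6, -0.3, -0.4, 0.8, -0.1, 0.2], dtype=np.float32)

q1, q2 = binary_quantize(e1), binary_quantize(e2)
print(q1.nbytes, e1.nbytes)      # 1 byte packed vs 32 bytes float32
print(hamming_distance(q1, q2))  # the two vectors disagree in sign only at dim 3 -> 1
```

In practice a quantised index is usually paired with a rescoring pass over the full-precision vectors for the top candidates, which recovers most of the lost accuracy.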
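The hybrid-search FAQ suggests combining BM25 with the dense retriever. One common way to merge the two ranked lists is reciprocal rank fusion (an assumption here; the MIRACL numbers the README cites may use a different fusion rule), which scores each document by its rank in every list rather than by raw, incomparable scores:

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked doc-id lists; each list contributes 1/(k + rank) per doc."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical ranked results from the two retrievers.
bm25_ranking = ["d1", "d3", "d2"]
dense_ranking = ["d2", "d1", "d4"]
print(reciprocal_rank_fusion([bm25_ranking, dense_ranking]))
# -> ['d1', 'd2', 'd3', 'd4'] (d1 ranks high in both lists, so it wins)
```

Rank-based fusion sidesteps the need to normalise BM25 and cosine scores onto a common scale, which is why it is a popular default for hybrid setups.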