prithivida committed on
Commit
35633b0
1 Parent(s): 89db75b

Update README.md

Files changed (1)
  1. README.md +9 -2
README.md CHANGED
@@ -43,7 +43,7 @@ pipeline_tag: sentence-similarity
 
  - [License and Terms:](#license-and-terms)
  - [Detailed comparison & Our Contribution:](#detailed-comparison--our-contribution)
- - [ONNX & GGUF Variants:](#detailed-comparison--our-contribution)
+ - [ONNX & GGUF Status:](#onnx-gguf-status)
  - [Usage:](#usage)
  - [With Sentence Transformers:](#with-sentence-transformers)
  - [With Huggingface Transformers:](#with-huggingface-transformers)
@@ -90,6 +90,13 @@ Full set of evaluation numbers for our model
 
  <br/>
 
+ # ONNX & GGUF Status:
+
+ |Variant| Status |
+ |:---:|:---:|
+ |FP16 ONNX | ✅ |
+ |GGUF | WIP|
+
  # Usage:
 
  #### With Sentence Transformers:
@@ -149,7 +156,7 @@ for query, query_embedding in zip(queries, query_embeddings):
  # FAQs:
 
  #### How can I reduce overall inference cost ?
- - You can host these models without heavy torch dependency using the ONNX flavours of these models via [FlashRetrieve](https://github.com/PrithivirajDamodaran/FlashRetrieve) library.
+ - You can host these models without heavy torch dependency using the ONNX flavours of these models via [FlashEmbed](https://github.com/PrithivirajDamodaran/flashembed) library.
 
 
  #### How do I reduce vector storage cost ?