Update README.md
Commit 35633b0 (1 parent: 89db75b), committed by prithivida
README.md CHANGED
@@ -43,7 +43,7 @@ pipeline_tag: sentence-similarity

 - [License and Terms:](#license-and-terms)
 - [Detailed comparison & Our Contribution:](#detailed-comparison--our-contribution)
-- [ONNX & GGUF
+- [ONNX & GGUF Status:](#onnx-gguf-status)
 - [Usage:](#usage)
 - [With Sentence Transformers:](#with-sentence-transformers)
 - [With Huggingface Transformers:](#with-huggingface-transformers)
@@ -90,6 +90,13 @@ Full set of evaluation numbers for our model

 <br/>

+# ONNX & GGUF Status:
+
+|Variant| Status |
+|:---:|:---:|
+|FP16 ONNX | ✅ |
+|GGUF | WIP|
+
 # Usage:

 #### With Sentence Transformers:
@@ -149,7 +156,7 @@ for query, query_embedding in zip(queries, query_embeddings):
 # FAQs:

 #### How can I reduce overall inference cost ?
-- You can host these models without heavy torch dependency using the ONNX flavours of these models via [
+- You can host these models without heavy torch dependency using the ONNX flavours of these models via [FlashEmbed](https://github.com/PrithivirajDamodaran/flashembed) library.


 #### How do I reduce vector storage cost ?