prithivida/miniDense_hindi_v1

prithivida committed on Jun 5

Commit 35633b0 • 1 Parent(s): 89db75b

Update README.md

Files changed (1)

README.md +9 -2

@@ -43,7 +43,7 @@ pipeline_tag: sentence-similarity
 - [License and Terms:](#license-and-terms)
 - [Detailed comparison & Our Contribution:](#detailed-comparison--our-contribution)
-- [ONNX & GGUF Variants:](#detailed-comparison--our-contribution)
+- [ONNX & GGUF Status:](#onnx-gguf-status)
 - [Usage:](#usage)
     - [With Sentence Transformers:](#with-sentence-transformers)
     - [With Huggingface Transformers:](#with-huggingface-transformers)
@@ -90,6 +90,13 @@ Full set of evaluation numbers for our model
 <br/>
+# ONNX & GGUF Status:
+|Variant| Status |
+|:---:|:---:|
+|FP16 ONNX | ✅ |
+|GGUF | WIP|
 # Usage:
 #### With Sentence Transformers:
@@ -149,7 +156,7 @@ for query, query_embedding in zip(queries, query_embeddings):
 # FAQs:
 #### How can I reduce overall inference cost ?
-- You can host these models without heavy torch dependency using the ONNX flavours of these models via [FlashRetrieve](https://github.com/PrithivirajDamodaran/FlashRetrieve) library.
+- You can host these models without heavy torch dependency using the ONNX flavours of these models via [FlashEmbed](https://github.com/PrithivirajDamodaran/flashembed) library.
 #### How do I reduce vector storage cost ?