LightEmbed/baai-llm-embedder-onnx

binhcode25 committed on Jun 17

Commit a3c665b • 1 Parent(s): cc575c2

Add new SentenceTransformer model.

Files changed (3)

README.md +46 -0
model.onnx +2 -2
tokenizer.json +14 -2

README.md ADDED

@@ -0,0 +1,46 @@
+---
+library_name: light-embed
+pipeline_tag: sentence-similarity
+tags:
+- sentence-transformers
+- feature-extraction
+- sentence-similarity
+---
+# baai-llm-embedder-onnx
+This is the ONNX version of the Sentence Transformers model BAAI/llm-embedder for sentence embedding, optimized for speed and a small footprint. It runs on onnxruntime and tokenizers instead of heavier libraries such as sentence-transformers and transformers, which keeps the installed dependencies smaller and execution faster. Model details:
+- Base model: BAAI/llm-embedder
+- Embedding dimension: 768
+- Max sequence length: 512
+- File size on disk: 0.41 GB
+- Pooling incorporated: Yes
+This ONNX model contains all components of the original Sentence Transformers model:
+Transformer, Pooling, Normalize
+<!--- Describe your model here -->
+## Usage (LightEmbed)
+Using this model becomes easy when you have [LightEmbed](https://pypi.org/project/light-embed/) installed:
+```
+pip install -U light-embed
+```
+Then you can use the model like this:
+```python
+from light_embed import TextEmbedding
+sentences = ["This is an example sentence", "Each sentence is converted"]
+model = TextEmbedding('BAAI/llm-embedder')
+embeddings = model.encode(sentences)
+print(embeddings)
+```
+## Citing & Authors
+Binh Nguyen / binhcode25@gmail.com
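
Note on the README above: because the exported graph is described as ending with a Normalize step, the returned embeddings should have unit L2 norm, in which case a plain dot product equals cosine similarity. The following is a minimal pure-Python sketch of that relationship using made-up 3-dimensional vectors, not real model output; `l2_normalize`, `dot`, and `cosine` are hypothetical helper names and are not part of light-embed.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(dot(vec, vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity computed from unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Illustrative stand-ins for two embeddings before the Normalize step.
raw_a = [3.0, 4.0, 0.0]
raw_b = [1.0, 2.0, 2.0]

unit_a = l2_normalize(raw_a)
unit_b = l2_normalize(raw_b)

# For unit vectors, the dot product is the cosine similarity.
similarity = dot(unit_a, unit_b)
print(similarity, cosine(raw_a, raw_b))
```

If the embeddings returned by `model.encode` are indeed normalized, ranking sentences by dot product and by cosine similarity should give the same order.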

model.onnx CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d6a5c5cb2457a07733e95b8a1480560df251127722b572f85687eb773f80e13e
-size 435917615
+oid sha256:e53797c693265afc823292398a7b01cecd75efac0fc6e86a771200e87495abeb
+size 435917673
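
Note on the model.onnx change: the file is stored through Git LFS, so the diff only shows the pointer (hash and byte size), not the weights. A downloaded file can be checked against such a pointer roughly as sketched below. This assumes the pointer format shown in the diff (`version`, `oid sha256:...`, `size`); `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the check is demonstrated on a tiny in-memory blob rather than on the real model file.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash algorithm, digest and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if the bytes have the size and SHA-256 digest named by the pointer."""
    if pointer["algo"] != "sha256":
        raise ValueError("unsupported hash algorithm: " + pointer["algo"])
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# The new pointer from this commit, copied from the diff above.
new_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:e53797c693265afc823292398a7b01cecd75efac0fc6e86a771200e87495abeb\n"
    "size 435917673\n"
)
# Byte difference between the new and the old model file sizes in the diff.
size_delta = new_pointer["size"] - 435917615

# Demonstration on a small blob (the real model file is not needed here).
blob = b"hello"
demo_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:" + hashlib.sha256(blob).hexdigest() + "\n"
    "size " + str(len(blob)) + "\n"
)
print(size_delta, matches_pointer(blob, demo_pointer))
```

For a file as large as model.onnx, hashing in chunks from disk would be preferable to loading all bytes into memory.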

tokenizer.json CHANGED

@@ -1,7 +1,19 @@
 {
   "version": "1.0",
-  "truncation": null,
-  "padding": null,
+  "truncation": {
+    "direction": "Right",
+    "max_length": 256,
+    "strategy": "LongestFirst",
+    "stride": 0
+  },
+  "padding": {
+    "strategy": "BatchLongest",
+    "direction": "Right",
+    "pad_to_multiple_of": null,
+    "pad_id": 0,
+    "pad_type_id": 0,
+    "pad_token": "[PAD]"
+  },
   "added_tokens": [
     {
       "id": 0,
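
Note on the tokenizer.json change: the new settings truncate each input on the right at 256 tokens and pad every sequence in a batch on the right with `pad_id` 0 up to the longest sequence in that batch. As configured here, inputs longer than 256 tokens would be cut, even though the README lists a maximum sequence length of 512. The sketch below only illustrates that behavior on plain lists of token ids. It is not the tokenizers library's implementation, it ignores how special tokens are counted toward the limit, and `truncate_and_pad` is a hypothetical helper name.

```python
def truncate_and_pad(batch, max_length=256, pad_id=0):
    """Right-truncate each id list to max_length, then right-pad to the batch's longest.

    Returns (input_ids, attention_mask), where the mask is 1 for real tokens
    and 0 for padding.
    """
    # "direction": "Right" truncation keeps the start and drops tokens from the end.
    truncated = [ids[:max_length] for ids in batch]
    # "strategy": "BatchLongest" pads to the longest sequence in this batch.
    longest = max((len(ids) for ids in truncated), default=0)
    input_ids = [ids + [pad_id] * (longest - len(ids)) for ids in truncated]
    attention_mask = [
        [1] * len(ids) + [0] * (longest - len(ids)) for ids in truncated
    ]
    return input_ids, attention_mask


# Made-up token ids; the third sequence has 300 tokens and gets truncated.
batch = [[101, 7, 8, 102], [101, 9, 102], list(range(1, 301))]
ids, mask = truncate_and_pad(batch)
print(len(ids[0]), len(ids[2]), sum(mask[1]))
```

Because padding is per batch, the padded length depends on the other inputs encoded at the same time and can be anywhere up to 256.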