michaelfeil committed on
Commit
c256d82
1 Parent(s): b6f0cde

Update README.md

Files changed (1):
  1. README.md +9 -5
README.md CHANGED

````diff
@@ -17,11 +17,9 @@ More details please refer to the Github: [Infinity](https://github.com/michaelfeil/infinity)
 
 
 
-## Usage
-
-### Usage for Embedding Model via infinity
-
-Its also possible to deploy files with the [infinity_emb](https://github.com/michaelfeil/infinity) pip package.
+## Usage for Embedding Model via infinity
+
+To deploy files with the [infinity_emb](https://github.com/michaelfeil/infinity) pip package.
 Recommended is `device="cuda", engine="torch"` with flash attention on gpu, and `device="cpu", engine="optimum"` for onnx inference.
 
 ```python
@@ -30,7 +28,13 @@ from infinity_emb import AsyncEmbeddingEngine, EngineArgs
 
 sentences = ["Embed this is sentence via Infinity.", "Paris is in France."]
 engine = AsyncEmbeddingEngine.from_args(
-  EngineArgs(model_name_or_path = "BAAI/bge-small-en-v1.5", device="cpu", engine="optimum" # or engine="torch"
+  EngineArgs(
+    model_name_or_path = "michaelfeil/bge-small-en-v1.5",
+    device="cpu",
+    # or device="cuda"
+    engine="torch"
+    # or engine="optimum"
+    compile=True # enable torch.compile
 ))
 
 async def main():
````
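Note that the `+` side of the second hunk, as committed, is not valid Python: it is missing commas after `engine="torch"` and before `compile=True`, and the snippet is truncated at `async def main():`. A corrected, complete version might look as follows — a sketch, not the commit's literal content, assuming the `infinity_emb` API shown in the diff (`AsyncEmbeddingEngine`, `EngineArgs`, `engine.embed`) and assuming `compile` is a supported `EngineArgs` field, as the commit suggests.

```python
# Corrected, runnable version of the committed snippet (assumes
# `pip install infinity-emb[torch]`). Commas are added after the
# `device` and `engine` arguments, and `main()` is given a body
# following the usage pattern in the infinity README.
import asyncio

from infinity_emb import AsyncEmbeddingEngine, EngineArgs

sentences = ["Embed this is sentence via Infinity.", "Paris is in France."]
engine = AsyncEmbeddingEngine.from_args(
    EngineArgs(
        model_name_or_path="michaelfeil/bge-small-en-v1.5",
        device="cpu",    # or device="cuda"
        engine="torch",  # or engine="optimum" for ONNX inference
        compile=True,    # enable torch.compile (assumed EngineArgs field)
    )
)

async def main():
    # The async context starts/stops the engine's background batching loop.
    async with engine:
        embeddings, usage = await engine.embed(sentences=sentences)
        print(f"embedded {len(embeddings)} sentences, token usage: {usage}")

asyncio.run(main())
```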