Commit c5e53d3 by michaelfeil
Parent: c256d82

Update README.md

Files changed (1):
  1. README.md +18 -4
README.md CHANGED
@@ -12,16 +12,21 @@ language:

 <h1 align="center">Infinity Embedding Model</h1>

-More details please refer to the Github: [Infinity](https://github.com/michaelfeil/infinity).
+This is the stable default model for infinity.

+```bash
+pip install infinity_emb[all]
+```

+For more details about the infinity inference project, please refer to the GitHub repository: [Infinity](https://github.com/michaelfeil/infinity).


-## Usage for Embedding Model via infinity
+## Usage for Embedding Model via infinity in Python

 To deploy this model, use the [infinity_emb](https://github.com/michaelfeil/infinity) pip package.
 Recommended: `device="cuda", engine="torch"` with flash attention on GPU, and `device="cpu", engine="optimum"` for ONNX inference.

+
 ```python
 import asyncio
 from infinity_emb import AsyncEmbeddingEngine, EngineArgs
@@ -30,8 +35,8 @@ sentences = ["Embed this is sentence via Infinity.", "Paris is in France."]
 engine = AsyncEmbeddingEngine.from_args(
     EngineArgs(
         model_name_or_path="michaelfeil/bge-small-en-v1.5",
-        device="cpu",
-        # or device="cuda"
+        device="cuda",
+        # or device="cpu"
         engine="torch",
         # or engine="optimum"
         compile=True,  # enable torch.compile
@@ -43,6 +48,15 @@ async def main():
 asyncio.run(main())
 ```

+## CLI interface
+
+The same engine arguments are available via the CLI:
+
+```bash
+pip install infinity_emb
+infinity_emb --model-name-or-path michaelfeil/bge-small-en-v1.5 --port 7997
+```
+

 ## Contact
 If you have any question or suggestion related to this project, feel free to open an issue or pull request.
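The diff shows the Python snippet only in part; the hunk headers indicate it also defines `sentences` and an `async def main()`. For reference, a minimal end-to-end sketch of the same usage pattern is given below. The `async with engine:` context manager and the `(embeddings, usage)` return value of `engine.embed()` are assumptions about the infinity_emb API, not lines taken from this diff.

```python
import asyncio

from infinity_emb import AsyncEmbeddingEngine, EngineArgs

sentences = ["Embed this is sentence via Infinity.", "Paris is in France."]

# Mirrors the EngineArgs shown in the diff above.
engine = AsyncEmbeddingEngine.from_args(
    EngineArgs(
        model_name_or_path="michaelfeil/bge-small-en-v1.5",
        device="cuda",  # or device="cpu"
        engine="torch",  # or engine="optimum" for ONNX inference
        compile=True,  # enable torch.compile
    )
)


async def main():
    # Assumption: the engine is started/stopped via an async context manager
    # and embed() returns the embeddings together with token-usage info.
    async with engine:
        embeddings, usage = await engine.embed(sentences=sentences)
        print(f"{len(embeddings)} embeddings, {usage} tokens processed")


asyncio.run(main())
```

Swapping the commented `device`/`engine` values gives the CPU/ONNX configuration recommended above for machines without a GPU.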
 
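The CLI command in the last hunk starts an HTTP server on port 7997. Below is a minimal sketch of querying that server from Python; the OpenAI-compatible `POST /embeddings` route and the response shape are assumptions about the infinity server rather than something this diff documents, so check the running server's API docs for the exact schema.

```python
# Sketch: query the server started by
#   infinity_emb --model-name-or-path michaelfeil/bge-small-en-v1.5 --port 7997
# Assumes an OpenAI-compatible POST /embeddings route and response format.
import requests

response = requests.post(
    "http://localhost:7997/embeddings",
    json={
        "model": "michaelfeil/bge-small-en-v1.5",
        "input": ["Embed this is sentence via Infinity.", "Paris is in France."],
    },
    timeout=30,
)
response.raise_for_status()
payload = response.json()
print(f"{len(payload['data'])} embeddings returned")
```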