michaelfeil committed on
Commit
c256d82
1 Parent(s): b6f0cde

Update README.md

Files changed (1):
  1. README.md +9 -5
README.md CHANGED

````diff
@@ -17,11 +17,9 @@ More details please refer to the Github: [Infinity](https://github.com/michaelfeil/infinity)
 
 
 
-## Usage
-
-### Usage for Embedding Model via infinity
-
-Its also possible to deploy files with the [infinity_emb](https://github.com/michaelfeil/infinity) pip package.
+## Usage for Embedding Model via infinity
+
+To deploy files with the [infinity_emb](https://github.com/michaelfeil/infinity) pip package.
 Recommended is `device="cuda", engine="torch"` with flash attention on gpu, and `device="cpu", engine="optimum"` for onnx inference.
 
 ```python
@@ -30,7 +28,13 @@ from infinity_emb import AsyncEmbeddingEngine, EngineArgs
 
 sentences = ["Embed this is sentence via Infinity.", "Paris is in France."]
 engine = AsyncEmbeddingEngine.from_args(
-  EngineArgs(model_name_or_path = "BAAI/bge-small-en-v1.5", device="cpu", engine="optimum" # or engine="torch"
+  EngineArgs(
+    model_name_or_path = "michaelfeil/bge-small-en-v1.5",
+    device="cpu",
+    # or device="cuda"
+    engine="torch"
+    # or engine="optimum"
+    compile=True # enable torch.compile
 ))
 
 async def main():
````
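Note that the `+` side of the second hunk, as committed, is not valid Python: it is missing commas after `engine="torch"` and before `compile=True`, and the snippet is truncated at `async def main():`. A corrected, complete version might look as follows — a sketch, not the commit's literal content, assuming the `infinity_emb` API shown in the diff (`AsyncEmbeddingEngine`, `EngineArgs`, `engine.embed`) and assuming `compile` is a supported `EngineArgs` field, as the commit suggests.

```python
# Corrected, runnable version of the committed snippet (assumes
# `pip install infinity-emb[torch]`). Commas are added after the
# `device` and `engine` arguments, and `main()` is given a body
# following the usage pattern in the infinity README.
import asyncio

from infinity_emb import AsyncEmbeddingEngine, EngineArgs

sentences = ["Embed this is sentence via Infinity.", "Paris is in France."]
engine = AsyncEmbeddingEngine.from_args(
    EngineArgs(
        model_name_or_path="michaelfeil/bge-small-en-v1.5",
        device="cpu",    # or device="cuda"
        engine="torch",  # or engine="optimum" for ONNX inference
        compile=True,    # enable torch.compile (assumed EngineArgs field)
    )
)

async def main():
    # The async context starts/stops the engine's background batching loop.
    async with engine:
        embeddings, usage = await engine.embed(sentences=sentences)
        print(f"embedded {len(embeddings)} sentences, token usage: {usage}")

asyncio.run(main())
```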