Commit c5e53d3 by michaelfeil
Parent: c256d82

Update README.md

Files changed (1):
  1. README.md +18 -4
README.md CHANGED
@@ -12,16 +12,21 @@ language:

 <h1 align="center">Infinity Embedding Model</h1>

-More details please refer to the Github: [Infinity](https://github.com/michaelfeil/infinity).
+This is the stable default model for infinity.

+```bash
+pip install infinity_emb[all]
+```

+For more details about the infinity inference project, please refer to the GitHub repository: [Infinity](https://github.com/michaelfeil/infinity).


-## Usage for Embedding Model via infinity
+## Usage for Embedding Model via infinity in Python

 To deploy this model, use the [infinity_emb](https://github.com/michaelfeil/infinity) pip package.
 Recommended: `device="cuda", engine="torch"` with flash attention on GPU, and `device="cpu", engine="optimum"` for ONNX inference.

+
 ```python
 import asyncio
 from infinity_emb import AsyncEmbeddingEngine, EngineArgs
@@ -30,8 +35,8 @@ sentences = ["Embed this is sentence via Infinity.", "Paris is in France."]
 engine = AsyncEmbeddingEngine.from_args(
     EngineArgs(
         model_name_or_path="michaelfeil/bge-small-en-v1.5",
-        device="cpu",
-        # or device="cuda"
+        device="cuda",
+        # or device="cpu"
         engine="torch",
         # or engine="optimum"
         compile=True,  # enable torch.compile
@@ -43,6 +48,15 @@ async def main():
 asyncio.run(main())
 ```

+## CLI interface
+
+The same engine arguments are available via the CLI:
+
+```bash
+pip install infinity_emb
+infinity_emb --model-name-or-path michaelfeil/bge-small-en-v1.5 --port 7997
+```
+

 ## Contact
 If you have any question or suggestion related to this project, feel free to open an issue or pull request.
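The diff shows the Python snippet only in part; the hunk headers indicate it also defines `sentences` and an `async def main()`. For reference, a minimal end-to-end sketch of the same usage pattern is given below. The `async with engine:` context manager and the `(embeddings, usage)` return value of `engine.embed()` are assumptions about the infinity_emb API, not lines taken from this diff.

```python
import asyncio

from infinity_emb import AsyncEmbeddingEngine, EngineArgs

sentences = ["Embed this is sentence via Infinity.", "Paris is in France."]

# Mirrors the EngineArgs shown in the diff above.
engine = AsyncEmbeddingEngine.from_args(
    EngineArgs(
        model_name_or_path="michaelfeil/bge-small-en-v1.5",
        device="cuda",  # or device="cpu"
        engine="torch",  # or engine="optimum" for ONNX inference
        compile=True,  # enable torch.compile
    )
)


async def main():
    # Assumption: the engine is started/stopped via an async context manager
    # and embed() returns the embeddings together with token-usage info.
    async with engine:
        embeddings, usage = await engine.embed(sentences=sentences)
        print(f"{len(embeddings)} embeddings, {usage} tokens processed")


asyncio.run(main())
```

Swapping the commented `device`/`engine` values gives the CPU/ONNX configuration recommended above for machines without a GPU.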
 
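The CLI command in the last hunk starts an HTTP server on port 7997. Below is a minimal sketch of querying that server from Python; the OpenAI-compatible `POST /embeddings` route and the response shape are assumptions about the infinity server rather than something this diff documents, so check the running server's API docs for the exact schema.

```python
# Sketch: query the server started by
#   infinity_emb --model-name-or-path michaelfeil/bge-small-en-v1.5 --port 7997
# Assumes an OpenAI-compatible POST /embeddings route and response format.
import requests

response = requests.post(
    "http://localhost:7997/embeddings",
    json={
        "model": "michaelfeil/bge-small-en-v1.5",
        "input": ["Embed this is sentence via Infinity.", "Paris is in France."],
    },
    timeout=30,
)
response.raise_for_status()
payload = response.json()
print(f"{len(payload['data'])} embeddings returned")
```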