Infinity usage for reranking. Implements a Cohere-compatible API.
#10, opened by michaelfeil
README.md CHANGED
@@ -126,6 +126,14 @@ with torch.no_grad():
 # tensor([1.2315, 0.5923, 0.3041])
 ```
 
+Usage with infinity:
+
+[Infinity](https://github.com/michaelfeil/infinity), a MIT Licensed Inference RestAPI Server.
+```
+docker run --gpus all -v $PWD/data:/app/.cache -p "7997":"7997" \
+michaelf34/infinity:0.0.68 \
+v2 --model-id Alibaba-NLP/gte-multilingual-reranker-base --revision "main" --dtype bfloat16 --batch-size 32 --device cuda --engine torch --port 7997
+```
 
 ## Evaluation
 
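
The added README text starts the server but does not show a request against it. A minimal client sketch follows, under assumptions this diff does not confirm: that the container is reachable at `http://localhost:7997`, that it exposes a `POST /rerank` route without a URL prefix, and that requests and responses use Cohere-style field names (`query`, `documents`, `results`, `index`, `relevance_score`). The helper names are hypothetical and only serve the illustration.

```python
import json
from urllib import request

# Assumptions (not stated in the diff): host, port mapping, and route.
BASE_URL = "http://localhost:7997"
MODEL_ID = "Alibaba-NLP/gte-multilingual-reranker-base"


def build_rerank_payload(query, documents, model=MODEL_ID):
    """Build a Cohere-style rerank request body (field names are assumed)."""
    return {"model": model, "query": query, "documents": list(documents)}


def rank_documents(response_json, documents):
    """Pair each document with its score, highest score first.

    Assumes the response holds a "results" list whose items carry
    "index" (position in the request) and "relevance_score".
    """
    results = sorted(
        response_json["results"],
        key=lambda r: r["relevance_score"],
        reverse=True,
    )
    return [(documents[r["index"]], r["relevance_score"]) for r in results]


def rerank(query, documents, base_url=BASE_URL, timeout=30):
    """POST the query and documents to the assumed /rerank route."""
    body = json.dumps(build_rerank_payload(query, documents)).encode("utf-8")
    req = request.Request(
        f"{base_url}/rerank",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return rank_documents(json.load(resp), documents)
```

With the container running, `rerank("some query", ["first passage", "second passage"])` should return `(document, score)` pairs ordered by relevance, provided the route and field names match the assumptions above. Building the payload and parsing the response are kept in separate functions so they can be checked without a live server.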