Commit a7885b0 by swulling (1 parent: afba296)

update README.md

README.md (+33 -0)
---
license: mit
---

# ONNX GPU Runtime with O4 for BAAI/bge-reranker-large

Benchmark: https://colab.research.google.com/drive/1HP9GQKdzYa6H9SJnAZoxJWq920gxwd2k

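## Convert

The O4 optimization level uses fp16 mixed precision and is GPU-only, so the export needs `--device cuda`. In a notebook such as Colab, prefix the command with `!`.

```bash
optimum-cli export onnx -m BAAI/bge-reranker-large --optimize O4 bge-reranker-large-onnx-o4 --device cuda
```
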
## Usage

```python
# pip install "optimum[onnxruntime-gpu]" transformers

import torch
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('swulling/bge-reranker-large-onnx-o4')
model = ORTModelForSequenceClassification.from_pretrained('swulling/bge-reranker-large-onnx-o4')
model.to("cuda")  # run on the CUDA execution provider

# Example (query, passage) pairs to score; replace with your own data.
pairs = [
    ['what is panda?', 'hi'],
    ['what is panda?', 'The giant panda is a bear species endemic to China.'],
]

with torch.no_grad():
    inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors='pt', max_length=512)
    scores = model(**inputs, return_dict=True).logits.view(-1, ).float()
    print(scores)
```

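The scores printed above are raw, unbounded logits, where a higher value means a more relevant passage. The following is a minimal sketch of turning them into a ranked list, assuming the tensor has first been converted to plain floats (for example with `scores.tolist()`). The helpers `sigmoid` and `rank_passages` are hypothetical names used for illustration, not part of optimum or transformers, and the logits below are made up.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw logit to the (0, 1) range in a numerically stable way."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def rank_passages(passages, logits):
    """Return (passage, normalized_score) tuples, most relevant first.

    Hypothetical helper for illustration only.
    """
    if len(passages) != len(logits):
        raise ValueError("passages and logits must have the same length")
    scored = [(passage, sigmoid(logit)) for passage, logit in zip(passages, logits)]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up logits standing in for scores.tolist() from the model.
example_passages = ["hi", "The giant panda is a bear species endemic to China."]
example_logits = [-5.0, 5.0]

ranked = rank_passages(example_passages, example_logits)
for passage, score in ranked:
    print(f"{score:.4f}  {passage}")
```

Applying the sigmoid does not change the ordering; it only makes scores easier to compare against a fixed threshold.
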
## Source model

https://huggingface.co/BAAI/bge-reranker-large