---
license: mit
---
# ONNX GPU Runtime with O4 for BAAI/bge-reranker-base
benchmark: https://colab.research.google.com/drive/1HP9GQKdzYa6H9SJnAZoxJWq920gxwd2k
## Convert
```bash
optimum-cli export onnx -m BAAI/bge-reranker-base --optimize O4 bge-reranker-base-onnx-o4 --device cuda
```
## Usage
```python
# pip install "optimum[onnxruntime-gpu]" transformers
import torch
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('swulling/bge-reranker-base-onnx-o4')
model = ORTModelForSequenceClassification.from_pretrained('swulling/bge-reranker-base-onnx-o4')
model.to("cuda")

# Each pair is [query, passage]; the model scores how relevant the passage is to the query.
pairs = [['what is panda?', 'hi'], ['what is panda?', 'The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.']]

with torch.no_grad():
    inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors='pt', max_length=512)
    # Move the inputs to the same device as the model.
    inputs = inputs.to("cuda")
    scores = model(**inputs, return_dict=True).logits.view(-1, ).float()
    print(scores)
```
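
The scores printed by the usage snippet are raw logits, where a higher value means the passage is more relevant to the query. Below is a minimal sketch, assuming you want to squash them into (0, 1) with a sigmoid and sort passages by relevance. The helper `rank_passages` and the sample logit values are hypothetical and are not output from this model.

```python
import math

def rank_passages(passages, logits):
    """Return (passage, probability) pairs sorted from most to least relevant."""
    # The sigmoid maps each raw logit into (0, 1) and preserves the ordering.
    scored = [(p, 1.0 / (1.0 + math.exp(-x))) for p, x in zip(passages, logits)]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical logits; in practice they could come from scores.tolist().
passages = ["hi", "The giant panda is a bear species endemic to China."]
ranked = rank_passages(passages, [-5.0, 5.0])
for passage, prob in ranked:
    print(f"{prob:.3f}  {passage}")
```

Because the sigmoid is monotonic, the ranking is the same as sorting by the raw logits; the mapping only makes the values easier to threshold or display.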
## Source model
https://huggingface.co/BAAI/bge-reranker-base