This model is a version of the BGE-M3 model converted to ONNX weights with HF Optimum for compatibility with ONNX Runtime.
It is based on the conversion scripts and the documentation of the bge-m3-onnx model by Aapo Tanskanen.
This ONNX model outputs dense and ColBERT embedding representations in a single pass. The output is a list of NumPy arrays in that order: dense first, then ColBERT.
Note: the dense and ColBERT embeddings are normalized, matching the default behavior of the original FlagEmbedding library.
Usage with ONNX Runtime (Python)
Install the necessary modules:
pip install huggingface-hub onnxruntime transformers
You can then use the model to compute embeddings as follows:
from huggingface_hub import hf_hub_download
import onnxruntime as ort
from transformers import AutoTokenizer
# Download the ONNX graph and its external weights file
hf_hub_download(
    repo_id="ddmitov/bge_m3_dense_colbert_onnx",
    filename="model.onnx",
    local_dir="/tmp",
    repo_type="model"
)

hf_hub_download(
    repo_id="ddmitov/bge_m3_dense_colbert_onnx",
    filename="model.onnx_data",
    local_dir="/tmp",
    repo_type="model"
)

tokenizer = AutoTokenizer.from_pretrained("ddmitov/bge_m3_dense_colbert_onnx")
ort_session = ort.InferenceSession("/tmp/model.onnx")

inputs = tokenizer(
    "BGE M3 is an embedding model supporting dense retrieval and lexical matching.",
    padding="longest",
    return_tensors="np"
)

# Wrap the tokenizer outputs as OrtValue objects for ONNX Runtime
inputs_onnx = {
    key: ort.OrtValue.ortvalue_from_numpy(value) for key, value in inputs.items()
}
outputs = ort_session.run(None, inputs_onnx)
print(f"Number of Dense Vectors: {len(outputs[0])}")
print(f"Dense Vector Length: {len(outputs[0][0])}")
print("")
print(f"Number of ColBERT Vectors: {len(outputs[1][0])}")
print(f"ColBERT vector length: {len(outputs[1][0][0])}")
# Expected output:
# Number of Dense Vectors: 1
# Dense Vector Length: 1024
# Number of ColBERT Vectors: 24
# ColBERT vector length: 1024
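Because both representations are normalized, relevance scoring reduces to dot products: cosine similarity for the dense vectors, and MaxSim late interaction for the per-token ColBERT vectors. A minimal sketch in pure NumPy (the helper names and the random toy vectors are illustrative, not part of the model's API):

```python
import numpy as np

def dense_score(query_vec, doc_vec):
    # Dense vectors are L2-normalized, so the dot product equals cosine similarity.
    return float(np.dot(query_vec, doc_vec))

def colbert_score(query_vecs, doc_vecs):
    # MaxSim late interaction: for each query token vector, take the maximum
    # similarity over all document token vectors, then average over query tokens.
    sim = query_vecs @ doc_vecs.T  # shape: (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).mean())

# Toy inputs with the shapes the model emits:
# dense -> (1024,), ColBERT -> (num_tokens, 1024)
rng = np.random.default_rng(0)
q_dense = rng.normal(size=1024)
q_dense /= np.linalg.norm(q_dense)
d_dense = rng.normal(size=1024)
d_dense /= np.linalg.norm(d_dense)

print(f"Dense score: {dense_score(q_dense, d_dense)}")
```

With real outputs from the session above, `outputs[0][0]` would play the role of a dense vector and `outputs[1][0]` a ColBERT token matrix.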