I just released Sentence Transformers v4.1, featuring ONNX and OpenVINO backends for rerankers that offer 2-3x speedups, and improved hard negatives mining that helps prepare stronger training datasets. Details:
🏎️ ONNX, OpenVINO, Optimization, Quantization
- I've added ONNX and OpenVINO support with just one extra argument: "backend" when loading the CrossEncoder reranker, e.g.:
CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2", backend="onnx")
- The export_optimized_onnx_model, export_dynamic_quantized_onnx_model, and export_static_quantized_openvino_model functions now work with CrossEncoder rerankers, allowing you to optimize (e.g. fusions, gelu approximations, etc.) or quantize (int8 weights) rerankers.
- I've uploaded ~340 ONNX & OpenVINO models for all existing models under the cross-encoder Hugging Face organization. You can use these without having to export when loading.
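As a rough illustration of the "one extra argument" point above, here is a minimal sketch of scoring query-document pairs on a chosen backend. The helper name `score_with_backend` is made up for this example; it assumes the matching optional dependencies are installed (for instance the `onnx` or `openvino` extras of `sentence-transformers`) and that the model can be downloaded from the Hub.

```python
def score_with_backend(query, documents, backend="onnx"):
    """Score (query, document) pairs with a CrossEncoder on the given backend.

    `score_with_backend` is a hypothetical helper for illustration only.
    `backend` is passed straight through to CrossEncoder, e.g. "torch",
    "onnx" or "openvino".
    """
    # Imported lazily so that defining this function does not require the
    # optional backend dependencies to be installed.
    from sentence_transformers import CrossEncoder

    model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2", backend=backend)

    # A reranker scores each (query, document) pair jointly.
    pairs = [(query, doc) for doc in documents]
    scores = model.predict(pairs)

    # Return the documents sorted from most to least relevant.
    ranked = zip(documents, (float(score) for score in scores))
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```

Switching between backends is then only a matter of changing the `backend` argument in the call; the rest of the reranking code stays the same.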
⛏ Improved Hard Negatives Mining
- Added 'absolute_margin' and 'relative_margin' arguments to mine_hard_negatives.
  - absolute_margin ensures that sim(query, negative) < sim(query, positive) - absolute_margin, i.e. an absolute margin between the negative & positive similarities.
  - relative_margin ensures that sim(query, negative) < sim(query, positive) * (1 - relative_margin), i.e. a relative margin between the negative & positive similarities.
- Inspired by the excellent NV-Retriever paper from NVIDIA.
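To make the two margin conditions concrete, here is a small pure-Python sketch of the filtering rule they describe. It is not the library's implementation: `passes_margins` and `filter_negatives` are hypothetical helpers, and the similarity values are invented for the example.

```python
def passes_margins(pos_sim, neg_sim, absolute_margin=None, relative_margin=None):
    """Return True if a candidate negative satisfies every margin that is set."""
    # Absolute margin: sim(query, negative) < sim(query, positive) - absolute_margin
    if absolute_margin is not None and not neg_sim < pos_sim - absolute_margin:
        return False
    # Relative margin: sim(query, negative) < sim(query, positive) * (1 - relative_margin)
    if relative_margin is not None and not neg_sim < pos_sim * (1 - relative_margin):
        return False
    return True


def filter_negatives(pos_sim, candidates, absolute_margin=None, relative_margin=None):
    """Keep only the (text, similarity) candidates that pass the margins."""
    return [
        (text, neg_sim)
        for text, neg_sim in candidates
        if passes_margins(pos_sim, neg_sim, absolute_margin, relative_margin)
    ]


# Invented similarities: the positive scores 0.8 against the query.
candidates = [("a", 0.78), ("b", 0.7), ("c", 0.5)]

# Absolute margin 0.05: negatives must score below 0.75, so "a" is dropped.
kept_absolute = filter_negatives(0.8, candidates, absolute_margin=0.05)

# Relative margin 0.2: negatives must score below 0.64, so only "c" remains.
kept_relative = filter_negatives(0.8, candidates, relative_margin=0.2)
```

In both cases the candidates that score too close to the positive are discarded, since those are the ones most likely to be false negatives.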
And several other small improvements. Check out the full release notes here: https://github.com/UKPLab/sentence-transformers/releases/tag/v4.1.0
With this release, the SentenceTransformer embedding and CrossEncoder reranker models reach near feature parity, which I've wanted for quite some time! With rerankers now strongly supported, it's time to look forward to other useful architectures!