
# Universal Sentence Encoder Large v5

ONNX version of https://tfhub.dev/google/universal-sentence-encoder-large/5

The original TFHub version of the model is also referenced by other models on the Hub, e.g. https://huggingface.co/vprelovac/universal-sentence-encoder-large-5

## Overview

See the overview and license details at https://tfhub.dev/google/universal-sentence-encoder-large/5

This model is a full-precision conversion of the TFHub original to ONNX format.

It uses ONNXRuntime Extensions to embed the tokenizer within the ONNX model, so no separate tokenizer is needed: raw text is fed directly into the model.

Post-processing (e.g. pooling, normalization) is also implemented within the ONNX model, so no separate processing step is necessary.

## How to use

```python
import onnxruntime as ort
from onnxruntime_extensions import get_library_path
from os import cpu_count

sentences = ["hello world"]

def load_onnx_model(model_filepath):
    # Register the ONNXRuntime Extensions custom-op library so the
    # tokenizer embedded in the ONNX graph can be resolved at load time.
    _options = ort.SessionOptions()
    _options.inter_op_num_threads, _options.intra_op_num_threads = cpu_count(), cpu_count()
    _options.register_custom_ops_library(get_library_path())
    _providers = ["CPUExecutionProvider"]  # could also use ort.get_available_providers()
    return ort.InferenceSession(path_or_bytes=model_filepath, sess_options=_options, providers=_providers)

model = load_onnx_model("filepath_for_model_dot_onnx")

# Raw strings go straight in; tokenization and post-processing
# happen inside the model.
model_outputs = model.run(output_names=["outputs"], input_feed={"inputs": sentences})[0]
print(model_outputs)
```
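Because pooling and normalization already happen inside the model, each output row is a fixed-size sentence embedding, and semantic similarity between sentences can be computed directly from the outputs. A minimal NumPy sketch (the helper name and the short example vectors are illustrative stand-ins, not part of this model's API):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors; for vectors the
    # model has already normalized, this reduces to a plain dot product.
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Illustrative unit vectors standing in for two rows of `model_outputs`.
emb_a = np.array([0.6, 0.8])
emb_b = np.array([0.8, 0.6])
print(cosine_similarity(emb_a, emb_b))  # 0.96
```

In practice you would pass two rows of `model_outputs` (one per input sentence) instead of the toy vectors above.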