|
This model is a quantized version of [**SamLowe/universal-sentence-encoder-multilingual-3-onnx**](https://huggingface.co/SamLowe/universal-sentence-encoder-multilingual-3-onnx).
|
|
|
--- |
|
|
|
language:
- en
- ar
- zh
- fr
- de
- it
- ja
- ko
- nl
- pl
- pt
- es
- th
- tr
- ru
tags:
- feature-extraction
- onnx
- use
- text-embedding
- tensorflow-hub
license: apache-2.0
inference: false
widget:
- text: "Thank goodness ONNX is available, it is lots faster!"
|
|
|
--- |
|
|
|
### Universal Sentence Encoder Multilingual v3 |
|
|
|
ONNX version of [https://tfhub.dev/google/universal-sentence-encoder-multilingual/3](https://tfhub.dev/google/universal-sentence-encoder-multilingual/3) |
|
|
|
The original TFHub version of the model is also referenced by other repositories on the Hub, e.g. [https://huggingface.co/vprelovac/universal-sentence-encoder-multilingual-3](https://huggingface.co/vprelovac/universal-sentence-encoder-multilingual-3)
|
|
|
### Overview |
|
|
|
See overview and license details at [https://tfhub.dev/google/universal-sentence-encoder-multilingual/3](https://tfhub.dev/google/universal-sentence-encoder-multilingual/3) |
|
|
|
The base model is a full-precision export of the TFHub original to ONNX format; this repository provides a quantized variant of it.
|
|
|
It uses the [ONNXRuntime Extensions](https://github.com/microsoft/onnxruntime-extensions) to embed the tokenizer within the ONNX model, so no separate tokenizer is needed: text is fed directly into the ONNX model.
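
As a quick sanity check, you can inspect the session's inputs and outputs to confirm that the model consumes raw strings rather than token IDs. This is a minimal sketch: `model.onnx` is a placeholder path, and the input/output names shown in the comments are the ones used in the usage example later in this card.

```python
import onnxruntime as ort
from onnxruntime_extensions import get_library_path

# Register the custom ops so the tokenizer embedded in the graph can be resolved
options = ort.SessionOptions()
options.register_custom_ops_library(get_library_path())

session = ort.InferenceSession("model.onnx", sess_options=options,
                               providers=["CPUExecutionProvider"])

# Expect a single string-typed input, e.g. name="inputs", type="tensor(string)"
for i in session.get_inputs():
    print(i.name, i.type, i.shape)
for o in session.get_outputs():
    print(o.name, o.type, o.shape)
```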
|
|
|
Post-processing (e.g. pooling, normalization) is also implemented within the ONNX model, so no separate processing is necessary.
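
One hedged way to verify the built-in post-processing, using the `model_outputs` produced in the usage example below, is to look at the vector norms. This assumes the model emits pooled, L2-normalized sentence vectors; the shape comment is illustrative for this model family rather than guaranteed by this card.

```python
import numpy as np

embeddings = np.asarray(model_outputs)
print(embeddings.shape)                    # e.g. (1, 512): one pooled vector per input sentence
print(np.linalg.norm(embeddings, axis=1))  # rows near 1.0 indicate L2-normalized vectors
```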
|
|
|
### How to use |
|
|
|
```python
import onnxruntime as ort
from onnxruntime_extensions import get_library_path
from os import cpu_count

sentences = ["hello world"]

def load_onnx_model(model_filepath):
    _options = ort.SessionOptions()
    # Use all available cores for both inter- and intra-op parallelism
    _options.inter_op_num_threads, _options.intra_op_num_threads = cpu_count(), cpu_count()
    # Register the custom ops from onnxruntime-extensions so the embedded tokenizer resolves
    _options.register_custom_ops_library(get_library_path())
    _providers = ["CPUExecutionProvider"]  # could use ort.get_available_providers()
    return ort.InferenceSession(path_or_bytes=model_filepath, sess_options=_options, providers=_providers)

model = load_onnx_model("filepath_for_model_dot_onnx")

# Raw strings go straight in; tokenization and post-processing happen inside the graph
model_outputs = model.run(output_names=["outputs"], input_feed={"inputs": sentences})[0]
print(model_outputs)
```
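
For a feature-extraction use case such as cross-lingual semantic similarity, the embeddings can be compared directly. A minimal sketch building on the `model` loaded above; the sentence pair is illustrative, and cosine similarity is computed explicitly so the result holds whether or not the vectors are pre-normalized:

```python
import numpy as np

pair = ["The weather is lovely today", "Il fait très beau aujourd'hui"]
a, b = model.run(output_names=["outputs"], input_feed={"inputs": pair})[0]

# Explicit cosine similarity between the two sentence vectors
cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(f"cosine similarity: {cosine:.4f}")  # semantically similar pairs score close to 1.0
```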