WiseIntelligence committed
Commit 3517cc1
1 Parent(s): fc5d53f

Upload 2 files

Files changed (2)
  1. README.md +71 -0
  2. embedding_model_quantized.onnx +3 -0
README.md ADDED
@@ -0,0 +1,71 @@
This model is a quantized version of: [**SamLowe/universal-sentence-encoder-multilingual-3-onnx**](https://huggingface.co/SamLowe/universal-sentence-encoder-multilingual-3-onnx)

---

languages:

* en
* ar
* zh
* fr
* de
* it
* ja
* ko
* nl
* pl
* pt
* es
* th
* tr
* ru

tags:

* feature-extraction
* onnx
* use
* text-embedding
* tensorflow-hub

license: apache-2.0
inference: false
widget:

* text: Thank goodness ONNX is available, it is lots faster!

---

### Universal Sentence Encoder Multilingual v3

ONNX version of [https://tfhub.dev/google/universal-sentence-encoder-multilingual/3](https://tfhub.dev/google/universal-sentence-encoder-multilingual/3)

The original TFHub version of the model is referenced by other models on the Hub, e.g. [https://huggingface.co/vprelovac/universal-sentence-encoder-multilingual-3](https://huggingface.co/vprelovac/universal-sentence-encoder-multilingual-3)

### Overview

See the overview and license details at [https://tfhub.dev/google/universal-sentence-encoder-multilingual/3](https://tfhub.dev/google/universal-sentence-encoder-multilingual/3)

The upstream SamLowe model is a full-precision ONNX export of the TFHub original; this repository provides a quantized variant of that export (`embedding_model_quantized.onnx`).
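
This commit does not document how the quantization was performed. A minimal sketch of producing such a quantized copy with `onnxruntime`'s dynamic quantization follows; the file names and `weight_type` are assumptions, not the author's confirmed recipe, and the embedded custom tokenizer ops may need extra handling:

```python
# Hypothetical reproduction sketch: dynamically quantize the upstream
# full-precision ONNX export. File names and weight type are assumptions.
from onnxruntime.quantization import quantize_dynamic, QuantType

quantize_dynamic(
    model_input="embedding_model.onnx",             # assumed name of the full-precision export
    model_output="embedding_model_quantized.onnx",  # file added in this commit
    weight_type=QuantType.QUInt8,                   # assumed; QInt8 is the other common choice
)
```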

The model uses the [ONNXRuntime Extensions](https://github.com/microsoft/onnxruntime-extensions) to embed the tokenizer within the ONNX model, so no separate tokenizer is needed and text is fed directly into the ONNX model.

Post-processing (e.g. pooling, normalization) is also implemented within the ONNX model, so no separate processing is necessary.

### How to use

```python
import onnxruntime as ort
from onnxruntime_extensions import get_library_path
from os import cpu_count

sentences = ["hello world"]

def load_onnx_model(model_filepath):
    _options = ort.SessionOptions()
    # use all available cores for both inter- and intra-op parallelism
    _options.inter_op_num_threads, _options.intra_op_num_threads = cpu_count(), cpu_count()
    # register the ONNXRuntime Extensions custom ops (the tokenizer embedded in the graph)
    _options.register_custom_ops_library(get_library_path())
    _providers = ["CPUExecutionProvider"]  # could use ort.get_available_providers()
    return ort.InferenceSession(path_or_bytes=model_filepath, sess_options=_options, providers=_providers)

model = load_onnx_model("filepath_for_model_dot_onnx")

# raw text goes in; pooled, post-processed sentence embeddings come out
model_outputs = model.run(output_names=["outputs"], input_feed={"inputs": sentences})[0]
print(model_outputs)
```
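
The returned array has one embedding per input sentence. A small follow-on sketch for comparing sentences with cosine similarity is shown below; it reuses `model` from the snippet above, `numpy` is an extra dependency, and the example sentence pair is illustrative only:

```python
import numpy as np

# embed two sentences in one call and compare them
pair = ["The weather is lovely today.", "Il fait très beau aujourd'hui."]
embeddings = model.run(output_names=["outputs"], input_feed={"inputs": pair})[0]

# normalise defensively, then take the dot product as cosine similarity
unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
print(f"cosine similarity: {float(unit[0] @ unit[1]):.3f}")
```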
embedding_model_quantized.onnx ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b8b05b01f0861094f0475cb288ad1b303ae8aee2ebe96e650d210805d5ca70f7
size 71794489