hotchpotch/vespa-onnx-intfloat-multilingual-e5-large


hotchpotch committed on Apr 2

Commit c7ca07e (1 parent: a4a8b62)

Update README.md

Files changed (1)

README.md +7 -2

@@ -29,6 +29,13 @@ The model was quantized using the [optimum](https://github.com/huggingface/optim
 </component>
 ```
+### deploy
+```
+# FP16 model has a larger file size, which can result in longer deployment times.
+vespa deploy --wait 1800 .
+```
 ## Tips: conver to int8 quantized
@@ -60,8 +67,6 @@ model_fp16 = convert_float_to_float16(onnx_model, disable_shape_infer=True)
 onnx.save(model_fp16, "me5-large/intfloat-multilingual-e5-large_fp16.onnx")
 ```
 ## License
 The license for this model is based on the original license (found in the LICENSE file in the project's root directory), which is the MIT License.
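
The deploy step added in this commit uses a long `--wait` because the FP16 model file is large. As a rough sketch, not part of the README, the same command could be built from a small Python helper that also reports how many bytes of `.onnx` files an application package contains. The function names `deploy_command` and `total_model_bytes` are hypothetical; only the `vespa deploy --wait 1800 .` invocation comes from the diff above.

```python
import os
import shlex


def deploy_command(app_dir: str = ".", wait_seconds: int = 1800) -> list[str]:
    """Build the argument list for the deploy command shown in the README.

    Hypothetical helper: it only assembles `vespa deploy --wait N DIR`
    and does not run anything.
    """
    if wait_seconds <= 0:
        raise ValueError("wait_seconds must be positive")
    return ["vespa", "deploy", "--wait", str(wait_seconds), app_dir]


def total_model_bytes(app_dir: str) -> int:
    """Sum the sizes of all .onnx files found under app_dir."""
    total = 0
    for root, _dirs, files in os.walk(app_dir):
        for name in files:
            if name.endswith(".onnx"):
                total += os.path.getsize(os.path.join(root, name))
    return total


def deploy_command_line(app_dir: str = ".", wait_seconds: int = 1800) -> str:
    """Return the command as a single shell-quoted string."""
    return shlex.join(deploy_command(app_dir, wait_seconds))
```

Calling `deploy_command_line()` yields the same command line as the README, and the returned list from `deploy_command()` could be passed to `subprocess.run` if the Vespa CLI is installed. Checking `total_model_bytes(".")` before deploying is one way to judge whether a longer or shorter wait is appropriate; the README itself does not state a size threshold.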