inconsistent with the original model
Could you provide the conversion code? I ran inference with the converted model, but the results were inconsistent with the original model. Thank you.
Replacing it with BAAI/bge-m3 would work.
Note that because of the int8 quantization, the embeddings from the converted model will differ to some extent from those of the original model.
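For illustration, here is a minimal sketch of that kind of pipeline (an ONNX export followed by int8 dynamic quantization). The file names, opset, and export settings are assumptions and not necessarily what was used for the published model:

```python
# Rough sketch only, not the exact script behind the published model;
# file names, opset and export settings are placeholders.
import torch
from transformers import AutoModel, AutoTokenizer
from onnxruntime.quantization import quantize_dynamic, QuantType

model_id = "BAAI/bge-m3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id).eval()

# Export the fp32 model to ONNX. Because the fp32 weights of bge-m3 exceed
# ONNX's 2 GB protobuf limit, recent torch versions write the graph as a
# .onnx file plus an external data file.
dummy = tokenizer("hello world", return_tensors="pt")
torch.onnx.export(
    model,
    (dummy["input_ids"], dummy["attention_mask"]),
    "bge-m3.onnx",
    input_names=["input_ids", "attention_mask"],
    output_names=["last_hidden_state"],
    dynamic_axes={
        "input_ids": {0: "batch", 1: "seq"},
        "attention_mask": {0: "batch", 1: "seq"},
        "last_hidden_state": {0: "batch", 1: "seq"},
    },
    opset_version=14,
)

# Int8 dynamic quantization of the weights; this step is why the quantized
# embeddings differ slightly from the original fp32 outputs.
quantize_dynamic("bge-m3.onnx", "bge-m3-int8.onnx", weight_type=QuantType.QInt8)
```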
Thank you, but after I converted it to fp16, it reported an error during inference. The error is:
```
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from /bge_m3_onnx_down/BAAI-bge-m3_quantized_fp16.onnx failed:This is an invalid model. Type Error: Type 'tensor(float16)' of input parameter (embeddings.position_embeddings.weight_scale) of operator (DequantizeLinear) in node (/embeddings/position_embeddings/Gather_output_0_DequantizeLinear) is invalid.
```
Could you upload the original conversion code? I think it would help a lot of people.
That is because your ONNX Runtime does not support fp16 there. Running it in fp32, rather than the model converted to int8, may fix it.
The model I published is intended for running ONNX on a search engine called Vespa, and it does not take other environments into account. The behavior of ONNX varies considerably from runtime to runtime.
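For example, a plain fp32 sanity check with onnxruntime could look like the sketch below. The model path and input names follow the export sketch above and may differ for your file:

```python
# fp32 sanity check with onnxruntime; the model path is a placeholder, and the
# input names can be listed with [i.name for i in session.get_inputs()] if they differ.
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-m3")
session = ort.InferenceSession("bge-m3-int8.onnx", providers=["CPUExecutionProvider"])

enc = tokenizer("hello world", return_tensors="np")
outputs = session.run(
    None,
    {"input_ids": enc["input_ids"], "attention_mask": enc["attention_mask"]},
)

# bge-m3 dense embedding: normalized CLS-token vector from the last hidden state.
cls = outputs[0][:, 0, :]
dense = cls / np.linalg.norm(cls, axis=-1, keepdims=True)
print(dense.shape)
```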
Thank you, but it did not work after I converted to fp32. Actually, my problem is that when I convert the bge-m3 .bin model to ONNX, both a .onnx file and a .onnx.data file are generated, and I only need the single .onnx file.
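For context, the .onnx.data file appears because the fp32 weights of bge-m3 exceed ONNX's 2 GB protobuf limit, so the exporter stores them externally. One possible way to end up with a single .onnx file, sketched below with placeholder paths, is to shrink the weights first, for example to fp16 (or to int8 as in the earlier sketch), so everything fits in one file:

```python
# Sketch with placeholder paths: halve the fp32 weights to fp16 so the model
# fits under the 2 GB protobuf limit and can be saved as one self-contained file.
import onnx
from onnxconverter_common import float16

# onnx.load picks up the neighbouring .onnx.data file automatically.
model = onnx.load("bge-m3.onnx")
fp16_model = float16.convert_float_to_float16(model, keep_io_types=True)
onnx.save_model(fp16_model, "bge-m3-fp16.onnx", save_as_external_data=False)
```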