inconsistent with the original model
Could you provide the conversion code? I ran inference with the converted model, but the results were inconsistent with the original model. Thank you.
Replacing it with BAAI/bge-m3 would work.
Note that because of the int8 quantization, the embeddings from the converted model will differ to some extent from those of the original model.
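For illustration, here is a minimal sketch of that kind of pipeline (an ONNX export followed by int8 dynamic quantization). The file names, opset, and export settings are assumptions and not necessarily what was used for the published model:

```python
# Rough sketch only, not the exact script behind the published model;
# file names, opset and export settings are placeholders.
import torch
from transformers import AutoModel, AutoTokenizer
from onnxruntime.quantization import quantize_dynamic, QuantType

model_id = "BAAI/bge-m3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id).eval()

# Export the fp32 model to ONNX. Because the fp32 weights of bge-m3 exceed
# ONNX's 2 GB protobuf limit, recent torch versions write the graph as a
# .onnx file plus an external data file.
dummy = tokenizer("hello world", return_tensors="pt")
torch.onnx.export(
    model,
    (dummy["input_ids"], dummy["attention_mask"]),
    "bge-m3.onnx",
    input_names=["input_ids", "attention_mask"],
    output_names=["last_hidden_state"],
    dynamic_axes={
        "input_ids": {0: "batch", 1: "seq"},
        "attention_mask": {0: "batch", 1: "seq"},
        "last_hidden_state": {0: "batch", 1: "seq"},
    },
    opset_version=14,
)

# Int8 dynamic quantization of the weights; this step is why the quantized
# embeddings differ slightly from the original fp32 outputs.
quantize_dynamic("bge-m3.onnx", "bge-m3-int8.onnx", weight_type=QuantType.QInt8)
```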
Thank you, but after I converted it to fp16, it reported an error during inference. The error is:
```
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from /bge_m3_onnx_down/BAAI-bge-m3_quantized_fp16.onnx failed:This is an invalid model. Type Error: Type 'tensor(float16)' of input parameter (embeddings.position_embeddings.weight_scale) of operator (DequantizeLinear) in node (/embeddings/position_embeddings/Gather_output_0_DequantizeLinear) is invalid.
```
Could you upload the original conversion code? I think it would help a lot of people.
That is because your ONNX Runtime does not support fp16 there. Running it in fp32, rather than the model converted to int8, may fix it.
The model I published is intended for running ONNX on a search engine called Vespa, and it does not take other environments into account. The behavior of ONNX varies considerably from runtime to runtime.
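For example, a plain fp32 sanity check with onnxruntime could look like the sketch below. The model path and input names follow the export sketch above and may differ for your file:

```python
# fp32 sanity check with onnxruntime; the model path is a placeholder, and the
# input names can be listed with [i.name for i in session.get_inputs()] if they differ.
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-m3")
session = ort.InferenceSession("bge-m3-int8.onnx", providers=["CPUExecutionProvider"])

enc = tokenizer("hello world", return_tensors="np")
outputs = session.run(
    None,
    {"input_ids": enc["input_ids"], "attention_mask": enc["attention_mask"]},
)

# bge-m3 dense embedding: normalized CLS-token vector from the last hidden state.
cls = outputs[0][:, 0, :]
dense = cls / np.linalg.norm(cls, axis=-1, keepdims=True)
print(dense.shape)
```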
Thank you, but it did not work after I converted to fp32. Actually, my problem is that when I convert the bge-m3 .bin model to ONNX, both a .onnx file and a .onnx.data file are generated, and I only need the single .onnx file.
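For context, the .onnx.data file appears because the fp32 weights of bge-m3 exceed ONNX's 2 GB protobuf limit, so the exporter stores them externally. One possible way to end up with a single .onnx file, sketched below with placeholder paths, is to shrink the weights first, for example to fp16 (or to int8 as in the earlier sketch), so everything fits in one file:

```python
# Sketch with placeholder paths: halve the fp32 weights to fp16 so the model
# fits under the 2 GB protobuf limit and can be saved as one self-contained file.
import onnx
from onnxconverter_common import float16

# onnx.load picks up the neighbouring .onnx.data file automatically.
model = onnx.load("bge-m3.onnx")
fp16_model = float16.convert_float_to_float16(model, keep_io_types=True)
onnx.save_model(fp16_model, "bge-m3-fp16.onnx", save_as_external_data=False)
```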