onnx-community/Qwen2-VL-2B-Instruct · Q4F16 model throws errors

ONNX Community org 16 days ago


Hi @Xenova, thanks for this. I ran some tests with the q4f16 exports and am hitting the same error I had on my own export:

An uncaught WebGPU validation error was raised: Error while parsing WGSL: :51:15 error: return statement type must match its function return type, returned 'f16', expected 'f32'
              return get_xByIndices(aIndices);
              ^^^^^^


 - While validating [ShaderModuleDescriptor "Conv3DNaive"]
 - While calling [Device].CreateShaderModule([ShaderModuleDescriptor "Conv3DNaive"]).

Test Instructions:

Visit https://huggingface.co/spaces/pdufour/Qwen2VL_TransformersJS_Demo (a Space I set up to use this model)
Select q4f16 model to load
Select example image and type in text and hit enter

Actual results

See error mentioned above

Expected results

Should process query

I worked around this in my ONNX export by adding this op to op_block_list (https://huggingface.co/pdufour/Qwen2-VL-2B-Instruct-ONNX-Q4-F16/blob/main/Makefile#L126), but ideally this operation should be supported by WebGPU.
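For anyone who wants to try the same kind of workaround, here is a minimal sketch of keeping an op in float32 during fp16 conversion. It is not necessarily what the linked Makefile does. It assumes the `onnx` and `onnxconverter_common` packages are installed, and it assumes the op type to block is `Conv`, which is only a guess from the shader name `Conv3DNaive` in the error. The function names and paths are hypothetical.

```python
from typing import Iterable, List

# Assumption: the op type to keep in float32 is "Conv", guessed from the
# shader name "Conv3DNaive" in the error. The thread does not name the op.
EXTRA_BLOCKED_OPS = ["Conv"]


def build_op_block_list(defaults: Iterable[str], extra: Iterable[str]) -> List[str]:
    """Merge default and extra op types, keeping order and dropping duplicates."""
    merged: List[str] = []
    for op in list(defaults) + list(extra):
        if op not in merged:
            merged.append(op)
    return merged


def convert_to_fp16(src_path: str, dst_path: str,
                    extra_blocked: Iterable[str] = EXTRA_BLOCKED_OPS) -> None:
    """Convert an ONNX model to float16, leaving blocked op types in float32."""
    # Imported lazily so build_op_block_list works without these packages.
    import onnx
    from onnxconverter_common import float16

    model = onnx.load(src_path)
    # Passing op_block_list replaces the converter's defaults, so merge them.
    block_list = build_op_block_list(float16.DEFAULT_OP_BLOCK_LIST, extra_blocked)
    model_fp16 = float16.convert_float_to_float16(
        model,
        keep_io_types=True,        # keep graph inputs/outputs as float32
        op_block_list=block_list,  # these op types are not converted
    )
    onnx.save(model_fp16, dst_path)
```

Calling `convert_to_fp16("model.onnx", "model_fp16.onnx")` (hypothetical paths) would write the converted model. Models larger than 2 GB would also need to be saved with external data, which this sketch does not handle.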

It's this op here specifically that is causing the error:

Xenova

ONNX Community org 15 days ago

Thanks for testing! cc @schmuell since this seems to be a bug with the WebGPU EP. The model runs correctly on CPU.

On that note, would you mind opening a bug report at https://github.com/microsoft/onnxruntime?

pdufour

ONNX Community org 15 days ago

https://github.com/microsoft/onnxruntime/issues/22974