Q4F16 model throws errors
#4, opened by pdufour
Hi @Xenova, thanks for this. I ran some tests with the q4f16 exports and I'm hitting the same error I had on my own export:
An uncaught WebGPU validation error was raised: Error while parsing WGSL: :51:15 error: return statement type must match its function return type, returned 'f16', expected 'f32'
return get_xByIndices(aIndices);
^^^^^^
- While validating [ShaderModuleDescriptor "Conv3DNaive"]
- While calling [Device].CreateShaderModule([ShaderModuleDescriptor "Conv3DNaive"]).
Test Instructions:
- Visit https://huggingface.co/spaces/pdufour/Qwen2VL_TransformersJS_Demo (a Space I set up to use this model)
- Select q4f16 model to load
- Select example image and type in text and hit enter
Actual results
- See error mentioned above
Expected results
- Should process query
I worked around this in my ONNX export by adding this op to op_block_list (https://huggingface.co/pdufour/Qwen2-VL-2B-Instruct-ONNX-Q4-F16/blob/main/Makefile#L126), but ideally this operation should be supported by WebGPU.
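For anyone who wants to try the same workaround, here is a rough sketch of what blocking an op during fp16 conversion can look like. It assumes the conversion goes through `onnxconverter_common.float16.convert_float_to_float16` and that the op to block is `Conv` (a guess based on the `Conv3DNaive` shader name in the error). The helper names and file paths are made up for illustration, and the linked Makefile may do this differently.

```python
def merged_op_block_list(default_ops, extra_ops):
    """Return default_ops plus any extra_ops not already present, order preserved."""
    merged = list(default_ops)
    for op in extra_ops:
        if op not in merged:
            merged.append(op)
    return merged


def convert_to_fp16(src_path, dst_path, extra_ops=("Conv",)):
    """Convert an ONNX model to fp16 while keeping the blocked ops in fp32.

    Assumption: "Conv" is the op behind the failing Conv3DNaive shader.
    """
    # Imported lazily so the helper above can be used without these packages.
    import onnx
    from onnxconverter_common import float16

    model = onnx.load(src_path)
    block_list = merged_op_block_list(float16.DEFAULT_OP_BLOCK_LIST, extra_ops)
    model_fp16 = float16.convert_float_to_float16(
        model,
        keep_io_types=True,          # keep graph inputs/outputs as fp32
        op_block_list=block_list,    # ops listed here are left in fp32
    )
    # Note: models over 2 GB may need to be saved with external data instead.
    onnx.save(model_fp16, dst_path)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    convert_to_fp16("model.onnx", "model_fp16.onnx")
```

Extending the default block list rather than replacing it keeps the ops that the converter already leaves in fp32, so only the extra op changes.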
Thanks for testing! cc @schmuell since this seems to be a bug with the WebGPU EP. The model runs correctly on CPU.
On that note, would you mind filing a bug report at https://github.com/microsoft/onnxruntime?