Unable to convert ONNX model to INT4/FP16

#95
by Avan2000 - opened

Hi community,

I tried exporting the google/gemma-7b model to ONNX with FP16 precision using the following command:

optimum-cli export onnx --dtype fp16 --device xpu --model google/gemma-7b --framework pt --task text-generation-with-past ./gemma-7b

But it fails with an error:

[screenshot of the FP16 export error]

Later, I tried exporting without FP16 precision:

optimum-cli export onnx --model google/gemma-7b --framework pt --task text-generation-with-past ./gemma-7b

That export succeeded, and I got the ONNX model (in the default FP32 precision).
But now, when I try to convert it to FP16/INT4, it throws an error:

[screenshot of the FP16/INT4 conversion error]

Any clues on what to do about this?

Hi community, any update on this?
