`model.onnx` to `model_fp16.onnx`

#9
by selectorrrr - opened

Hello, I'm new to ONNX and trying to follow your approach. I've successfully exported a model to ONNX, and now I want to reduce its precision to FP16. Could you please show an example of such a conversion from `model.onnx` to `model_fp16.onnx`? Thank you very much!
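
For reference, a minimal FP32 → FP16 conversion of the kind asked about can be sketched with onnxconverter-common's `float16` helpers (the library choice and filenames here are assumptions, not necessarily what the original answer used):

```python
import onnx
from onnxconverter_common import float16

model = onnx.load("model.onnx")
# keep_io_types=True keeps the model's inputs/outputs in FP32 while
# converting the internal weights and activations to FP16
model_fp16 = float16.convert_float_to_float16(model, keep_io_types=True)
onnx.save(model_fp16, "model_fp16.onnx")
```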

ONNX Community org

Thank you so much for your answer; I spent half a day today playing with this code. It's very interesting.

I had to extend `DEFAULT_OP_BLOCK_LIST` to solve some conversion problems.

But as a result, for some reason inference with the converted model breaks, and in response I get an array of `[np.float16(nan), np.float16(nan), np.float16(nan), ...]`.

```python
import onnx
from onnx.shape_inference import infer_shapes_path
# convert_float_to_float16 and DEFAULT_OP_BLOCK_LIST live in onnxconverter-common
# (onnxruntime.transformers.float16 ships an equivalent copy)
from onnxconverter_common.float16 import convert_float_to_float16, DEFAULT_OP_BLOCK_LIST

# Run shape inference first, writing the annotated model to a new file
infer_shapes_path('model_fp32.onnx', 'model_fp32_infer_shape.onnx')
model = onnx.load('model_fp32_infer_shape.onnx')

# Copy the default block list before extending it, so the module-level
# constant is not mutated in place
op_block_list = list(DEFAULT_OP_BLOCK_LIST)
op_block_list.append('RandomNormal')
op_block_list.append('RandomNormalLike')

tts_model_fp16 = convert_float_to_float16(model, min_positive_val=1e-7, max_finite_val=1e4,
                                          keep_io_types=True, disable_shape_infer=True,
                                          op_block_list=op_block_list,
                                          node_block_list=[])
onnx.save(tts_model_fp16, 'model_fp16.onnx')
```
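
To narrow down where the NaNs appear, one option is to run the FP32 and FP16 models side by side with onnxruntime and check each output (a sketch; the input name and shape are placeholders, and since the model contains RandomNormal ops the FP32/FP16 diff is stochastic, so the NaN check is the more useful signal):

```python
import numpy as np
import onnxruntime as ort

# Placeholder input: replace the name and shape with the real model inputs
dummy = {"input": np.random.randn(1, 128).astype(np.float32)}

sess32 = ort.InferenceSession("model_fp32.onnx", providers=["CPUExecutionProvider"])
sess16 = ort.InferenceSession("model_fp16.onnx", providers=["CPUExecutionProvider"])

# With keep_io_types=True both models accept FP32 inputs and return FP32 outputs
for a, b in zip(sess32.run(None, dummy), sess16.run(None, dummy)):
    a32 = np.asarray(a, dtype=np.float32)
    b32 = np.asarray(b, dtype=np.float32)
    print("NaNs in fp16 output:", np.isnan(b32).any(),
          "| max abs diff:", np.abs(a32 - b32).max())
```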

Right now I'm still trying to find the right parameters for `convert_float_to_float16`; I'd appreciate any help. Thank you again.

ONNX Community org

Oh right, there are a bunch of nodes I needed to add to the block list. Most of these are in the decoder, which you can see in the image below (note where Cast-to-FP32 nodes are inserted before certain operations). Hope that helps!

[Image: decoder subgraph showing Cast-to-FP32 nodes inserted before the blocked operations]

You can view the graph visualization here.
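
For instance, one way to build up `node_block_list` entries for such nodes is to scan the graph for the affected op types (a sketch; the op types below are placeholders to be read off the Cast boundaries in the visualization):

```python
import onnx

model = onnx.load('model_fp32_infer_shape.onnx')

# Placeholder set: fill in the op types that sit behind the Cast-to-FP32
# boundaries visible in the graph visualization
ops_to_keep_fp32 = {'RandomNormalLike', 'Range'}

node_block_list = [node.name for node in model.graph.node
                   if node.op_type in ops_to_keep_fp32]
print(node_block_list)
```

The collected names can then be passed as `node_block_list` to `convert_float_to_float16`.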
