Can I get a script to run inference through T5 ONNX using `onnxruntime`?
I am glad that the community has exported the ONNX files for this model. I understand that I will have to use `encoder_model.onnx` and `decoder_model.onnx` separately to make a successful forward pass.
I am unable to find a proper guide to running inference through such an encoder-decoder model using the `onnxruntime` library.
Can anyone please help me through this?
🎯 Objective
My objective is to summarize a given text, but I am not sure how to perform a successful inference. In the past I have used `onnxruntime` with GPT-2, but that is a causal LM, and this model is different.
(I am totally fine with code in either Java or Python.)
I would highly appreciate your help 🙏🏻
Thank you so much.
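For reference, the encoder-then-decoder flow described above can be sketched with raw `onnxruntime` roughly as follows. This is a minimal greedy-decoding sketch, not a definitive recipe: the checkpoint name (`t5-small`), the maximum length, and the ONNX input names are assumptions on my part (check `session.get_inputs()` against your actual export, since input names vary between exporters):

```python
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

# Assumption: the ONNX files were exported from t5-small and sit in the
# current directory under the names mentioned in the question.
tokenizer = AutoTokenizer.from_pretrained("t5-small")
encoder = ort.InferenceSession("encoder_model.onnx")
decoder = ort.InferenceSession("decoder_model.onnx")

text = "summarize: " + "Your long article text goes here."
inputs = tokenizer(text, return_tensors="np")

# One encoder pass; its hidden states are reused at every decoding step.
encoder_hidden_states = encoder.run(None, {
    "input_ids": inputs["input_ids"],
    "attention_mask": inputs["attention_mask"],
})[0]

# T5 starts decoding from the pad token (its decoder_start_token_id).
decoder_ids = np.array([[tokenizer.pad_token_id]], dtype=np.int64)

for _ in range(64):  # assumption: cap generation at 64 tokens
    logits = decoder.run(None, {
        "input_ids": decoder_ids,
        "encoder_hidden_states": encoder_hidden_states,
        "encoder_attention_mask": inputs["attention_mask"],
    })[0]
    # Greedy choice: take the highest-scoring token at the last position.
    next_id = int(logits[0, -1].argmax())
    decoder_ids = np.concatenate(
        [decoder_ids, np.array([[next_id]], dtype=np.int64)], axis=1
    )
    if next_id == tokenizer.eos_token_id:
        break

print(tokenizer.decode(decoder_ids[0], skip_special_tokens=True))
```

This re-runs the decoder over the full prefix at every step (no KV cache), so it is slow but easy to follow; exports with past-key-value inputs need extra plumbing.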
Hey @AayushShah, I think this question might be better suited for the Optimum issues, as it is not specific to T5.
@lysandre you are right mate, apologies 🙏
Hi @AayushShah, the Optimum library has an `ORTModelForSeq2SeqLM` class that leverages ONNX Runtime and may be relevant to you:
https://huggingface.co/docs/optimum/main/en/onnxruntime/usage_guides/models#sequencetosequence-models
https://huggingface.co/docs/optimum/main/en/onnxruntime/package_reference/modeling_ort#optimum.onnxruntime.ORTModelForSeq2SeqLM