Adding ONNX file of this model

#3
by fa2345 - opened

Beep boop I am the ONNX export bot 🤖🏎️. On behalf of fa2345, I would like to add the ONNX-converted model to this repository.

What is ONNX? It stands for "Open Neural Network Exchange", and is the most commonly used open standard for machine learning interoperability. You can find out more at onnx.ai!

The exported ONNX model can then be consumed by various backends such as TensorRT or TVM, or simply be used in a few lines with 🤗 Optimum through ONNX Runtime; check out how here!
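For reference, "used in a few lines with Optimum through ONNX Runtime" looks roughly like the sketch below (the ORTModelForFeatureExtraction class, the repository id and the file name here are illustrative assumptions, not something this PR verifies):

from optimum.onnxruntime import ORTModelForFeatureExtraction
from transformers import AutoTokenizer

# Load the tokenizer and the ONNX export through ONNX Runtime (repository id is illustrative)
tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = ORTModelForFeatureExtraction.from_pretrained("roberta-large", file_name="model.onnx")

# Run a quick forward pass; for a bare encoder the last dimension should be the hidden size
inputs = tokenizer("ONNX makes models portable.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)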


@mfuntowicz @fxmarty ok for you to merge?

Sounds good!

julien-c changed pull request status to merged

Hello,

We tried to use the included .onnx model for roberta-large to benchmark it against the PyTorch version, but the results were unusual, both in inference timing and in the model outputs. A simple test shows that the last dimension of the .onnx model's last hidden state is 50625 (the tokenizer's vocab size), while for the PyTorch model it is the expected 1024.
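A quick way to see where the extra dimension comes from is to inspect the declared outputs of the graph shipped in the repository (a sketch, assuming the onnx package is installed and the file lives at the path below):

import onnx

# Print the declared outputs of the exported graph (path is illustrative)
onnx_model = onnx.load("./roberta-large/model.onnx")
for out in onnx_model.graph.output:
    dims = [d.dim_param or d.dim_value for d in out.type.tensor_type.shape.dim]
    print(out.name, dims)
# A trailing dimension equal to the vocabulary size would suggest the graph was
# exported with a language-modeling head rather than as a bare encoder.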

When I create a .onnx model using Optimum, everything works as expected. A sample test script is attached.

#!/usr/bin/env python
# coding: utf-8

#
# A simple comparison of the roberta-large model's inference outputs: PyTorch vs ONNX.
#

# Import the needed modules
import time
import numpy as np

import torch
import transformers
from optimum.onnxruntime import ORTModel
import onnxruntime as rt

# Specify the local path to the model checkpoint directory
model_checkpoint = "./roberta-large"
model_name_onnx = "model.onnx"

# Load the PyTorch model from the .safetensors weights
model_pt = transformers.AutoModel.from_pretrained(model_checkpoint, use_safetensors=True)

# Prepare the ONNX model
sess_options = rt.SessionOptions()
sess_options.graph_optimization_level = rt.GraphOptimizationLevel.ORT_DISABLE_ALL
providers = ["CPUExecutionProvider"]
rt_session = rt.InferenceSession(
    f"{model_checkpoint}/{model_name_onnx}",
    providers=providers,
    sess_options=sess_options,
)

# Create a sample input
input_text = "Once upon a time, in a land down under, a drop bear was having a heated discussion with a roo on why he needs to switch to using ONNX instead of PyTorch."

tokenizer = transformers.AutoTokenizer.from_pretrained(model_checkpoint)
encodings_pt   = tokenizer(input_text, return_tensors="pt", max_length=128, truncation=True, padding=False)
encodings_onnx = tokenizer(input_text, return_tensors="np", max_length=128, truncation=True, padding=False)

# Run inference with both frameworks
with torch.no_grad():
    outputs_pt = model_pt(**encodings_pt).last_hidden_state.cpu().numpy()
outputs_onnx = rt_session.run(None, dict(encodings_onnx))[0]

# Observe the mismatched output shapes
print("\n" + "-"*40)
print(f"Output shape PT:       {outputs_pt.shape}")
print(f"Output shape ONNX:     {outputs_onnx.shape}")
print("-"*40 + "\n")

# Create an ONNX file using Optimum
ort_model = ORTModel.from_pretrained(model_checkpoint, export=True)
rt_opt_model_path = f"{model_checkpoint}/opt_onnx"
ort_model.save_pretrained(rt_opt_model_path)

# Load the new ONNX model and run inference on the sample input
rt_session_opt = rt.InferenceSession(
    f"{rt_opt_model_path}/model.onnx",
    providers=providers,
    sess_options=sess_options,
)
outputs_onnx_opt  = rt_session_opt.run(None, dict(encodings_onnx))[0]

# Compare shapes and values to confirm the Optimum-exported ONNX model matches the PyTorch output
print("\n" + "-"*40)
print(f"Output shape PT:       {outputs_pt.shape}")
print(f"Output shape ONNX OPT: {outputs_onnx_opt.shape}")
print(f"All elements close: {np.allclose(outputs_pt[0], outputs_onnx_opt[0], rtol=1e-05, atol=1e-5)}")
print("-"*40 + "\n")

# Compare the argmax position of the first token's output across the three models
print("\n" + "-"*40)
print(f"{np.argmax(outputs_pt[0][0])=}")
print(f"{np.argmax(outputs_onnx_opt[0][0])=}")
print(f"{np.argmax(outputs_onnx[0][0])=}")
print("-"*40 + "\n")

@HrayrMSint please wrap your code inside code fences in your comments


@julien-c I have code-fenced and slightly reformatted the sample code in my previous comment.

Thanks a lot for the repro @HrayrMSint 🙏 - we are going to take a look and get back to you ASAP on this.
