Models

class optimum.onnxruntime.ORTModel

( model: InferenceSession, config: PretrainedConfig, use_io_binding: typing.Optional[bool] = None, model_save_dir: typing.Union[str, pathlib.Path, tempfile.TemporaryDirectory, NoneType] = None, preprocessors: typing.Optional[typing.List] = None, **kwargs )

Base class for implementing models using ONNX Runtime.

The ORTModel implements generic methods for interacting with the Hugging Face Hub, as well as for exporting vanilla transformers models to ONNX using the optimum.exporters.onnx toolchain.
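The export path described above can be sketched as follows. This is a minimal illustration rather than a definitive recipe: the helper name `load_exported_model` and its `model_cls` parameter are hypothetical, introduced only so the snippet can be exercised without downloading anything. The `from_pretrained(..., export=True)` and `save_pretrained(...)` calls are the Optimum methods, and they assume `optimum[onnxruntime]` is installed.

```python
def load_exported_model(model_id, save_dir, model_cls=None):
    """Export a transformers checkpoint to ONNX and save the result locally.

    `model_cls` is a hypothetical hook for this sketch; when omitted, an
    ORTModel subclass from Optimum is used.
    """
    if model_cls is None:
        # Imported lazily so the helper can be defined without Optimum installed.
        from optimum.onnxruntime import ORTModelForSequenceClassification as model_cls

    # export=True asks Optimum to convert the checkpoint to ONNX while loading.
    model = model_cls.from_pretrained(model_id, export=True)

    # Persist the ONNX model and its configuration for later reuse.
    model.save_pretrained(save_dir)
    return model
```

Calling `load_exported_model("distilbert-base-uncased-finetuned-sst-2-english", "onnx_out")` would fetch that checkpoint from the Hub, convert it, and write the result to `onnx_out`; both the model id and the directory name are examples only.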

Class attributes:

model_type (str, optional, defaults to "onnx_model") — The name of the model type to use when registering the ORTModel classes.
auto_model_class (Type, optional, defaults to AutoModel) — The “AutoModel” class to be represented by the current ORTModel class.

Common attributes:

model (ort.InferenceSession) — The ONNX Runtime InferenceSession that is running the model.
config (PretrainedConfig) — The configuration of the model.
use_io_binding (bool, optional, defaults to True) — Whether to use I/O bindings with ONNX Runtime when running on the CUDAExecutionProvider. This can significantly speed up inference, depending on the task.
model_save_dir (Path) — The directory where the model exported to ONNX is saved. By default, if the loaded model is local, the directory where the original model is stored is used. Otherwise, the cache directory is used.
providers (List[str]) — The list of execution providers available to ONNX Runtime.
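As a rough illustration of the common attributes listed above, the sketch below collects them into a plain dictionary. The helper `describe_ort_model` is hypothetical and not part of Optimum. It only assumes that the object passed in exposes `config`, `use_io_binding`, `model_save_dir`, and `providers` as documented.

```python
def describe_ort_model(model):
    """Summarize the common ORTModel attributes in a plain dict.

    Works on any object exposing the documented attribute names.
    """
    return {
        # Name of the configuration class, e.g. a PretrainedConfig subclass.
        "config_type": type(model.config).__name__,
        # Whether I/O binding is enabled for this model.
        "use_io_binding": bool(model.use_io_binding),
        # Directory holding the exported ONNX files.
        "model_save_dir": str(model.model_save_dir),
        # Execution providers reported for the ONNX Runtime session.
        "providers": list(model.providers),
    }
```

With a model loaded through `from_pretrained`, `describe_ort_model(model)` gives a quick check of which execution providers are in use and where the ONNX files were written.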

Generic model classes

ORTModel

class optimum.onnxruntime.ORTModel

can_generate

from_pretrained

load_model

raise_on_numpy_input_io_binding

shared_attributes_init

to

Natural Language Processing

ORTModelForCausalLM

class optimum.onnxruntime.ORTModelForCausalLM

forward

ORTModelForMaskedLM

class optimum.onnxruntime.ORTModelForMaskedLM

forward

ORTModelForSeq2SeqLM

class optimum.onnxruntime.ORTModelForSeq2SeqLM

forward

ORTModelForSequenceClassification

class optimum.onnxruntime.ORTModelForSequenceClassification

forward

ORTModelForTokenClassification

class optimum.onnxruntime.ORTModelForTokenClassification

forward

ORTModelForMultipleChoice

class optimum.onnxruntime.ORTModelForMultipleChoice

forward

ORTModelForQuestionAnswering

class optimum.onnxruntime.ORTModelForQuestionAnswering

forward

Computer vision

ORTModelForImageClassification

class optimum.onnxruntime.ORTModelForImageClassification

forward

ORTModelForSemanticSegmentation

class optimum.onnxruntime.ORTModelForSemanticSegmentation

forward

Audio

ORTModelForAudioClassification

class optimum.onnxruntime.ORTModelForAudioClassification

forward

ORTModelForAudioFrameClassification

class optimum.onnxruntime.ORTModelForAudioFrameClassification

forward

ORTModelForCTC

class optimum.onnxruntime.ORTModelForCTC

forward

ORTModelForSpeechSeq2Seq

class optimum.onnxruntime.ORTModelForSpeechSeq2Seq

forward

ORTModelForAudioXVector

class optimum.onnxruntime.ORTModelForAudioXVector

forward

Multimodal

ORTModelForVision2Seq

class optimum.onnxruntime.ORTModelForVision2Seq

forward

ORTModelForPix2Struct

class optimum.onnxruntime.ORTModelForPix2Struct

forward

Custom Tasks

ORTModelForCustomTasks

class optimum.onnxruntime.ORTModelForCustomTasks

forward

ORTModelForFeatureExtraction

class optimum.onnxruntime.ORTModelForFeatureExtraction

forward

Stable Diffusion

ORTDiffusionPipeline

class optimum.utils.dummy_diffusers_objects.ORTDiffusionPipeline

__call__

ORTStableDiffusionPipeline

class optimum.utils.dummy_diffusers_objects.ORTStableDiffusionPipeline

__call__

ORTStableDiffusionImg2ImgPipeline

class optimum.utils.dummy_diffusers_objects.ORTStableDiffusionImg2ImgPipeline

__call__
