Models

Generic model classes

The following ORT classes are available for instantiating a base model class without a specific head.

ORTModel

class optimum.onnxruntime.ORTModel

( *args, config: PretrainedConfig = None, session: InferenceSession = None, use_io_binding: bool | None = None, model_save_dir: str | Path | TemporaryDirectory | None = None, **kwargs )

Parameters

- config (PretrainedConfig) — The configuration of the model.
- session (~onnxruntime.InferenceSession) — The ONNX Runtime InferenceSession that is running the model.
- use_io_binding (bool, optional, defaults to True) — Whether to use I/O binding with ONNX Runtime when running on the CUDAExecutionProvider. This can significantly speed up inference, depending on the task.
- model_save_dir (Path) — The directory where the model exported to ONNX is saved. By default, if the loaded model is local, the directory containing the original model is used. Otherwise, the cache directory is used.
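The `use_io_binding` parameter mainly matters when the CUDA execution provider is in use. The sketch below shows one way to choose loading options from a list of available providers. The helper name `pick_loading_kwargs` is hypothetical, and passing a `provider` keyword to `from_pretrained` is an assumption about the loading API that should be checked against the installed version.

```python
from typing import Dict, List, Union


def pick_loading_kwargs(available_providers: List[str]) -> Dict[str, Union[str, bool]]:
    """Choose a provider and an I/O binding setting from the available providers.

    Hypothetical helper, not part of Optimum: it only builds a dict of keyword
    arguments that a caller could forward to a model loading call.
    """
    if "CUDAExecutionProvider" in available_providers:
        # I/O binding is described as beneficial with the CUDA provider.
        return {"provider": "CUDAExecutionProvider", "use_io_binding": True}
    # Fall back to CPU and leave I/O binding off.
    return {"provider": "CPUExecutionProvider", "use_io_binding": False}


# The provider list is hard-coded here so the example runs without ONNX Runtime.
kwargs = pick_loading_kwargs(["CUDAExecutionProvider", "CPUExecutionProvider"])
print(kwargs)
```

In a real environment the list would typically come from `onnxruntime.get_available_providers()`, and the resulting dictionary could be unpacked into the loading call of one of the ORT model classes.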

Base class for implementing models using ONNX Runtime.

The ORTModel implements generic methods for interacting with the Hugging Face Hub, as well as for exporting vanilla transformers models to ONNX using the optimum.exporters.onnx toolchain.
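As an illustration of the Hub and export workflow described above, here is a minimal sketch that exports a checkpoint to ONNX, saves it locally, and loads it back. It assumes `optimum` with ONNX Runtime support is installed and uses `ORTModelForSequenceClassification` as the concrete subclass. The checkpoint name and the helper `export_and_reload` are illustrative choices, not part of the library, and the `export=True` keyword should be verified against the installed version.

```python
from pathlib import Path

# Assumption: this checkpoint is only an example; any Hub model supported by
# the ONNX exporter for this task could be substituted.
MODEL_ID = "distilbert-base-uncased-finetuned-sst-2-english"


def export_and_reload(model_id: str, save_dir: Path):
    """Export a transformers checkpoint to ONNX, save it, and load it back."""
    # Imported lazily so this module can be imported without optimum installed.
    from optimum.onnxruntime import ORTModelForSequenceClassification

    # export=True asks Optimum to convert the original weights to ONNX.
    model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
    # Write the ONNX model and its configuration to a local directory.
    model.save_pretrained(save_dir)
    # Load again from the local directory, which now holds an ONNX model.
    return ORTModelForSequenceClassification.from_pretrained(save_dir)
```

Calling `export_and_reload(MODEL_ID, Path("onnx_model"))` downloads the checkpoint, so it needs network access on first use.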

Class attributes:

- model_type (str, optional, defaults to "onnx_model") — The name of the model type to use when registering the ORTModel classes.
- auto_model_class (Type, optional, defaults to AutoModel) — The “AutoModel” class represented by the current ORTModel class.

can_generate

from_pretrained

Natural Language Processing

ORTModelForCausalLM

class optimum.onnxruntime.ORTModelForCausalLM

forward

ORTModelForMaskedLM

class optimum.onnxruntime.ORTModelForMaskedLM

forward

ORTModelForSeq2SeqLM

class optimum.onnxruntime.ORTModelForSeq2SeqLM

forward

ORTModelForSequenceClassification

class optimum.onnxruntime.ORTModelForSequenceClassification

forward

ORTModelForTokenClassification

class optimum.onnxruntime.ORTModelForTokenClassification

forward

ORTModelForMultipleChoice

class optimum.onnxruntime.ORTModelForMultipleChoice

forward

ORTModelForQuestionAnswering

class optimum.onnxruntime.ORTModelForQuestionAnswering

forward

Computer vision

ORTModelForImageClassification

class optimum.onnxruntime.ORTModelForImageClassification

forward

ORTModelForSemanticSegmentation

class optimum.onnxruntime.ORTModelForSemanticSegmentation

forward

Audio

ORTModelForAudioClassification

class optimum.onnxruntime.ORTModelForAudioClassification

forward

ORTModelForAudioFrameClassification

class optimum.onnxruntime.ORTModelForAudioFrameClassification

forward

ORTModelForCTC

class optimum.onnxruntime.ORTModelForCTC

forward

ORTModelForSpeechSeq2Seq

class optimum.onnxruntime.ORTModelForSpeechSeq2Seq

forward

ORTModelForAudioXVector

class optimum.onnxruntime.ORTModelForAudioXVector

forward

Multimodal

ORTModelForVision2Seq

class optimum.onnxruntime.ORTModelForVision2Seq

forward

ORTModelForPix2Struct

class optimum.onnxruntime.ORTModelForPix2Struct

forward

Custom Tasks

ORTModelForCustomTasks

class optimum.onnxruntime.ORTModelForCustomTasks

forward

ORTModelForFeatureExtraction

class optimum.onnxruntime.ORTModelForFeatureExtraction

forward

Stable Diffusion

ORTDiffusionPipeline

class optimum.utils.dummy_diffusers_objects.ORTDiffusionPipeline

__call__

ORTStableDiffusionPipeline

class optimum.utils.dummy_diffusers_objects.ORTStableDiffusionPipeline

__call__

ORTStableDiffusionImg2ImgPipeline

class optimum.utils.dummy_diffusers_objects.ORTStableDiffusionImg2ImgPipeline

__call__

ORTStableDiffusionInpaintPipeline

class optimum.utils.dummy_diffusers_objects.ORTStableDiffusionInpaintPipeline

__call__

ORTStableDiffusionXLPipeline

__call__