RyzenAIModel

class optimum.amd.ryzenai.RyzenAIModel

( model: InferenceSession config: PretrainedConfig vaip_config: Union = None model_save_dir: Union = None preprocessors: Optional = None **kwargs )

Base class for implementing models using ONNX Runtime.

The RyzenAIModel implements generic methods for interacting with the Hugging Face Hub as well as exporting vanilla transformers models to ONNX using the optimum.exporters.onnx toolchain.

Class attributes:

model_type (str, defaults to "onnx_model") — The name of the model type to use when registering the RyzenAIModel classes.
auto_model_class (Type, defaults to AutoModel) — The “AutoModel” class to be represented by the current RyzenAIModel class.

Common attributes:

model (ort.InferenceSession) — The ONNX Runtime InferenceSession that is running the model.
config (PretrainedConfig) — The configuration of the model.
model_save_dir (Path) — The directory where the model exported to ONNX is saved. By default, if the loaded model is local, the directory containing the original model is used. Otherwise, the cache directory is used.
providers (List[str]) — The list of execution providers available to ONNX Runtime.

from_pretrained

( model_id: Union vaip_config: str = None export: bool = False force_download: bool = False use_auth_token: Optional = None cache_dir: Optional = None subfolder: str = '' config: Optional = None local_files_only: bool = False provider: str = 'VitisAIExecutionProvider' session_options: Optional = None provider_options: Optional = None **kwargs ) → RyzenAIModel

Parameters

model_id (Union[str, Path]) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing a model saved using ~OptimizedModel.save_pretrained, e.g., ./my_model_directory/.
from_transformers (bool, defaults to False) — Defines whether the provided model_id contains a vanilla Transformers checkpoint.
force_download (bool, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
use_auth_token (Optional[str], defaults to None) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface).
cache_dir (Optional[str], defaults to None) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
subfolder (str, defaults to "") — In case the relevant files are located inside a subfolder of the model repo either locally or on huggingface.co, you can specify the folder name here.
config (Optional[transformers.PretrainedConfig], defaults to None) — The model configuration.
local_files_only (Optional[bool], defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
trust_remote_code (bool, defaults to False) — Whether or not to allow for custom code defined on the Hub in their own modeling. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
provider (str, defaults to "VitisAIExecutionProvider") — ONNX Runtime provider to use for loading the model. See https://onnxruntime.ai/docs/execution-providers/ for possible providers.
session_options (Optional[onnxruntime.SessionOptions], defaults to None) — ONNX Runtime session options to use for loading the model.
provider_options (Optional[Dict[str, Any]], defaults to None) — Provider option dictionaries corresponding to the provider used. See available options for each provider: https://onnxruntime.ai/docs/api/c/group___global.html .
kwargs (Dict[str, Any]) — Will be passed to the underlying model loading methods.

Parameters for decoder models (RyzenAIForSpeechSeq2Seq)

use_cache (Optional[bool], defaults to True) — Whether or not the past key/values cache should be used.

Returns

RyzenAIModel

The loaded RyzenAIModel model.

Instantiate a pretrained model from a pre-trained model configuration.
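The call described above can be sketched as follows. This is a minimal sketch, assuming the optimum-amd package is installed with Ryzen AI support. The model id and the vaip_config path in the usage comment are hypothetical placeholders, and build_load_kwargs and load_image_classifier are illustrative helpers, not part of the library. The import is deferred into the function so the snippet can be read and imported on a machine without the package.

```python
from pathlib import Path
from typing import Any, Dict, Union


def build_load_kwargs(vaip_config: Union[str, Path], local_only: bool = False) -> Dict[str, Any]:
    """Collect a few of the keyword arguments documented for from_pretrained."""
    return {
        "vaip_config": str(vaip_config),          # documented as a str
        "provider": "VitisAIExecutionProvider",   # documented default, shown explicitly
        "local_files_only": local_only,
    }


def load_image_classifier(model_id: Union[str, Path], vaip_config: Union[str, Path]):
    """Load a model through the from_pretrained class method."""
    # Deferred import: only needed when the function is actually called.
    from optimum.amd.ryzenai import RyzenAIModelForImageClassification

    return RyzenAIModelForImageClassification.from_pretrained(
        model_id, **build_load_kwargs(vaip_config)
    )


# Hypothetical usage (both arguments are placeholders):
# model = load_image_classifier("my-org/my-onnx-model", "vaip_config.json")
```

Passing provider explicitly is optional here, since it repeats the documented default. It is spelled out only to make the execution provider visible at the call site.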

save_pretrained

( save_directory: Union push_to_hub: bool = False **kwargs )

Parameters

save_directory (Union[str, os.PathLike]) — Directory to which to save. Will be created if it doesn’t exist.
push_to_hub (bool, optional, defaults to False) — Whether or not to push your model to the Hugging Face model hub after saving it.

Using push_to_hub=True will synchronize the repository you are pushing to with save_directory, which requires save_directory to be a local clone of the repo you are pushing to if it’s an existing folder. Pass along temp_dir=True to use a temporary directory instead.

Saves a model and its configuration file to a directory, so that it can be re-loaded using the from_pretrained class method.

reshape

( model_path: Union input_shape_dict: Dict output_shape_dict: Dict ) → Union[str, Path]

Parameters

model_path (Union[str, Path]) — Path to the model.
input_shape_dict (Dict[str, Tuple[int]]) — Input shapes for the model.
output_shape_dict (Dict[str, Tuple[int]]) — Output shapes for the model.

Returns

Union[str, Path]

Path to the model after updating the input shapes.

Raises

ValueError — If the model provided has dynamic axes in input/output and no input/output shape is provided.

Propagates the given input shapes on the model’s layers, fixing the input shapes of the model.
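The shape dictionaries and the documented ValueError condition can be illustrated with a small pure-Python sketch. It is not the library's implementation: check_static_shapes and has_dynamic_axes are hypothetical helpers, and the tensor name "pixel_values" with its shape is an assumed example of an image model with a dynamic batch axis.

```python
from typing import Dict, Optional, Sequence, Tuple, Union

Dim = Union[int, str, None]  # a str or None stands for a dynamic (symbolic) axis


def has_dynamic_axes(shape: Sequence[Dim]) -> bool:
    """Return True if any dimension is not a fixed positive integer."""
    return any(not isinstance(d, int) or d <= 0 for d in shape)


def check_static_shapes(
    declared: Dict[str, Sequence[Dim]],
    provided: Optional[Dict[str, Tuple[int, ...]]],
) -> Dict[str, Tuple[int, ...]]:
    """Resolve every tensor to a fixed shape, or raise ValueError.

    Mirrors the documented error condition only: a tensor with dynamic
    axes and no user-provided shape cannot be fixed.
    """
    provided = provided or {}
    resolved = {}
    for name, shape in declared.items():
        if name in provided:
            resolved[name] = tuple(provided[name])
        elif has_dynamic_axes(shape):
            raise ValueError(f"'{name}' has dynamic axes {list(shape)} and no shape was provided")
        else:
            resolved[name] = tuple(shape)
    return resolved


# Assumed example: one input whose batch axis is dynamic.
declared_inputs = {"pixel_values": ["batch", 3, 224, 224]}
input_shape_dict = {"pixel_values": (1, 3, 224, 224)}
fixed = check_static_shapes(declared_inputs, input_shape_dict)
```

Dictionaries of this form, mapping tensor names to tuples of integers, are what the input_shape_dict and output_shape_dict parameters of reshape expect according to the parameter list above.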

class optimum.amd.ryzenai.RyzenAIModelForImageClassification

( model: InferenceSession config: PretrainedConfig vaip_config: Union = None model_save_dir: Union = None preprocessors: Optional = None **kwargs )

Quantization

class optimum.amd.ryzenai.RyzenAIOnnxQuantizer

( onnx_model_path: Path config: Optional = None )

Handles the RyzenAI quantization process for models shared on huggingface.co/models.

from_pretrained

( model_or_path: Union file_name: Optional = None )

Parameters

model_or_path (Union[str, Path]) — Can be either:
- A path to a saved exported ONNX Intermediate Representation (IR) model, e.g., ./my_model_directory/.
file_name (Optional[str], defaults to None) — Overwrites the default model file name from "model.onnx" to file_name. This allows you to load different model files from the same repository or directory.

Instantiates a RyzenAIOnnxQuantizer from an ONNX model file.

get_calibration_dataset

( dataset_name: str num_samples: int = 100 dataset_config_name: Optional = None dataset_split: Optional = None preprocess_function: Optional = None preprocess_batch: bool = True seed: int = 2016 use_auth_token: bool = False )

Parameters

dataset_name (str) — The dataset repository name on the Hugging Face Hub or path to a local directory containing data files to load to use for the calibration step.
num_samples (int, defaults to 100) — The maximum number of samples composing the calibration dataset.
dataset_config_name (Optional[str], defaults to None) — The name of the dataset configuration.
dataset_split (Optional[str], defaults to None) — Which split of the dataset to use to perform the calibration step.
preprocess_function (Optional[Callable], defaults to None) — Processing function to apply to each example after loading dataset.
preprocess_batch (bool, defaults to True) — Whether the preprocess_function should be batched.
seed (int, defaults to 2016) — The random seed to use when shuffling the calibration dataset.
use_auth_token (bool, defaults to False) — Whether to use the token generated when running transformers-cli login (necessary for some datasets like ImageNet).

Creates the calibration datasets.Dataset to use for the post-training static quantization calibration step.
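The difference between a batched and a per-example preprocess_function can be sketched in plain Python. This assumes the function is applied with the usual datasets.Dataset.map conventions, where a batched function receives a dictionary of lists and a non-batched one receives a single example. The "pixels" column and the scaling to [0, 1] are hypothetical, chosen only to keep the example small.

```python
from typing import Any, Dict, List


def preprocess_batch_fn(batch: Dict[str, List[Any]]) -> Dict[str, List[Any]]:
    """Batched variant (preprocess_batch=True): each column is a list of examples."""
    return {
        "pixel_values": [[p / 255.0 for p in pixels] for pixels in batch["pixels"]]
    }


def preprocess_example_fn(example: Dict[str, Any]) -> Dict[str, Any]:
    """Per-example variant (preprocess_batch=False): each column is one value."""
    return {"pixel_values": [p / 255.0 for p in example["pixels"]]}


# Two tiny hypothetical "images" of two 8-bit pixels each.
batch = {"pixels": [[0, 255], [51, 102]]}
batched_out = preprocess_batch_fn(batch)
single_out = preprocess_example_fn({"pixels": [0, 255]})
```

Both variants produce the same values for the same example. The batched form simply processes several examples per call.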

quantize

( quantization_config: QuantizationConfig dataset: Dataset save_dir: Union batch_size: int = 1 file_suffix: Optional = 'quantized' )

Parameters

quantization_config (QuantizationConfig) — The configuration containing the parameters related to quantization.
save_dir (Union[str, Path]) — The directory where the quantized model should be saved.
file_suffix (Optional[str], defaults to "quantized") — The file_suffix used to save the quantized model.
calibration_tensors_range (Optional[Dict[str, Tuple[float, float]]], defaults to None) — The dictionary mapping the nodes name to their quantization ranges, used and required only when applying static quantization.

Quantizes a model given the optimization specifications defined in quantization_config.
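The three quantizer methods documented above can be chained as in the following sketch. It assumes the optimum-amd package is installed and relies only on the signatures shown in this section. quantize_onnx_model is a hypothetical wrapper, QuantizationConfig() is constructed with its signature defaults as an assumption, and the arguments in the usage comment are placeholders. The import is deferred so the snippet can be imported without the package.

```python
from pathlib import Path
from typing import Callable, Optional, Union


def quantize_onnx_model(
    model_dir: Union[str, Path],
    dataset_name: str,
    save_dir: Union[str, Path],
    preprocess_function: Optional[Callable] = None,
    num_samples: int = 100,
    dataset_split: Optional[str] = None,
) -> Path:
    """Load an ONNX model, build a calibration dataset, and quantize it."""
    # Deferred import: only needed when the function is actually called.
    from optimum.amd.ryzenai import QuantizationConfig, RyzenAIOnnxQuantizer

    # 1. Create the quantizer from a directory containing model.onnx.
    quantizer = RyzenAIOnnxQuantizer.from_pretrained(model_dir)

    # 2. Build the calibration dataset used for static quantization.
    calibration_dataset = quantizer.get_calibration_dataset(
        dataset_name,
        num_samples=num_samples,
        dataset_split=dataset_split,
        preprocess_function=preprocess_function,
    )

    # 3. Quantize and write the result to save_dir.
    quantizer.quantize(
        quantization_config=QuantizationConfig(),
        dataset=calibration_dataset,
        save_dir=save_dir,
    )
    return Path(save_dir)


# Hypothetical usage (all arguments are placeholders):
# quantize_onnx_model("./my_model_directory/", "my-org/my-dataset", "./quantized/")
```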

Configuration

class optimum.amd.ryzenai.QuantizationConfig

( format: QuantFormat = <QuantFormat.QDQ: 1> calibration_method: CalibrationMethod = <PowerOfTwoMethod.MinMSE: 1> activations_dtype: QuantType = <QuantType.QInt8: 0> activations_symmetric: bool = True weights_dtype: QuantType = <QuantType.QInt8: 0> weights_symmetric: bool = True enable_dpu: bool = True )

Parameters

is_static (bool) — Whether to apply static quantization or dynamic quantization.
format (QuantFormat) — Targeted RyzenAI quantization representation format. For the Operator Oriented (QOperator) format, all the quantized operators have their own ONNX definitions. For the Tensor Oriented (QDQ) format, the model is quantized by inserting QuantizeLinear / DeQuantizeLinear operators.
calibration_method (CalibrationMethod) — The method chosen to calculate the activations quantization parameters using the calibration dataset.
activations_dtype (QuantType, defaults to QuantType.QInt8) — The quantization data types to use for the activations.
activations_symmetric (bool, defaults to True) — Whether to apply symmetric quantization on the activations.
weights_dtype (QuantType, defaults to QuantType.QInt8) — The quantization data types to use for the weights.
weights_symmetric (bool, defaults to True) — Whether to apply symmetric quantization on the weights.
enable_dpu (bool, defaults to True) — Determines whether to generate a quantized model that is suitable for the DPU. If set to True, the quantization process will create a model that is optimized for DPU computations.

QuantizationConfig is the configuration class handling all the RyzenAI quantization parameters.
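What the activations_symmetric and weights_symmetric flags mean numerically can be shown with a generic signed 8-bit affine quantization sketch. This is textbook arithmetic under stated assumptions, not the RyzenAI quantizer and not the power-of-two calibration method named in the signature. quant_params, quantize, and dequantize are hypothetical helpers, and the activation values are made up.

```python
from typing import List, Tuple


def quant_params(values: List[float], symmetric: bool,
                 qmin: int = -128, qmax: int = 127) -> Tuple[float, int]:
    """Return (scale, zero_point) for signed 8-bit quantization of values."""
    # The representable range must include 0 so that 0.0 maps to an integer.
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    if symmetric:
        # Range centered on zero, zero point fixed at 0.
        scale = max(abs(lo), abs(hi)) / qmax
        return (scale or 1.0), 0
    # Asymmetric: map [lo, hi] onto the full integer range.
    scale = ((hi - lo) / (qmax - qmin)) or 1.0
    zero_point = round(qmin - lo / scale)
    return scale, zero_point


def quantize(values: List[float], scale: float, zero_point: int,
             qmin: int = -128, qmax: int = 127) -> List[int]:
    """Round to the nearest integer level and clamp to [qmin, qmax]."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(q: List[int], scale: float, zero_point: int) -> List[float]:
    """Map integer levels back to approximate real values."""
    return [(x - zero_point) * scale for x in q]


# Made-up non-negative activations, e.g. after a ReLU.
acts = [0.0, 0.5, 1.0, 2.0]
sym_scale, sym_zp = quant_params(acts, symmetric=True)
asym_scale, asym_zp = quant_params(acts, symmetric=False)
asym_q = quantize(acts, asym_scale, asym_zp)
```

For one-sided data such as these activations, the symmetric variant leaves the negative half of the integer range unused, so its step size is coarser. The asymmetric variant uses the full range at the cost of a nonzero zero point.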

class optimum.amd.ryzenai.RyzenAIConfig

( opset: Optional = None quantization: Optional = None **kwargs )

Parameters

opset (Optional[int], defaults to None) — ONNX opset version to export the model with.
quantization (Optional[QuantizationConfig], defaults to None) — Specifies a configuration to quantize the ONNX model.

RyzenAIConfig is the configuration class handling all the VitisAI parameters related to the ONNX IR model export, and quantization parameters.
