RyzenAIModel
class optimum.amd.ryzenai.RyzenAIModel
( model: InferenceSession config: PretrainedConfig vaip_config: Union = None model_save_dir: Union = None preprocessors: Optional = None **kwargs )
Base class for implementing models using ONNX Runtime.
The RyzenAIModel implements generic methods for interacting with the Hugging Face Hub as well as exporting vanilla
transformers models to ONNX using the optimum.exporters.onnx toolchain.
Class attributes:
- model_type (str, defaults to "onnx_model") — The name of the model type to use when registering the RyzenAIModel classes.
- auto_model_class (Type, defaults to AutoModel) — The “AutoModel” class to be represented by the current RyzenAIModel class.
Common attributes:
- model (ort.InferenceSession) — The ONNX Runtime InferenceSession that is running the model.
- config (PretrainedConfig) — The configuration of the model.
- model_save_dir (Path) — The directory where the model exported to ONNX is saved. By default, if the loaded model is local, the directory of the original model is used. Otherwise, the cache directory is used.
- providers (List[str]) — The list of execution providers available to ONNX Runtime.
from_pretrained
( model_id: Union vaip_config: str = None export: bool = False force_download: bool = False use_auth_token: Optional = None cache_dir: Optional = None subfolder: str = '' config: Optional = None local_files_only: bool = False provider: str = 'VitisAIExecutionProvider' session_options: Optional = None provider_options: Optional = None **kwargs ) → RyzenAIModel
Parameters
- model_id (Union[str, Path]) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
  - A path to a directory containing a model saved using OptimizedModel.save_pretrained, e.g., ./my_model_directory/.
- from_transformers (bool, defaults to False) — Defines whether the provided model_id contains a vanilla Transformers checkpoint.
- force_download (bool, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- use_auth_token (Optional[str], defaults to None) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface).
- cache_dir (Optional[str], defaults to None) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- subfolder (str, defaults to "") — In case the relevant files are located inside a subfolder of the model repo either locally or on huggingface.co, you can specify the folder name here.
- config (Optional[transformers.PretrainedConfig], defaults to None) — The model configuration.
- local_files_only (Optional[bool], defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
- trust_remote_code (bool, defaults to False) — Whether or not to allow for custom code defined on the Hub in their own modeling. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- provider (str, defaults to "VitisAIExecutionProvider") — ONNX Runtime provider to use for loading the model. See https://onnxruntime.ai/docs/execution-providers/ for possible providers.
- session_options (Optional[onnxruntime.SessionOptions], defaults to None) — ONNX Runtime session options to use for loading the model.
- provider_options (Optional[Dict[str, Any]], defaults to None) — Provider option dictionaries corresponding to the provider used. See available options for each provider: https://onnxruntime.ai/docs/api/c/group___global.html .
- kwargs (Dict[str, Any]) — Will be passed to the underlying model loading methods.
Parameters for decoder models (RyzenAIForSpeechSeq2Seq)
- use_cache (Optional[bool], defaults to True) — Whether or not past key/values cache should be used.
Returns
RyzenAIModel
The loaded RyzenAIModel model.
Instantiate a pretrained model from a pre-trained model configuration.
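A minimal loading sketch based on the signature above. The model id and the configuration file name in the commented usage line are placeholders, not real artifacts, and vaip_config is assumed to be a path to a Vitis AI configuration file. The import is deferred so that the helper can be defined on a machine without optimum-amd installed.

```python
from pathlib import Path
from typing import Union


def load_ryzenai_model(model_id: Union[str, Path], vaip_config: str):
    """Load an ONNX image classification model through from_pretrained."""
    # Deferred import: needs the optimum-amd package and a Ryzen AI setup.
    from optimum.amd.ryzenai import RyzenAIModelForImageClassification

    return RyzenAIModelForImageClassification.from_pretrained(
        model_id,
        vaip_config=vaip_config,  # assumption: path to a Vitis AI config file
        provider="VitisAIExecutionProvider",  # the documented default
    )


# Hypothetical usage; both arguments are placeholders.
# model = load_ryzenai_model("your-org/your-onnx-model", "vaip_config.json")
```

Passing a local directory instead of a Hub id follows the second form of model_id described in the parameter list.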
save_pretrained
( save_directory: Union push_to_hub: bool = False **kwargs )
Parameters
- save_directory (Union[str, os.PathLike]) — Directory to which to save. Will be created if it doesn’t exist.
- push_to_hub (bool, optional, defaults to False) — Whether or not to push your model to the Hugging Face model hub after saving it. Using push_to_hub=True will synchronize the repository you are pushing to with save_directory, which requires save_directory to be a local clone of the repo you are pushing to if it’s an existing folder. Pass along temp_dir=True to use a temporary directory instead.
Saves a model and its configuration file to a directory, so that it can be re-loaded using the from_pretrained class method.
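The save and reload round trip can be sketched as follows. The helper name save_and_reload is hypothetical, and model stands for any object exposing the save_pretrained and from_pretrained methods documented here.

```python
from pathlib import Path


def save_and_reload(model, save_directory: str, vaip_config: str):
    """Save `model` to a local directory, then load it back from there."""
    target = Path(save_directory)
    # Per the parameter description, the directory is created if missing.
    model.save_pretrained(target)
    # Reload through the same class that produced `model`.
    return type(model).from_pretrained(target, vaip_config=vaip_config)
```

Reloading through type(model) keeps the task-specific subclass, for example RyzenAIModelForImageClassification, rather than falling back to the base class.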
reshape
( model_path: Union input_shape_dict: Dict output_shape_dict: Dict ) → Union[str, Path]
Parameters
- model_path (Union[str, Path]) — Path to the model.
- input_shape_dict (Dict[str, Tuple[int]]) — Input shapes for the model.
- output_shape_dict (Dict[str, Tuple[int]]) — Output shapes for the model.
Returns
Union[str, Path]
Path to the model after updating the input shapes.
Raises
ValueError
If the model provided has dynamic axes in input/output and no input/output shape is provided.
Propagates the given input shapes on the model’s layers, fixing the input shapes of the model.
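The condition under which ValueError is raised can be illustrated in plain Python. This is not the library's implementation: resolve_static_shapes is a hypothetical helper, and the convention that a string or None marks a dynamic axis is an assumption made for the sketch.

```python
from typing import Dict, Optional, Sequence, Tuple, Union

Dim = Union[int, str, None]  # assumption: a str or None marks a dynamic axis


def resolve_static_shapes(
    declared: Dict[str, Sequence[Dim]],
    overrides: Optional[Dict[str, Tuple[int, ...]]] = None,
) -> Dict[str, Tuple[int, ...]]:
    """Return fully static shapes, raising ValueError when one is missing."""
    overrides = overrides or {}
    resolved = {}
    for name, shape in declared.items():
        if name in overrides:
            # A user-provided shape always wins.
            resolved[name] = tuple(overrides[name])
        elif all(isinstance(dim, int) for dim in shape):
            # Already static, nothing to fix.
            resolved[name] = tuple(shape)
        else:
            raise ValueError(
                f"'{name}' has dynamic axes {tuple(shape)} and no shape was provided"
            )
    return resolved


# A dynamic batch axis is fixed to 1 by the override.
fixed = resolve_static_shapes(
    {"pixel_values": ["batch", 3, 224, 224]},
    {"pixel_values": (1, 3, 224, 224)},
)
```

The same check would apply to output_shape_dict; a tensor whose declared shape is already static needs no entry in the dictionary under this sketch.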
class optimum.amd.ryzenai.RyzenAIModelForImageClassification
( model: InferenceSession config: PretrainedConfig vaip_config: Union = None model_save_dir: Union = None preprocessors: Optional = None **kwargs )
Quantization
class optimum.amd.ryzenai.RyzenAIOnnxQuantizer
( onnx_model_path: Path config: Optional = None )
Handles the RyzenAI quantization process for models shared on huggingface.co/models.
from_pretrained
( model_or_path: Union file_name: Optional = None )
Parameters
- model_or_path (Union[str, Path]) — Can be either:
  - A path to a saved exported ONNX Intermediate Representation (IR) model, e.g., ./my_model_directory/.
- file_name (Optional[str], defaults to None) — Overwrites the default model file name from "model.onnx" to file_name. This allows you to load different model files from the same repository or directory.
Instantiates a RyzenAIOnnxQuantizer from an ONNX model file.
get_calibration_dataset
( dataset_name: str num_samples: int = 100 dataset_config_name: Optional = None dataset_split: Optional = None preprocess_function: Optional = None preprocess_batch: bool = True seed: int = 2016 use_auth_token: bool = False )
Parameters
- dataset_name (str) — The dataset repository name on the Hugging Face Hub or path to a local directory containing data files to load to use for the calibration step.
- num_samples (int, defaults to 100) — The maximum number of samples composing the calibration dataset.
- dataset_config_name (Optional[str], defaults to None) — The name of the dataset configuration.
- dataset_split (Optional[str], defaults to None) — Which split of the dataset to use to perform the calibration step.
- preprocess_function (Optional[Callable], defaults to None) — Processing function to apply to each example after loading dataset.
- preprocess_batch (bool, defaults to True) — Whether the preprocess_function should be batched.
- seed (int, defaults to 2016) — The random seed to use when shuffling the calibration dataset.
- use_auth_token (bool, defaults to False) — Whether to use the token generated when running transformers-cli login (necessary for some datasets like ImageNet).
Creates the calibration datasets.Dataset to use for the post-training static quantization calibration step.
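A sketch of a batched preprocess_function. It assumes that, with preprocess_batch=True, the function receives a dictionary mapping column names to lists of values, as in the batched mode of datasets.Dataset.map. The column names "image" and "pixel_values" are hypothetical, and images are represented as flat lists of 8-bit pixel values to keep the example free of dependencies.

```python
from typing import Dict, List


def preprocess_fn(batch: Dict[str, List]) -> Dict[str, List]:
    """Scale 8-bit pixel values to [0, 1] for every image in the batch."""
    # "image" and "pixel_values" are hypothetical column names.
    return {
        "pixel_values": [
            [pixel / 255.0 for pixel in image] for image in batch["image"]
        ]
    }


# Two tiny "images" standing in for real calibration samples.
example = preprocess_fn({"image": [[0, 255], [51]]})
```

A real pipeline would typically resize and normalize with the preprocessing expected by the model; the function would then be passed as preprocess_function=preprocess_fn.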
quantize
( quantization_config: QuantizationConfig dataset: Dataset save_dir: Union batch_size: int = 1 file_suffix: Optional = 'quantized' )
Parameters
- quantization_config (QuantizationConfig) — The configuration containing the parameters related to quantization.
- save_dir (Union[str, Path]) — The directory where the quantized model should be saved.
- file_suffix (Optional[str], defaults to "quantized") — The file_suffix used to save the quantized model.
- calibration_tensors_range (Optional[Dict[str, Tuple[float, float]]], defaults to None) — The dictionary mapping the nodes name to their quantization ranges, used and required only when applying static quantization.
Quantizes a model given the optimization specifications defined in quantization_config.
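The three quantizer methods documented here can be chained as in the sketch below. It uses only the signatures shown on this page. The helper name quantize_onnx_model, the "train" split, and returning save_dir are assumptions made for illustration, and the import is deferred because it requires optimum-amd.

```python
from typing import Callable, Optional


def quantize_onnx_model(
    model_dir: str,
    dataset_name: str,
    save_dir: str,
    preprocess_function: Optional[Callable] = None,
) -> str:
    """Calibrate and quantize the ONNX model stored in `model_dir`."""
    # Deferred import: needs the optimum-amd package.
    from optimum.amd.ryzenai import QuantizationConfig, RyzenAIOnnxQuantizer

    # Looks for "model.onnx" unless file_name is given.
    quantizer = RyzenAIOnnxQuantizer.from_pretrained(model_dir)

    calibration_dataset = quantizer.get_calibration_dataset(
        dataset_name,
        num_samples=100,
        dataset_split="train",  # assumption: the dataset has a "train" split
        preprocess_function=preprocess_function,
        preprocess_batch=True,
    )

    # All fields keep the defaults shown in the QuantizationConfig signature.
    quantization_config = QuantizationConfig()

    quantizer.quantize(
        quantization_config=quantization_config,
        dataset=calibration_dataset,
        save_dir=save_dir,
    )
    return save_dir
```

The quantized file is written under save_dir with the "quantized" suffix unless file_suffix is overridden.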
Configuration
class optimum.amd.ryzenai.QuantizationConfig
( format: QuantFormat = <QuantFormat.QDQ: 1> calibration_method: CalibrationMethod = <PowerOfTwoMethod.MinMSE: 1> activations_dtype: QuantType = <QuantType.QInt8: 0> activations_symmetric: bool = True weights_dtype: QuantType = <QuantType.QInt8: 0> weights_symmetric: bool = True enable_dpu: bool = True )
Parameters
- is_static (bool) — Whether to apply static quantization or dynamic quantization.
- format (QuantFormat) — Targeted RyzenAI quantization representation format. For the Operator Oriented (QOperator) format, all the quantized operators have their own ONNX definitions. For the Tensor Oriented (QDQ) format, the model is quantized by inserting QuantizeLinear / DeQuantizeLinear operators.
- calibration_method (CalibrationMethod) — The method chosen to calculate the activations quantization parameters using the calibration dataset.
- activations_dtype (QuantType, defaults to QuantType.QInt8) — The quantization data types to use for the activations.
- activations_symmetric (bool, defaults to True) — Whether to apply symmetric quantization on the activations.
- weights_dtype (QuantType, defaults to QuantType.QInt8) — The quantization data types to use for the weights.
- weights_symmetric (bool, defaults to True) — Whether to apply symmetric quantization on the weights.
- enable_dpu (bool, defaults to True) — Determines whether to generate a quantized model that is suitable for the DPU. If set to True, the quantization process will create a model that is optimized for DPU computations.
QuantizationConfig is the configuration class handling all the RyzenAI quantization parameters.
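The difference between symmetric and asymmetric quantization can be shown with the textbook affine scheme. This is a generic illustration, not the RyzenAI quantizer's algorithm: both function names are hypothetical, and the sketch does not model the power-of-two calibration method that appears as the default in the signature above.

```python
from typing import List, Tuple


def quantize_symmetric_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric scheme: real 0.0 maps to integer 0, range is [-127, 127]."""
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / 127.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def quantize_asymmetric_uint8(values: List[float]) -> Tuple[List[int], float, int]:
    """Asymmetric scheme: a zero point shifts the range onto [0, 255]."""
    # The representable range always includes 0.0.
    low = min(min(values), 0.0)
    high = max(max(values), 0.0)
    scale = (high - low) / 255.0 or 1.0
    zero_point = round(-low / scale)
    quantized = [max(0, min(255, round(v / scale) + zero_point)) for v in values]
    return quantized, scale, zero_point


sym_q, sym_scale = quantize_symmetric_int8([-4.0, 0.0, 1.0, 4.0])
asym_q, asym_scale, asym_zp = quantize_asymmetric_uint8([-1.0, 0.0, 3.0])
```

With the symmetric variant no zero point has to be stored; the asymmetric variant uses the full integer range when the data is skewed toward one sign.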
class optimum.amd.ryzenai.RyzenAIConfig
( opset: Optional = None quantization: Optional = None **kwargs )
RyzenAIConfig is the configuration class handling all the VitisAI parameters related to the ONNX IR model export and to quantization.