Optimum documentation

Quantization

You are viewing the main version, which requires installation from source. For a regular pip install, check out the latest stable version (v1.23.3).

FuriosaAIQuantizer

class optimum.furiosa.FuriosaAIQuantizer

( model_path: Path, config: Optional = None )

Handles the FuriosaAI quantization process for models shared on huggingface.co/models.

compute_ranges

( )

Computes the quantization ranges.

fit

( dataset: Dataset, calibration_config: CalibrationConfig, batch_size: int = 1 )

Parameters

  • dataset (Dataset) — The dataset to use when performing the calibration step.
  • calibration_config (CalibrationConfig) — The configuration containing the parameters related to the calibration step.
  • batch_size (int, optional, defaults to 1) — The batch size to use when collecting the quantization ranges values.

Performs the calibration step and computes the quantization ranges.

from_pretrained

( model_or_path: Union, file_name: Optional = None )

Parameters

  • model_or_path (Union[FuriosaAIModel, str, Path]) — Can be either:
    • A path to a directory containing an exported ONNX Intermediate Representation (IR) model, e.g., ./my_model_directory/.
    • Or a FuriosaAIModelForXX class, e.g., FuriosaAIModelForImageClassification.
  • file_name (Optional[str], optional) — Overrides the default model file name, "model.onnx", with file_name. This allows you to load different model files from the same repository or directory.

Instantiates a FuriosaAIQuantizer from a model path.
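
The interaction between a directory path and file_name can be illustrated with a small sketch. The helper below, resolve_model_file, is hypothetical and is not part of optimum.furiosa; it only assumes the behaviour described in the parameter list above, namely that "model.onnx" is used unless file_name is given.

```python
from pathlib import Path
from typing import Optional, Union

# Default file name taken from the file_name parameter description.
DEFAULT_FILE_NAME = "model.onnx"


def resolve_model_file(
    model_dir: Union[str, Path], file_name: Optional[str] = None
) -> Path:
    """Hypothetical helper: choose which ONNX file inside model_dir to load."""
    # Fall back to the default name when no override is supplied.
    return Path(model_dir) / (file_name or DEFAULT_FILE_NAME)


# Default file in the directory.
print(resolve_model_file("./my_model_directory/"))
# A different file from the same directory (the name is made up for illustration).
print(resolve_model_file("./my_model_directory/", file_name="model_optimized.onnx"))
```

With such a rule, several exported variants of a model can live side by side in one directory and be selected by name.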

get_calibration_dataset

( dataset_name: str, num_samples: int = 100, dataset_config_name: Optional = None, dataset_split: Optional = None, preprocess_function: Optional = None, preprocess_batch: bool = True, seed: int = 2016, use_auth_token: bool = False )

Parameters

  • dataset_name (str) — The dataset repository name on the Hugging Face Hub or path to a local directory containing data files to load to use for the calibration step.
  • num_samples (int, optional, defaults to 100) — The maximum number of samples composing the calibration dataset.
  • dataset_config_name (Optional[str], optional) — The name of the dataset configuration.
  • dataset_split (Optional[str], optional) — Which split of the dataset to use to perform the calibration step.
  • preprocess_function (Optional[Callable], optional) — Processing function to apply to each example after loading the dataset.
  • preprocess_batch (bool, optional, defaults to True) — Whether the preprocess_function should be batched.
  • seed (int, optional, defaults to 2016) — The random seed to use when shuffling the calibration dataset.
  • use_auth_token (bool, optional, defaults to False) — Whether to use the token generated when running transformers-cli login (necessary for some datasets like ImageNet).

Creates the calibration datasets.Dataset to use for the post-training static quantization calibration step.
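
How num_samples, seed, preprocess_function and preprocess_batch might fit together can be sketched in plain Python. This is a simplified illustration operating on an in-memory list, not the library's code: sample_calibration_examples is a hypothetical name, and the order of operations (shuffle with the seed, keep at most num_samples, then preprocess) is an assumption based on the parameter descriptions above.

```python
import random
from typing import Any, Callable, List, Optional, Sequence


def sample_calibration_examples(
    examples: Sequence[Any],
    num_samples: int = 100,
    preprocess_function: Optional[Callable] = None,
    preprocess_batch: bool = True,
    seed: int = 2016,
) -> List[Any]:
    """Hypothetical sketch: draw a reproducible calibration subset."""
    # num_samples is a maximum, so never ask for more than is available.
    num_samples = min(num_samples, len(examples))
    # Shuffle indices with a fixed seed so the subset is reproducible.
    indices = list(range(len(examples)))
    random.Random(seed).shuffle(indices)
    selected = [examples[i] for i in indices[:num_samples]]
    if preprocess_function is None:
        return selected
    if preprocess_batch:
        # Batched mode: the function receives all selected examples at once.
        return list(preprocess_function(selected))
    # Non-batched mode: the function is applied one example at a time.
    return [preprocess_function(example) for example in selected]


texts = [f"sample {i}" for i in range(1000)]
subset = sample_calibration_examples(
    texts, num_samples=8, preprocess_function=str.upper, preprocess_batch=False
)
print(len(subset))
```

Fixing the seed means two runs select the same examples, which keeps calibration results comparable between experiments.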

partial_fit

( dataset: Dataset, calibration_config: CalibrationConfig, batch_size: int = 1 )

Parameters

  • dataset (Dataset) — The dataset to use when performing the calibration step.
  • calibration_config (CalibrationConfig) — The configuration containing the parameters related to the calibration step.
  • batch_size (int, optional, defaults to 1) — The batch size to use when collecting the quantization ranges values.

Performs the calibration step and collects the quantization ranges without computing them.
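
The split between collecting statistics (partial_fit), finalizing them (compute_ranges) and doing both at once (fit) can be illustrated with a toy collector. The class below is a stand-in written for this page, not the FuriosaAI calibrator: it assumes a simple min/max method, whereas the actual method is determined by calibration_config, and the tensor name "conv1_out" is made up.

```python
from typing import Dict, Iterable, List, Tuple

Batch = Dict[str, List[float]]  # tensor name -> observed activation values


class MinMaxRangeCollector:
    """Toy stand-in for a calibrator (assumes min/max calibration)."""

    def __init__(self) -> None:
        self._observed: Dict[str, Tuple[float, float]] = {}

    def partial_fit(self, batches: Iterable[Batch]) -> None:
        """Collect statistics only: update the running min/max per tensor."""
        for batch in batches:
            for name, values in batch.items():
                low, high = min(values), max(values)
                if name in self._observed:
                    prev_low, prev_high = self._observed[name]
                    low, high = min(low, prev_low), max(high, prev_high)
                self._observed[name] = (low, high)

    def compute_ranges(self) -> Dict[str, Tuple[float, float]]:
        """Turn everything collected so far into final ranges."""
        return dict(self._observed)

    def fit(self, batches: Iterable[Batch]) -> Dict[str, Tuple[float, float]]:
        """Collect and compute in one call."""
        self.partial_fit(batches)
        return self.compute_ranges()


# Calibration data arriving in two shards, e.g. when it does not fit in memory.
shard_1 = [{"conv1_out": [-0.5, 0.2, 1.0]}]
shard_2 = [{"conv1_out": [-2.0, 0.3]}]

collector = MinMaxRangeCollector()
collector.partial_fit(shard_1)
collector.partial_fit(shard_2)
ranges = collector.compute_ranges()
print(ranges)
```

In this sketch, calling partial_fit once per shard and compute_ranges at the end gives the same result as a single fit over all the data.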

quantize

( quantization_config: QuantizationConfig, save_dir: Union, file_suffix: Optional = 'quantized', calibration_tensors_range: Optional = None )

Parameters

  • quantization_config (QuantizationConfig) — The configuration containing the parameters related to quantization.
  • save_dir (Union[str, Path]) — The directory where the quantized model should be saved.
  • file_suffix (Optional[str], optional, defaults to "quantized") — The suffix used in the file name of the saved quantized model.
  • calibration_tensors_range (Optional[Dict[NodeName, Tuple[float, float]]], optional) — The dictionary mapping node names to their quantization ranges. Used, and required, only when applying static quantization.

Quantizes a model given the optimization specifications defined in quantization_config.
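
To show why static quantization needs a (float, float) range per node, the sketch below converts such ranges into int8 scale and zero-point values using the generic asymmetric affine scheme. This is a textbook formulation chosen for illustration; it is an assumption that it resembles what happens inside quantize, and the exact scheme used by FuriosaAI may differ. The function name affine_int8_params and the node names are hypothetical.

```python
from typing import Dict, Tuple


def affine_int8_params(
    rmin: float, rmax: float, qmin: int = -128, qmax: int = 127
) -> Tuple[float, int]:
    """Map a calibrated float range to (scale, zero_point) for int8."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)
    if rmax == rmin:
        # Degenerate range (all zeros): any positive scale works.
        return 1.0, 0
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = int(round(qmin - rmin / scale))
    # Keep the zero point inside the representable integer interval.
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


# Same shape as calibration_tensors_range: node name -> (min, max).
calibration_tensors_range: Dict[str, Tuple[float, float]] = {
    "input": (0.0, 2.55),
    "conv1_out": (-2.55, 0.0),
}

params = {
    name: affine_int8_params(low, high)
    for name, (low, high) in calibration_tensors_range.items()
}
for name, (scale, zero_point) in params.items():
    print(name, scale, zero_point)
```

A real value x is then approximated as scale * (q - zero_point), where q is the stored int8 value, so a range that is too narrow clips activations and a range that is too wide wastes precision.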