Reference
INCQuantizer
class optimum.intel.INCQuantizer
( model: typing.Union[transformers.modeling_utils.PreTrainedModel, torch.nn.modules.module.Module], eval_fn: typing.Union[typing.Callable[[transformers.modeling_utils.PreTrainedModel], int], NoneType] = None, calibration_fn: typing.Union[typing.Callable[[transformers.modeling_utils.PreTrainedModel], int], NoneType] = None, task: typing.Optional[str] = None, seed: int = 42 )
Handle the Neural Compressor quantization process.
get_calibration_dataset
( dataset_name: str, num_samples: int = 100, dataset_config_name: typing.Optional[str] = None, dataset_split: str = 'train', preprocess_function: typing.Optional[typing.Callable] = None, preprocess_batch: bool = True, use_auth_token: typing.Union[bool, str, NoneType] = None, token: typing.Union[bool, str, NoneType] = None )
Parameters
- dataset_name (str) — The dataset repository name on the Hugging Face Hub or path to a local directory containing data files in generic formats and optionally a dataset script, if it requires some code to read the data files.
- num_samples (int, defaults to 100) — The maximum number of samples composing the calibration dataset.
- dataset_config_name (str, optional) — The name of the dataset configuration.
- dataset_split (str, defaults to "train") — Which split of the dataset to use to perform the calibration step.
- preprocess_function (Callable, optional) — Processing function to apply to each example after loading dataset.
- preprocess_batch (bool, defaults to True) — Whether the preprocess_function should be batched.
- use_auth_token (Optional[Union[bool, str]], defaults to None) — Deprecated. Please use token instead.
- token (Optional[Union[bool, str]], defaults to None) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running huggingface-cli login (stored in ~/.huggingface).
Create the calibration datasets.Dataset to use for the post-training static quantization calibration step.
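The difference between a batched and a non-batched preprocess_function can be sketched as follows. This is a minimal illustration, assuming preprocess_batch follows the batched convention of datasets.Dataset.map (a batched function receives a dict of columns, a non-batched one receives a single example). The toy_tokenize helper and the "sentence" column name are hypothetical stand-ins for a real tokenizer and dataset.

```python
from typing import Dict, List


def toy_tokenize(text: str, max_length: int = 8) -> List[int]:
    """Hypothetical stand-in for a real tokenizer: one id (the word length) per word."""
    return [len(word) for word in text.split()][:max_length]


def preprocess_batched(examples: Dict[str, List[str]]) -> Dict[str, List[List[int]]]:
    """Batched form: each column arrives as a list of values."""
    return {"input_ids": [toy_tokenize(text) for text in examples["sentence"]]}


def preprocess_single(example: Dict[str, str]) -> Dict[str, List[int]]:
    """Non-batched form: one example arrives at a time."""
    return {"input_ids": toy_tokenize(example["sentence"])}


# Both forms produce the same token ids; only the calling convention differs.
batch = {"sentence": ["a short text", "another slightly longer text"]}
batched_out = preprocess_batched(batch)
single_out = [preprocess_single({"sentence": s}) for s in batch["sentence"]]
```

Under that assumption, a function shaped like preprocess_batched would be passed with preprocess_batch=True, and one shaped like preprocess_single with preprocess_batch=False.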
quantize
( quantization_config: ForwardRef('PostTrainingQuantConfig'), save_directory: typing.Union[str, pathlib.Path], calibration_dataset: Dataset = None, batch_size: int = 8, data_collator: typing.Optional[DataCollator] = None, remove_unused_columns: bool = True, file_name: str = None, **kwargs )
Parameters
- quantization_config (Union[PostTrainingQuantConfig]) — The configuration containing the parameters related to quantization.
- save_directory (Union[str, Path]) — The directory where the quantized model should be saved.
- calibration_dataset (datasets.Dataset, defaults to None) — The dataset to use for the calibration step, needed for post-training static quantization.
- batch_size (int, defaults to 8) — The number of calibration samples to load per batch.
- data_collator (DataCollator, defaults to None) — The function to use to form a batch from a list of elements of the calibration dataset.
- remove_unused_columns (bool, defaults to True) — Whether or not to remove the columns unused by the model forward method.
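To make the roles of batch_size and data_collator concrete, here is a small standard-library sketch of what a collator does: it takes a list of dataset elements and forms one batch from them. The names pad_collator and iter_batches are hypothetical, and plain lists are used instead of tensors to keep the example dependency-free; a real collator would typically return tensors.

```python
from typing import Dict, Iterator, List

Features = Dict[str, List[int]]


def pad_collator(features: List[Features], pad_id: int = 0) -> Dict[str, List[List[int]]]:
    """Pad every sequence to the longest one in the batch and build an attention mask."""
    max_len = max(len(f["input_ids"]) for f in features)
    input_ids, attention_mask = [], []
    for f in features:
        ids = f["input_ids"]
        padding = max_len - len(ids)
        input_ids.append(ids + [pad_id] * padding)
        attention_mask.append([1] * len(ids) + [0] * padding)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


def iter_batches(samples: List[Features], batch_size: int = 8) -> Iterator[List[Features]]:
    """Yield consecutive chunks of at most batch_size samples."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


samples = [{"input_ids": [5, 6, 7]}, {"input_ids": [8]}, {"input_ids": [9, 10]}]
batches = [pad_collator(chunk) for chunk in iter_batches(samples, batch_size=2)]
```

With three samples and batch_size=2, the last batch holds a single element, so it is padded only to its own length.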
Quantize a model given the optimization specifications defined in quantization_config.
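A typical call sequence might look like the sketch below, which applies post-training dynamic quantization (no calibration dataset needed). It assumes that optimum-intel, neural-compressor and transformers are installed, and that the installed versions expose INCQuantizer.from_pretrained and PostTrainingQuantConfig(approach="dynamic"). The wrapper function quantize_dynamic and its arguments are hypothetical names introduced for illustration.

```python
def quantize_dynamic(model_name: str, save_directory: str) -> None:
    """Apply post-training dynamic quantization to a model and save the result."""
    # Imports are deferred so this function can be defined even when the
    # optional dependencies are not installed.
    from neural_compressor.config import PostTrainingQuantConfig
    from optimum.intel import INCQuantizer
    from transformers import AutoModelForSequenceClassification

    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    # Dynamic quantization computes activation ranges at inference time,
    # so calibration_dataset can be left at its default of None.
    quantization_config = PostTrainingQuantConfig(approach="dynamic")
    quantizer = INCQuantizer.from_pretrained(model)
    quantizer.quantize(
        quantization_config=quantization_config,
        save_directory=save_directory,
    )


# Illustrative call (downloads a model from the Hub, so it is left commented out):
# quantize_dynamic("distilbert-base-uncased-finetuned-sst-2-english", "./quantized_model")
```

For static quantization, the same flow would additionally pass a calibration_dataset, for example one built with get_calibration_dataset.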
INCTrainer
class optimum.intel.INCTrainer
( model: typing.Union[transformers.modeling_utils.PreTrainedModel, torch.nn.modules.module.Module] = None, args: TrainingArguments = None, data_collator: typing.Optional[DataCollator] = None, train_dataset: typing.Optional[torch.utils.data.dataset.Dataset] = None, eval_dataset: typing.Optional[torch.utils.data.dataset.Dataset] = None, tokenizer: typing.Optional[transformers.tokenization_utils_base.PreTrainedTokenizerBase] = None, model_init: typing.Callable[[], transformers.modeling_utils.PreTrainedModel] = None, compute_metrics: typing.Union[typing.Callable[[transformers.trainer_utils.EvalPrediction], typing.Dict], NoneType] = None, callbacks: typing.Optional[typing.List[transformers.trainer_callback.TrainerCallback]] = None, optimizers: typing.Tuple[torch.optim.optimizer.Optimizer, torch.optim.lr_scheduler.LambdaLR] = (None, None), preprocess_logits_for_metrics: typing.Callable[[torch.Tensor, torch.Tensor], torch.Tensor] = None, quantization_config: typing.Optional[neural_compressor.config._BaseQuantizationConfig] = None, pruning_config: typing.Optional[neural_compressor.config._BaseQuantizationConfig] = None, distillation_config: typing.Optional[neural_compressor.config._BaseQuantizationConfig] = None, task: typing.Optional[str] = None )
INCTrainer enables Intel Neural Compressor quantization-aware training, pruning, and distillation.
How the distillation loss is computed given the student and teacher outputs.
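As a rough illustration of a distillation loss computed from student and teacher outputs, the sketch below implements one common textbook formulation: the KL divergence between temperature-softened teacher and student distributions, scaled by the squared temperature. This is an assumption chosen for illustration, not necessarily the criterion INCTrainer applies, which depends on the distillation_config. The function names and the default temperature are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits: List[float], teacher_logits: List[float],
            temperature: float = 2.0) -> float:
    """KL(teacher || student) on softened distributions, scaled by temperature squared."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        t * (math.log(t) - math.log(s))
        for t, s in zip(p_teacher, p_student)
        if t > 0  # terms with zero teacher probability contribute nothing
    )
    return kl * temperature ** 2


same = kd_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = kd_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

The loss is zero when the student reproduces the teacher's logits and grows as the two distributions diverge.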
How the loss is computed by Trainer. By default, all models return the loss in the first element.
Will save the model, so you can reload it using from_pretrained().
Will only save from the main process.
INCModel
class optimum.intel.INCModel
( model, config: PretrainedConfig = None, model_save_dir: typing.Union[str, pathlib.Path, tempfile.TemporaryDirectory, NoneType] = None, q_config: typing.Dict = None, inc_config: typing.Dict = None, **kwargs )
INCModelForSequenceClassification
class optimum.intel.INCModelForSequenceClassification
( model, config: PretrainedConfig = None, model_save_dir: typing.Union[str, pathlib.Path, tempfile.TemporaryDirectory, NoneType] = None, q_config: typing.Dict = None, inc_config: typing.Dict = None, **kwargs )
INCModelForQuestionAnswering
class optimum.intel.INCModelForQuestionAnswering
( model, config: PretrainedConfig = None, model_save_dir: typing.Union[str, pathlib.Path, tempfile.TemporaryDirectory, NoneType] = None, q_config: typing.Dict = None, inc_config: typing.Dict = None, **kwargs )
INCModelForTokenClassification
class optimum.intel.INCModelForTokenClassification
( model, config: PretrainedConfig = None, model_save_dir: typing.Union[str, pathlib.Path, tempfile.TemporaryDirectory, NoneType] = None, q_config: typing.Dict = None, inc_config: typing.Dict = None, **kwargs )
INCModelForMultipleChoice
class optimum.intel.INCModelForMultipleChoice
( model, config: PretrainedConfig = None, model_save_dir: typing.Union[str, pathlib.Path, tempfile.TemporaryDirectory, NoneType] = None, q_config: typing.Dict = None, inc_config: typing.Dict = None, **kwargs )
INCModelForMaskedLM
class optimum.intel.INCModelForMaskedLM
( model, config: PretrainedConfig = None, model_save_dir: typing.Union[str, pathlib.Path, tempfile.TemporaryDirectory, NoneType] = None, q_config: typing.Dict = None, inc_config: typing.Dict = None, **kwargs )
INCModelForCausalLM
class optimum.intel.INCModelForCausalLM
( model, config: PretrainedConfig = None, model_save_dir: typing.Union[str, pathlib.Path, tempfile.TemporaryDirectory, NoneType] = None, q_config: typing.Dict = None, inc_config: typing.Dict = None, use_cache: bool = True, **kwargs )
INCModelForSeq2SeqLM
class optimum.intel.INCModelForSeq2SeqLM
( model, config: PretrainedConfig = None, model_save_dir: typing.Union[str, pathlib.Path, tempfile.TemporaryDirectory, NoneType] = None, q_config: typing.Dict = None, inc_config: typing.Dict = None, **kwargs )