Optimum documentation

Reference

IncOptimizer

class optimum.intel.neural_compressor.IncOptimizer

( model: PreTrainedModel, quantizer: typing.Optional[optimum.intel.neural_compressor.quantization.IncQuantizer] = None, pruner: typing.Optional[optimum.intel.neural_compressor.pruning.IncPruner] = None, distiller: typing.Optional[optimum.intel.neural_compressor.distillation.IncDistiller] = None, one_shot_optimization: bool = True, eval_func: typing.Optional[typing.Callable] = None, train_func: typing.Optional[typing.Callable] = None )

save_pretrained

( save_directory: typing.Union[str, os.PathLike, NoneType] = None )

Parameters

  • save_directory (str or os.PathLike) — Directory where the model will be saved. Will be created if it doesn’t exist.

Save the optimized model as well as its corresponding configuration to a directory, so that it can be re-loaded.
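
A minimal usage sketch, assuming a fine-tuned model, a local quantization configuration file and an evaluation function (the checkpoint name, the quantization.yml path and the placeholder eval_func below are illustrative, not part of the API):

from transformers import AutoModelForSequenceClassification
from optimum.intel.neural_compressor import IncOptimizer, IncQuantizer

def eval_func(model):
    # Placeholder: a real eval_func should evaluate the model and return
    # the metric driving the tuning process.
    return 1.0

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english"  # illustrative checkpoint
)
quantizer = IncQuantizer("quantization.yml", eval_func=eval_func)
optimizer = IncOptimizer(model, quantizer=quantizer)

# ... run the optimization, then persist the model and its configuration
optimizer.save_pretrained("quantized_model")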

IncPruner

class optimum.intel.neural_compressor.IncPruner

( config: typing.Union[str, optimum.intel.neural_compressor.configuration.IncPruningConfig], eval_func: typing.Optional[typing.Callable], train_func: typing.Optional[typing.Callable] )

IncDistiller

class optimum.intel.neural_compressor.IncDistiller

( config: typing.Union[str, optimum.intel.neural_compressor.configuration.IncDistillationConfig], teacher_model: PreTrainedModel, eval_func: typing.Optional[typing.Callable], train_func: typing.Optional[typing.Callable] )
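
A pruner and a distiller can be combined in a single IncOptimizer pass. A sketch, assuming model, eval_func and train_func are defined as in the example above, and that the YAML configuration paths and teacher checkpoint exist (all illustrative):

from transformers import AutoModelForSequenceClassification
from optimum.intel.neural_compressor import IncDistiller, IncOptimizer, IncPruner

teacher = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased"  # illustrative teacher checkpoint
)
pruner = IncPruner("pruning.yml", eval_func=eval_func, train_func=train_func)
distiller = IncDistiller(
    "distillation.yml", teacher_model=teacher, eval_func=eval_func, train_func=train_func
)
# one_shot_optimization=True applies both compressions in a single pass
optimizer = IncOptimizer(model, pruner=pruner, distiller=distiller, one_shot_optimization=True)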

IncQuantizer

class optimum.intel.neural_compressor.IncQuantizer

( config: typing.Union[str, optimum.intel.neural_compressor.configuration.IncQuantizationConfig], eval_func: typing.Optional[typing.Callable], train_func: typing.Optional[typing.Callable] = None, calib_dataloader: typing.Optional[torch.utils.data.dataloader.DataLoader] = None, calib_func: typing.Optional[typing.Callable] = None )
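
For post-training static quantization, a calibration dataloader is typically needed. A sketch, assuming calibration_dataset yields already-tokenized model inputs and that quantization.yml describes a static quantization approach (both illustrative):

from torch.utils.data import DataLoader
from optimum.intel.neural_compressor import IncQuantizer

calib_dataloader = DataLoader(calibration_dataset, batch_size=8)
quantizer = IncQuantizer(
    "quantization.yml",
    eval_func=eval_func,
    calib_dataloader=calib_dataloader,  # used to calibrate activation ranges
)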

IncTrainer

class optimum.intel.neural_compressor.IncTrainer

( model: typing.Union[transformers.modeling_utils.PreTrainedModel, torch.nn.modules.module.Module] = None, args: TrainingArguments = None, data_collator: typing.Optional[DataCollator] = None, train_dataset: typing.Optional[torch.utils.data.dataset.Dataset] = None, eval_dataset: typing.Optional[torch.utils.data.dataset.Dataset] = None, tokenizer: typing.Optional[transformers.tokenization_utils_base.PreTrainedTokenizerBase] = None, model_init: typing.Callable[[], transformers.modeling_utils.PreTrainedModel] = None, compute_metrics: typing.Union[typing.Callable[[transformers.trainer_utils.EvalPrediction], typing.Dict], NoneType] = None, callbacks: typing.Optional[typing.List[transformers.trainer_callback.TrainerCallback]] = None, optimizers: typing.Tuple[torch.optim.optimizer.Optimizer, torch.optim.lr_scheduler.LambdaLR] = (None, None), preprocess_logits_for_metrics: typing.Callable[[torch.Tensor, torch.Tensor], torch.Tensor] = None )

compute_distillation_loss

( student_outputs, teacher_outputs )

How the distillation loss is computed given the student and teacher outputs.
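
The exact formulation is defined by the distillation configuration. Purely as an illustration of the idea (not necessarily what IncTrainer computes), a common distillation loss softens both output distributions with a temperature and penalizes their KL divergence:

import torch.nn.functional as F

def example_distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Generic knowledge-distillation loss, for illustration only.
    return F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature**2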

compute_loss

( model, inputs, return_outputs = False )

How the loss is computed. By default, all models return the loss in the first element.

save_model

( output_dir: typing.Optional[str] = None, _internal_call: bool = False )

Will save the model, so you can reload it using from_pretrained(). Will only save from the main process.

train

( agent: typing.Optional[neural_compressor.experimental.component.Component] = None, resume_from_checkpoint: typing.Union[str, bool, NoneType] = None, trial: typing.Union[ForwardRef('optuna.Trial'), typing.Dict[str, typing.Any]] = None, ignore_keys_for_eval: typing.Optional[typing.List[str]] = None, **kwargs )

Parameters

  • agent (Component, optional) — Component object containing the compression objects to apply during the training process.
  • resume_from_checkpoint (str or bool, optional) — If a str, local path to a saved checkpoint as saved by a previous instance of IncTrainer. If a bool and equals True, load the last checkpoint in args.output_dir as saved by a previous instance of IncTrainer. If present, training will resume from the model/optimizer/scheduler states loaded here.
  • trial (optuna.Trial or Dict[str, Any], optional) — The trial run or the hyperparameter dictionary for hyperparameter search.
  • ignore_keys_for_eval (List[str], optional) — A list of keys in the output of your model (if it is a dictionary) that should be ignored when gathering predictions for evaluation during the training.
  • kwargs — Additional keyword arguments used to hide deprecated arguments.

Main training entry point.
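
A sketch of a typical call, assuming model, training_args and the datasets are built as for a regular transformers Trainer, and that compression_agent is a neural_compressor Component wrapping the compression objects to apply (all illustrative):

from optimum.intel.neural_compressor import IncTrainer

trainer = IncTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)
train_result = trainer.train(agent=compression_agent)
trainer.save_model()  # only saves from the main process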

IncQuantizedModel

class optimum.intel.neural_compressor.quantization.IncQuantizedModel

( *args, **kwargs )

from_pretrained

( model_name_or_path: str, inc_config: typing.Union[optimum.intel.neural_compressor.configuration.IncOptimizedConfig, str] = None, q_model_name: typing.Optional[str] = None, **kwargs ) → q_model

Parameters

  • model_name_or_path (str) — Repository name on the Hugging Face Hub or path to a local directory hosting the model.
  • inc_config (Union[IncOptimizedConfig, str], optional) — Configuration file containing all the information related to the model quantization. Can be either:
    • an instance of the class IncOptimizedConfig,
    • a string valid as input to IncOptimizedConfig.from_pretrained.
  • q_model_name (str, optional) — Name of the state dictionary located in model_name_or_path, used to load the quantized model. If state_dict is specified, q_model_name will not be used.
  • cache_dir (str, optional) — Path to a directory in which a downloaded configuration should be cached if the standard cache should not be used.
  • force_download (bool, optional, defaults to False) — Whether or not to force (re-)downloading the configuration files, overriding the cached versions if they exist.
  • resume_download (bool, optional, defaults to False) — Whether or not to delete an incompletely received file. Attempts to resume the download if such a file exists.
  • revision (str, optional) — The specific model version to use. Since a git-based system is used for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git, such as a branch name, a tag name, or a commit id.
  • state_dict (Dict[str, torch.Tensor], optional) — State dictionary of the quantized model. If not specified, q_model_name will be used to load the state dictionary.

Returns

q_model

Quantized model.

Instantiate a quantized PyTorch model from a given Intel Neural Compressor configuration file.
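
For example, to load a quantized sequence classification model (the repository name is illustrative; it is expected to host both the quantized weights and the corresponding IncOptimizedConfig):

from optimum.intel.neural_compressor import IncQuantizedModelForSequenceClassification

model = IncQuantizedModelForSequenceClassification.from_pretrained(
    "Intel/distilbert-base-uncased-finetuned-sst-2-english-int8-static"
)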

IncQuantizedModelForSequenceClassification

class optimum.intel.neural_compressor.IncQuantizedModelForSequenceClassification

( *args, **kwargs )

IncQuantizedModelForQuestionAnswering

class optimum.intel.neural_compressor.IncQuantizedModelForQuestionAnswering

( *args, **kwargs )

IncQuantizedModelForTokenClassification

class optimum.intel.neural_compressor.IncQuantizedModelForTokenClassification

( *args, **kwargs )

IncQuantizedModelForMultipleChoice

class optimum.intel.neural_compressor.IncQuantizedModelForMultipleChoice

( *args, **kwargs )

IncQuantizedModelForMaskedLM

class optimum.intel.neural_compressor.IncQuantizedModelForMaskedLM

( *args, **kwargs )

IncQuantizedModelForCausalLM

class optimum.intel.neural_compressor.IncQuantizedModelForCausalLM

( *args, **kwargs )

IncQuantizedModelForSeq2SeqLM

class optimum.intel.neural_compressor.IncQuantizedModelForSeq2SeqLM

( *args, **kwargs )