Transformers documentation

You are viewing v4.16.2 version. A newer version v4.57.1 is available.

Callbacks

Callbacks are objects that can customize the behavior of the training loop in the PyTorch Trainer (this feature is not yet implemented in TensorFlow). They can inspect the training loop state (for progress reporting, logging on TensorBoard or other ML platforms, and so on) and take decisions (like early stopping).

Callbacks are “read only” pieces of code: apart from the TrainerControl object they return, they cannot change anything in the training loop. For customizations that require changes in the training loop, you should subclass Trainer and override the methods you need (see trainer for examples).
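As a rough illustration of this read-only contract, the sketch below defines a hypothetical StopAfterNSteps callback. It only reads state.global_step and sets control.should_training_stop, then returns the control object. The class name and its max_steps parameter are invented for this example, and the try/except fallback and the SimpleNamespace stand-ins exist only so the snippet can be run without transformers installed or a real Trainer.

```python
from types import SimpleNamespace

try:
    from transformers import TrainerCallback
except ImportError:
    # Stand-in base class so this sketch still runs without transformers.
    class TrainerCallback:
        pass


class StopAfterNSteps(TrainerCallback):
    """Hypothetical callback: ask the Trainer to stop after `max_steps` updates."""

    def __init__(self, max_steps):
        self.max_steps = max_steps

    def on_step_end(self, args, state, control, **kwargs):
        # The state is only read, never modified.
        if state.global_step >= self.max_steps:
            # The control object is the only thing a callback may change.
            control.should_training_stop = True
        return control


# Dry run with stand-in objects; a real Trainer passes TrainerState and
# TrainerControl instances instead of these SimpleNamespace objects.
callback = StopAfterNSteps(max_steps=3)
control = SimpleNamespace(should_training_stop=False)
stopped_at = None
for step in range(1, 6):
    state = SimpleNamespace(global_step=step)
    control = callback.on_step_end(None, state, control)
    if control.should_training_stop:
        stopped_at = step
        break
```

With a real Trainer, an instance of such a callback would typically be handed over through the callbacks argument of the Trainer constructor.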

By default a Trainer will use the following callbacks:

DefaultFlowCallback which handles the default behavior for logging, saving and evaluation.
PrinterCallback or ProgressCallback to display progress and print the logs (the first one is used if you deactivate tqdm through the TrainingArguments, otherwise it’s the second one).
TensorBoardCallback if tensorboard is accessible (either through PyTorch >= 1.4 or tensorboardX).
WandbCallback if wandb is installed.
CometCallback if comet_ml is installed.
MLflowCallback if mlflow is installed.
AzureMLCallback if azureml-sdk is installed.

The main class that implements callbacks is TrainerCallback. It gets the TrainingArguments used to instantiate the Trainer, can access that Trainer’s internal state via TrainerState, and can take some actions on the training loop via TrainerControl.

Available Callbacks

Here is the list of the available TrainerCallback in the library:

class transformers.integrations.CometCallback < source >

( )

A TrainerCallback that sends the logs to Comet ML.

setup < source >

( args state model )

Setup the optional Comet.ml integration.

Environment:

COMET_MODE (str, optional): Whether to create an online, offline experiment or disable Comet logging. Can be “OFFLINE”, “ONLINE”, or “DISABLED”. Defaults to “ONLINE”.
COMET_PROJECT_NAME (str, optional): Comet project name for experiments.
COMET_OFFLINE_DIRECTORY (str, optional): Folder to use for saving offline experiments when COMET_MODE is “OFFLINE”.
COMET_LOG_ASSETS (str, optional): Whether or not to log training assets (tf event logs, checkpoints, etc.) to Comet. Can be “TRUE” or “FALSE”. Defaults to “TRUE”.

For a number of configurable items in the environment, see here.
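One possible way to set these variables is from Python, before training starts. The snippet below is a minimal sketch: the variable names are the ones documented above, while the project name and the offline directory are placeholder values chosen for illustration.

```python
import os

# Log to a local folder instead of the Comet servers.
os.environ["COMET_MODE"] = "OFFLINE"
# Placeholder values, not defaults of the library.
os.environ["COMET_PROJECT_NAME"] = "my-project"
os.environ["COMET_OFFLINE_DIRECTORY"] = "./comet_offline"
# Skip uploading checkpoints and event logs as assets.
os.environ["COMET_LOG_ASSETS"] = "FALSE"
```

The same variables can also be exported in the shell that launches the training script.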

class transformers.DefaultFlowCallback < source >

( )

A TrainerCallback that handles the default flow of the training loop for logs, evaluation and checkpoints.

class transformers.PrinterCallback < source >

( )

A bare TrainerCallback that just prints the logs.

class transformers.ProgressCallback < source >

( )

A TrainerCallback that displays the progress of training or evaluation.

class transformers.EarlyStoppingCallback < source >

( early_stopping_patience: int = 1 early_stopping_threshold: typing.Optional[float] = 0.0 )

Parameters

early_stopping_patience (int) — Use with metric_for_best_model to stop training when the specified metric worsens for early_stopping_patience evaluation calls.
early_stopping_threshold (float, optional) — Use with TrainingArguments metric_for_best_model and early_stopping_patience to denote how much the specified metric must improve to satisfy early stopping conditions.

A TrainerCallback that handles early stopping.

This callback depends on the TrainingArguments argument load_best_model_at_end to set best_metric in TrainerState.
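To make the interplay of early_stopping_patience and early_stopping_threshold more concrete, here is a simplified, standalone model of the rule described above. It is a sketch under assumptions, not the library's code: the function name first_stop_index is hypothetical, the function tracks the best value itself (whereas the callback relies on best_metric in TrainerState), and the metric direction is passed in as a plain greater_is_better flag.

```python
def first_stop_index(metric_history, patience=1, threshold=0.0, greater_is_better=False):
    """Return the 1-based index of the evaluation that triggers a stop, or None."""
    best = None
    evals_without_improvement = 0
    for index, value in enumerate(metric_history, start=1):
        if best is None:
            # The first evaluation always counts as an improvement.
            improved = True
        else:
            better = value > best if greater_is_better else value < best
            # The change must also exceed the threshold to count.
            improved = better and abs(value - best) > threshold
        if improved:
            best = value
            evals_without_improvement = 0
        else:
            evals_without_improvement += 1
        if evals_without_improvement >= patience:
            return index
    return None


# Evaluation losses (lower is better) from five hypothetical evaluation calls.
eval_losses = [0.90, 0.80, 0.85, 0.82, 0.81]
stop_at = first_stop_index(eval_losses, patience=2)
```

In this made-up run the loss improves at the second evaluation, then fails to beat 0.80 twice in a row, so a patience of 2 would stop training at the fourth evaluation.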

class transformers.integrations.TensorBoardCallback < source >

( tb_writer = None )

Parameters

tb_writer (SummaryWriter, optional) — The writer to use. Will instantiate one if not set.

A TrainerCallback that sends the logs to TensorBoard.

class transformers.integrations.WandbCallback < source >

( )

A TrainerCallback that sends the logs to Weights & Biases.

setup < source >

( args state model **kwargs )

Setup the optional Weights & Biases (wandb) integration.

One can subclass and override this method to customize the setup if needed. Find more information here. You can also override the following environment variables:

Environment:

WANDB_LOG_MODEL (bool, optional, defaults to False): Whether or not to log model as artifact at the end of training. Use along with TrainingArguments.load_best_model_at_end to upload best model.
WANDB_WATCH (str, optional, defaults to "gradients"): Can be "gradients", "all" or "false". Set to "false" to disable gradient logging or "all" to log gradients and parameters.
WANDB_PROJECT (str, optional, defaults to "huggingface"): Set this to a custom string to store results in a different project.
WANDB_DISABLED (bool, optional, defaults to False): Whether or not to disable wandb entirely. Set WANDB_DISABLED=true to disable.
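As a hedged example, the following snippet overrides three of these variables from Python before training starts. The project name is a placeholder, and the values chosen are just one plausible configuration.

```python
import os

os.environ.update(
    {
        # Placeholder project name; the documented default is "huggingface".
        "WANDB_PROJECT": "callbacks-demo",
        # Disable gradient logging.
        "WANDB_WATCH": "false",
        # Upload the model as an artifact at the end of training.
        "WANDB_LOG_MODEL": "true",
    }
)
```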

class transformers.integrations.MLflowCallback < source >

( )

A TrainerCallback that sends the logs to MLflow.

setup < source >

( args state model )

Setup the optional MLflow integration.

Environment: HF_MLFLOW_LOG_ARTIFACTS (str, optional): Whether to use MLflow .log_artifact() facility to log artifacts.

This only makes sense if logging to a remote server, e.g. s3 or GCS. If set to True or 1, will copy whatever is in TrainingArguments’s output_dir to the local or remote artifact storage. Using it without a remote storage will just copy the files to your artifact location.

class transformers.integrations.AzureMLCallback < source >

( azureml_run = None )

A TrainerCallback that sends the logs to AzureML.

TrainerCallback

class transformers.TrainerCallback < source >

( )

Parameters

args (TrainingArguments) — The training arguments used to instantiate the Trainer.
state (TrainerState) — The current state of the Trainer.
control (TrainerControl) — The object that is returned to the Trainer and can be used to make some decisions.
model (PreTrainedModel or torch.nn.Module) — The model being trained.
tokenizer (PreTrainedTokenizer) — The tokenizer used for encoding the data.
optimizer (torch.optim.Optimizer) — The optimizer used for the training steps.
lr_scheduler (torch.optim.lr_scheduler.LambdaLR) — The scheduler used for setting the learning rate.
train_dataloader (torch.utils.data.DataLoader, optional) — The current dataloader used for training.
eval_dataloader (torch.utils.data.DataLoader, optional) — The current dataloader used for evaluation.
metrics (Dict[str, float]) — The metrics computed by the last evaluation phase.

Those are only accessible in the event on_evaluate.
logs (Dict[str, float]) — The values to log.

Those are only accessible in the event on_log.

A class for objects that will inspect the state of the training loop at some events and take some decisions. At each of those events, the arguments listed under Parameters above are available.

The control object is the only one that can be changed by the callback, in which case the event that changes it should return the modified version.

The arguments args, state and control are positional for all events; all the others are grouped in kwargs. You can unpack the ones you need in the signature of the event using them. As an example, see the code of the simple PrinterCallback.

Example:

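from transformers import TrainerCallback


class PrinterCallback(TrainerCallback):
    def on_log(self, args, state, control, logs=None, **kwargs):
        # Drop the total_flos entry so it is not printed with the other logs.
        _ = logs.pop("total_flos", None)
        # Print only on the local main process.
        if state.is_local_process_zero:
            print(logs)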
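The same unpacking pattern works for other events. The sketch below shows a hypothetical MetricsHistoryCallback that pulls metrics out of kwargs in on_evaluate and keeps a history of them. The class name and its history attribute are invented for illustration, and the fallback base class and SimpleNamespace state are stand-ins so the snippet runs without transformers or a real Trainer.

```python
from types import SimpleNamespace

try:
    from transformers import TrainerCallback
except ImportError:
    # Stand-in base class so this sketch still runs without transformers.
    class TrainerCallback:
        pass


class MetricsHistoryCallback(TrainerCallback):
    """Hypothetical callback that records the metrics of each evaluation."""

    def __init__(self):
        self.history = []

    def on_evaluate(self, args, state, control, metrics=None, **kwargs):
        # `metrics` is named in the signature; everything else stays in kwargs.
        if metrics is not None:
            self.history.append((state.global_step, dict(metrics)))


# Dry run with stand-in objects and made-up metric values.
recorder = MetricsHistoryCallback()
recorder.on_evaluate(None, SimpleNamespace(global_step=100), None, metrics={"eval_loss": 0.5})
recorder.on_evaluate(None, SimpleNamespace(global_step=200), None, metrics={"eval_loss": 0.4})
```

Because the event does not modify the control object, it does not need to return anything.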

on_epoch_begin < source >