Tracking
There are a large number of experiment tracking APIs available, but getting them all to work in a multi-processing environment can often be complex. 🤗 Accelerate provides a general tracking API that can be used to log useful items during your script through log().
Integrated Trackers
Currently Accelerate supports three trackers out-of-the-box:
class accelerate.tracking.TensorBoardTracker(run_name: str, logging_dir: typing.Union[str, os.PathLike, NoneType])

A Tracker class that supports tensorboard. Should be initialized at the start of your script.

finish()
Closes the TensorBoard writer.

log(values: dict, step: typing.Optional[int] = None)
Logs values to the current run.

store_init_configuration(values: dict)
Logs values as hyperparameters for the run. Should be run at the beginning of your experiment.
class accelerate.tracking.WandBTracker(run_name: str)

A Tracker class that supports wandb. Should be initialized at the start of your script.

finish()
Closes the wandb writer.

log(values: dict, step: typing.Optional[int] = None)
Logs values to the current run.

store_init_configuration(values: dict)
Logs values as hyperparameters for the run. Should be run at the beginning of your experiment.
class accelerate.tracking.CometMLTracker(run_name: str)

A Tracker class that supports comet_ml. Should be initialized at the start of your script.

API keys must be stored in a Comet config file (a sample config is shown after this class reference).

finish()
Closes the comet-ml writer.

log(values: dict, step: typing.Optional[int] = None)
Logs values to the current run.

store_init_configuration(values: dict)
Logs values as hyperparameters for the run. Should be run at the beginning of your experiment.
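A minimal sketch of such a Comet config file, assuming the standard ~/.comet.config location in your home directory and a placeholder key:

[comet]
api_key = YOUR_API_KEY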
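If you need to configure one of these trackers beyond what the string shortcuts below allow (for example, pointing TensorBoardTracker at a specific logging_dir), you can also instantiate the tracker directly and pass the instance to log_with, the same route shown for custom trackers later in this document. A minimal sketch, where the run name and directory are placeholder values:

from accelerate import Accelerator
from accelerate.tracking import TensorBoardTracker

# Instantiating the tracker directly gives access to its constructor arguments
tracker = TensorBoardTracker(run_name="my_run", logging_dir="runs/my_run")
accelerator = Accelerator(log_with=tracker)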
To use any of them, pass in the selected type(s) to the log_with parameter in Accelerator:
from accelerate import Accelerator
from accelerate.utils import LoggerType
accelerator = Accelerator(log_with="all") # For all available trackers in the environment
accelerator = Accelerator(log_with="wandb")
accelerator = Accelerator(log_with=["wandb", LoggerType.TENSORBOARD])
At the start of your experiment, init_trackers() should be used to set up your project and potentially add any experiment hyperparameters to be logged:
hps = {"num_iterations": 5, "learning_rate": 1e-2}
accelerator.init_trackers("my_project", config=hps)
When you are ready to log any data, log() should be used. A step can also be passed in to correlate the data with a particular step in the training loop.
accelerator.log({"train_loss": 1.12, "valid_loss": 0.8}, step=1)
Once you’ve finished training, make sure to run end_training() so that all the trackers can run their finish functionalities if they have any.
accelerator.end_training()
A full example is below:
from accelerate import Accelerator
accelerator = Accelerator(log_with="all")
config = {
"num_iterations": 5,
"learning_rate": 1e-2,
"loss_function": str(my_loss_function),
}
accelerator.init_trackers("example_project", config=config)
my_model, my_optimizer, my_training_dataloader = accelerator.prepare(my_model, my_optimizer, my_training_dataloader)
device = accelerator.device
my_model.to(device)

for iteration in range(config["num_iterations"]):
    for step, batch in enumerate(my_training_dataloader):
        my_optimizer.zero_grad()
        inputs, targets = batch
        inputs = inputs.to(device)
        targets = targets.to(device)
        outputs = my_model(inputs)
        loss = my_loss_function(outputs, targets)
        accelerator.backward(loss)
        my_optimizer.step()
        accelerator.log({"training_loss": loss}, step=step)
accelerator.end_training()
Implementing Custom Trackers
To implement a new tracker to be used in Accelerator, a new one can be made by subclassing the GeneralTracker class.
Every tracker must implement three functions:
- __init__: Should store a run_name and initialize the tracker API of the integrated library. If a tracker stores its data locally (such as TensorBoard), a logging_dir parameter can be added.
- store_init_configuration: Should take in a values dictionary and store it as a one-time experiment configuration.
- log: Should take in a values dictionary and a step, and should log them to the run.
A brief example can be seen below with an integration with Weights and Biases, containing only the relevant information:
from accelerate.tracking import GeneralTracker
from typing import Optional

import wandb


class MyCustomTracker(GeneralTracker):
    def __init__(self, run_name: str):
        self.run_name = run_name
        wandb.init(project=self.run_name)

    def store_init_configuration(self, values: dict):
        wandb.config.update(values)

    def log(self, values: dict, step: Optional[int] = None):
        wandb.log(values, step=step)
When you are ready to build your Accelerator object, pass in an instance of your tracker to log_with to have it automatically be used with the API:
tracker = MyCustomTracker("some_run_name")
accelerator = Accelerator(log_with=tracker)
These can also be mixed with existing trackers, including with "all":
tracker = MyCustomTracker("some_run_name")
accelerator = Accelerator(log_with=[tracker, "all"])
When a wrapper cannot work
If a library has an API that does not follow a strict .log with an overall dictionary, such as Neptune.AI, logging can be done manually under an if accelerator.is_main_process statement:
  from accelerate import Accelerator
+ import neptune.new as neptune

  accelerator = Accelerator()
+ run = neptune.init(...)

  my_model, my_optimizer, my_training_dataloader = accelerator.prepare(my_model, my_optimizer, my_training_dataloader)
  device = accelerator.device
  my_model.to(device)
  total_loss = 0

  for iteration in range(config["num_iterations"]):
      for batch in my_training_dataloader:
          my_optimizer.zero_grad()
          inputs, targets = batch
          inputs = inputs.to(device)
          targets = targets.to(device)
          outputs = my_model(inputs)
          loss = my_loss_function(outputs, targets)
          total_loss += loss
          accelerator.backward(loss)
          my_optimizer.step()
+         if accelerator.is_main_process:
+             run["logs/training/batch/loss"].log(loss)