|
# Learn about Configs
|
|
|
|
We use Python files as configs and incorporate modular and inheritance design into our config system, which is convenient for conducting various experiments.
|
|
You can find all the provided configs under `$MMAction2/configs`. If you wish to inspect a config file,
|
|
you may run `python tools/analysis_tools/print_config.py /PATH/TO/CONFIG` to see the complete config.
|
|
|
|
<!-- TOC -->
|
|
|
|
- [Learn about Configs](#learn-about-configs)
|
|
- [Modify config through script arguments](#modify-config-through-script-arguments)
|
|
- [Config File Structure](#config-file-structure)
|
|
- [Config File Naming Convention](#config-file-naming-convention)
|
|
- [Config System for Action Recognition](#config-system-for-action-recognition)
|
|
- [Config System for Spatio-Temporal Action Detection](#config-system-for-spatio-temporal-action-detection)
|
|
- [Config System for Action Localization](#config-system-for-action-localization)
|
|
|
|
<!-- TOC -->
|
|
|
|
## Modify config through script arguments
|
|
|
|
When submitting jobs using `tools/train.py` or `tools/test.py`, you may specify `--cfg-options` to modify the config in place, as illustrated by the sketch after the list below.
|
|
|
|
- Update config keys of dict.
|
|
|
|
The config options can be specified following the order of the dict keys in the original config.
|
|
For example, `--cfg-options model.backbone.norm_eval=False` changes all BN modules in the model backbone to `train` mode.
|
|
|
|
- Update keys inside a list of configs.
|
|
|
|
Some config dicts are composed as a list in your config. For example, the training pipeline `train_pipeline` is normally a list
|
|
e.g. `[dict(type='SampleFrames'), ...]`. If you want to change `'SampleFrames'` to `'DenseSampleFrames'` in the pipeline,
|
|
you may specify `--cfg-options train_pipeline.0.type=DenseSampleFrames`.
|
|
|
|
- Update values of list/tuples.
|
|
|
|
Some config values are lists or tuples. For example, the config file normally sets `model.data_preprocessor.mean=[123.675, 116.28, 103.53]`. If you want to
|
|
change this key, you may specify `--cfg-options model.data_preprocessor.mean="[128,128,128]"`. Note that the quotation marks `"` are necessary to support list/tuple data types.
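The same overrides can be applied programmatically. The following is a minimal sketch, assuming `mmengine` is installed and the code is run from the MMAction2 root; it uses `Config.merge_from_dict`, which is how the training and testing scripts typically apply `--cfg-options` internally. Dotted keys address nested dict fields and integer segments index into lists.

```python
from mmengine.config import Config

cfg = Config.fromfile(
    'configs/recognition/tsn/'
    'tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_kinetics400-rgb.py')

# Mirror the three kinds of overrides described above.
cfg.merge_from_dict({
    'model.backbone.norm_eval': False,                 # update a dict key
    'train_pipeline.0.type': 'DenseSampleFrames',      # update a key inside a list
    'model.data_preprocessor.mean': [128, 128, 128],   # update a list value
})
```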
|
|
|
|
## Config File Structure
|
|
|
|
There are 3 basic component types under `configs/_base_`: models, schedules, and default_runtime.
|
|
Many methods, such as TSN, I3D, and SlowOnly, can be easily constructed with one component of each type.
|
|
The configs that are composed of components from `_base_` are called _primitive_.
|
|
|
|
For all configs under the same folder, it is recommended to have only **one** _primitive_ config. All other configs should inherit from the _primitive_ config. In this way, the maximum inheritance level is 3.
|
|
|
|
For easy understanding, we recommend that contributors inherit from existing methods.
|
|
For example, if some modification is made based on TSN, users may first inherit the basic TSN structure by specifying `_base_ = ../tsn/tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_kinetics400-rgb.py`, then modify the necessary fields in the config files.
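A minimal sketch of what such a child config might look like is given below; the overridden field is purely illustrative.

```python
# A hypothetical child config: inherit the TSN primitive config and override
# only the fields that need to change; everything else is reused as-is.
_base_ = ['../tsn/tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_kinetics400-rgb.py']

# Dict-style overrides merge recursively, so only `dropout_ratio` is replaced here;
# the other keys of `cls_head` (type, num_classes, ...) keep their values from the base.
model = dict(cls_head=dict(dropout_ratio=0.5))
```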
|
|
|
|
If you are building an entirely new method that does not share the structure with any of the existing methods, you may create a folder under `configs/TASK`.
|
|
|
|
Please refer to [mmengine](https://mmengine.readthedocs.io/en/latest/tutorials/config.html) for detailed documentation.
|
|
|
|
## Config File Naming Convention
|
|
|
|
We follow the style below to name config files. Contributors are advised to follow the same style. The config file names are divided into several parts. Logically, different parts are concatenated by underscores `'_'`, and settings in the same part are concatenated by dashes `'-'`.
|
|
|
|
```
|
|
{algorithm info}_{module info}_{training info}_{data info}.py
|
|
```
|
|
|
|
`{xxx}` is a required field and `[yyy]` is optional. A worked breakdown of an existing config file name is given after the list below.
|
|
|
|
- `{algorithm info}`:
|
|
- `{model}`: model type, e.g. `tsn`, `i3d`, `swin`, `vit`, etc.
|
|
- `[model setting]`: specific setting for some models, e.g. `base`, `p16`, `w877`, etc.
|
|
- `{module info}`:
|
|
- `[pretrained info]`: pretrained information, e.g. `kinetics400-pretrained`, `in1k-pre`, etc.
|
|
- `{backbone}`: backbone type. e.g. `r50` (ResNet-50), etc.
|
|
- `[backbone setting]`: specific setting for some backbones, e.g. `nl-dot-product`, `bnfrozen`, `nopool`, etc.
|
|
- `{training info}`:
|
|
- `{gpu x batch_per_gpu}`: GPUs and samples per GPU.
|
|
- `{pipeline setting}`: frame sampling setting, e.g. `dense`, `{clip_len}x{frame_interval}x{num_clips}`, `u48`, etc.
|
|
- `{schedule}`: training schedule, e.g. `coslr-20e`.
|
|
- `{data info}`:
|
|
- `{dataset}`: dataset name, e.g. `kinetics400`, `mmit`, etc.
|
|
- `{modality}`: data modality, e.g. `rgb`, `flow`, `keypoint-2d`, etc.
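As a concrete example, the config file name referenced earlier decomposes as follows:

```
tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_kinetics400-rgb.py
├── tsn                  -> {algorithm info}: TSN
├── imagenet-pretrained  -> [pretrained info]: pretrained on ImageNet
├── r50                  -> {backbone}: ResNet-50
├── 8xb32                -> {gpu x batch_per_gpu}: 8 GPUs, 32 samples per GPU
├── 1x1x3                -> {pipeline setting}: clip_len x frame_interval x num_clips
├── 100e                 -> {schedule}: 100 epochs
├── kinetics400          -> {dataset}: Kinetics-400
└── rgb                  -> {modality}: RGB frames
```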
|
|
|
|
### Config System for Action Recognition
|
|
|
|
We incorporate modular design into our config system,
|
|
which is convenient for conducting various experiments.
|
|
|
|
- An Example of TSN
|
|
|
|
To help the users have a basic idea of a complete config structure and the modules in an action recognition system,
|
|
we make brief comments on the config of TSN as follows.
|
|
For more detailed usage and alternatives for each parameter in each module, please refer to the API documentation.
|
|
|
|
```python
|
|
# model settings
|
|
model = dict( # Config of the model
|
|
type='Recognizer2D', # Class name of the recognizer
|
|
backbone=dict( # Dict for backbone
|
|
type='ResNet', # Name of the backbone
|
|
pretrained='torchvision://resnet50', # The url/site of the pretrained model
|
|
depth=50, # Depth of ResNet model
|
|
norm_eval=False), # Whether to set BN layers to eval mode when training
|
|
cls_head=dict( # Dict for classification head
|
|
type='TSNHead', # Name of classification head
|
|
num_classes=400, # Number of classes to be classified.
|
|
in_channels=2048, # The input channels of classification head.
|
|
spatial_type='avg', # Type of pooling in spatial dimension
|
|
consensus=dict(type='AvgConsensus', dim=1), # Config of consensus module
|
|
dropout_ratio=0.4, # Probability in dropout layer
|
|
init_std=0.01, # Std value for linear layer initialization
|
|
average_clips='prob'), # Method to average multiple clip results
|
|
data_preprocessor=dict( # Dict for data preprocessor
|
|
type='ActionDataPreprocessor', # Name of data preprocessor
|
|
mean=[123.675, 116.28, 103.53], # Mean values of different channels to normalize
|
|
std=[58.395, 57.12, 57.375], # Std values of different channels to normalize
|
|
format_shape='NCHW'), # Final image shape format
|
|
# model training and testing settings
|
|
train_cfg=None, # Config of training hyperparameters for TSN
|
|
test_cfg=None) # Config for testing hyperparameters for TSN.
|
|
|
|
# dataset settings
|
|
dataset_type = 'RawframeDataset' # Type of dataset for training, validation and testing
|
|
data_root = 'data/kinetics400/rawframes_train/' # Root path to data for training
|
|
data_root_val = 'data/kinetics400/rawframes_val/' # Root path to data for validation and testing
|
|
ann_file_train = 'data/kinetics400/kinetics400_train_list_rawframes.txt' # Path to the annotation file for training
|
|
ann_file_val = 'data/kinetics400/kinetics400_val_list_rawframes.txt' # Path to the annotation file for validation
|
|
ann_file_test = 'data/kinetics400/kinetics400_val_list_rawframes.txt' # Path to the annotation file for testing
|
|
|
|
train_pipeline = [ # Training data processing pipeline
|
|
dict( # Config of SampleFrames
|
|
type='SampleFrames', # Sample frames pipeline, sampling frames from video
|
|
clip_len=1, # Frames of each sampled output clip
|
|
frame_interval=1, # Temporal interval of adjacent sampled frames
|
|
num_clips=3), # Number of clips to be sampled
|
|
dict( # Config of RawFrameDecode
|
|
type='RawFrameDecode'), # Load and decode Frames pipeline, picking raw frames with given indices
|
|
dict( # Config of Resize
|
|
type='Resize', # Resize pipeline
|
|
scale=(-1, 256)), # The scale to resize images
|
|
dict( # Config of MultiScaleCrop
|
|
type='MultiScaleCrop', # Multi scale crop pipeline, cropping images with a list of randomly selected scales
|
|
input_size=224, # Input size of the network
|
|
scales=(1, 0.875, 0.75, 0.66), # Scales of width and height to be selected
|
|
random_crop=False, # Whether to randomly sample cropping bbox
|
|
max_wh_scale_gap=1), # Maximum gap of w and h scale levels
|
|
dict( # Config of Resize
|
|
type='Resize', # Resize pipeline
|
|
scale=(224, 224), # The scale to resize images
|
|
keep_ratio=False), # Whether to keep the aspect ratio when resizing
|
|
dict( # Config of Flip
|
|
type='Flip', # Flip Pipeline
|
|
flip_ratio=0.5), # Probability of implementing flip
|
|
dict( # Config of FormatShape
|
|
type='FormatShape', # Format shape pipeline, format the final image shape to the given input_format
|
|
input_format='NCHW'), # Final image shape format
|
|
dict(type='PackActionInputs') # Config of PackActionInputs
|
|
]
|
|
val_pipeline = [ # Validation data processing pipeline
|
|
dict( # Config of SampleFrames
|
|
type='SampleFrames', # Sample frames pipeline, sampling frames from video
|
|
clip_len=1, # Frames of each sampled output clip
|
|
frame_interval=1, # Temporal interval of adjacent sampled frames
|
|
num_clips=3, # Number of clips to be sampled
|
|
test_mode=True), # Whether to set test mode in sampling
|
|
dict( # Config of RawFrameDecode
|
|
type='RawFrameDecode'), # Load and decode Frames pipeline, picking raw frames with given indices
|
|
dict( # Config of Resize
|
|
type='Resize', # Resize pipeline
|
|
scale=(-1, 256)), # The scale to resize images
|
|
dict( # Config of CenterCrop
|
|
type='CenterCrop', # Center crop pipeline, cropping the center area from images
|
|
crop_size=224), # The size to crop images
|
|
dict( # Config of Flip
|
|
type='Flip', # Flip pipeline
|
|
flip_ratio=0), # Probability of implementing flip
|
|
dict( # Config of FormatShape
|
|
type='FormatShape', # Format shape pipeline, format the final image shape to the given input_format
|
|
input_format='NCHW'), # Final image shape format
|
|
dict(type='PackActionInputs') # Config of PackActionInputs
|
|
]
|
|
test_pipeline = [ # Testing data processing pipeline
|
|
dict( # Config of SampleFrames
|
|
type='SampleFrames', # Sample frames pipeline, sampling frames from video
|
|
clip_len=1, # Frames of each sampled output clip
|
|
frame_interval=1, # Temporal interval of adjacent sampled frames
|
|
num_clips=25, # Number of clips to be sampled
|
|
test_mode=True), # Whether to set test mode in sampling
|
|
dict( # Config of RawFrameDecode
|
|
type='RawFrameDecode'), # Load and decode Frames pipeline, picking raw frames with given indices
|
|
dict( # Config of Resize
|
|
type='Resize', # Resize pipeline
|
|
scale=(-1, 256)), # The scale to resize images
|
|
dict( # Config of TenCrop
|
|
type='TenCrop', # Ten crop pipeline, cropping ten areas from images
|
|
crop_size=224), # The size to crop images
|
|
dict( # Config of Flip
|
|
type='Flip', # Flip pipeline
|
|
flip_ratio=0), # Probability of implementing flip
|
|
dict( # Config of FormatShape
|
|
type='FormatShape', # Format shape pipeline, format the final image shape to the given input_format
|
|
input_format='NCHW'), # Final image shape format
|
|
dict(type='PackActionInputs') # Config of PackActionInputs
|
|
]
|
|
|
|
train_dataloader = dict( # Config of train dataloader
|
|
batch_size=32, # Batch size of each single GPU during training
|
|
num_workers=8, # Workers to pre-fetch data for each single GPU during training
|
|
persistent_workers=True, # If `True`, the dataloader will not shut down the worker processes after an epoch end, which can accelerate training speed
|
|
sampler=dict(
|
|
type='DefaultSampler', # DefaultSampler which supports both distributed and non-distributed training. Refer to https://github.com/open-mmlab/mmengine/blob/main/mmengine/dataset/sampler.py
|
|
shuffle=True), # Randomly shuffle the training data in each epoch
|
|
dataset=dict( # Config of train dataset
|
|
type=dataset_type,
|
|
ann_file=ann_file_train, # Path of annotation file
|
|
data_prefix=dict(img=data_root), # Prefix of frame path
|
|
pipeline=train_pipeline))
|
|
val_dataloader = dict( # Config of validation dataloader
|
|
batch_size=1, # Batch size of each single GPU during validation
|
|
num_workers=8, # Workers to pre-fetch data for each single GPU during validation
|
|
persistent_workers=True, # If `True`, the dataloader will not shut down the worker processes after an epoch end
|
|
sampler=dict(
|
|
type='DefaultSampler',
|
|
shuffle=False), # Not shuffle during validation and testing
|
|
dataset=dict( # Config of validation dataset
|
|
type=dataset_type,
|
|
ann_file=ann_file_val, # Path of annotation file
|
|
data_prefix=dict(img=data_root_val), # Prefix of frame path
|
|
pipeline=val_pipeline,
|
|
test_mode=True))
|
|
test_dataloader = dict( # Config of test dataloader
|
|
batch_size=32, # Batch size of each single GPU during testing
|
|
num_workers=8, # Workers to pre-fetch data for each single GPU during testing
|
|
persistent_workers=True, # If `True`, the dataloader will not shut down the worker processes after an epoch end
|
|
sampler=dict(
|
|
type='DefaultSampler',
|
|
shuffle=False), # Not shuffle during validation and testing
|
|
dataset=dict( # Config of test dataset
|
|
type=dataset_type,
|
|
ann_file=ann_file_test, # Path of annotation file
|
|
data_prefix=dict(img=data_root_val), # Prefix of frame path
|
|
pipeline=test_pipeline,
|
|
test_mode=True))
|
|
|
|
# evaluation settings
|
|
val_evaluator = dict(type='AccMetric') # Config of validation evaluator
|
|
test_evaluator = val_evaluator # Config of testing evaluator
|
|
|
|
train_cfg = dict( # Config of training loop
|
|
type='EpochBasedTrainLoop', # Name of training loop
|
|
max_epochs=100, # Total training epochs
|
|
val_begin=1, # The epoch that begins validating
|
|
val_interval=1) # Validation interval
|
|
val_cfg = dict( # Config of validation loop
|
|
type='ValLoop') # Name of validation loop
|
|
test_cfg = dict( # Config of testing loop
|
|
type='TestLoop') # Name of testing loop
|
|
|
|
# learning policy
|
|
param_scheduler = [ # Parameter scheduler for updating optimizer parameters, support dict or list
|
|
dict(type='MultiStepLR', # Decays the learning rate once the number of epoch reaches one of the milestones
|
|
begin=0, # Step at which to start updating the learning rate
|
|
end=100, # Step at which to stop updating the learning rate
|
|
by_epoch=True, # Whether the scheduled learning rate is updated by epochs
|
|
milestones=[40, 80], # Steps to decay the learning rate
|
|
gamma=0.1)] # Multiplicative factor of learning rate decay
|
|
|
|
# optimizer
|
|
optim_wrapper = dict( # Config of optimizer wrapper
|
|
type='OptimWrapper', # Name of optimizer wrapper, switch to AmpOptimWrapper to enable mixed precision training
|
|
optimizer=dict( # Config of optimizer. Support all kinds of optimizers in PyTorch. Refer to https://pytorch.org/docs/stable/optim.html#algorithms
|
|
type='SGD', # Name of optimizer
|
|
lr=0.01, # Learning rate
|
|
momentum=0.9, # Momentum factor
|
|
weight_decay=0.0001), # Weight decay
|
|
clip_grad=dict(max_norm=40, norm_type=2)) # Config of gradient clip
|
|
|
|
# runtime settings
|
|
default_scope = 'mmaction' # The default registry scope to find modules. Refer to https://mmengine.readthedocs.io/en/latest/tutorials/registry.html
|
|
default_hooks = dict( # Hooks to execute default actions like updating model parameters and saving checkpoints.
|
|
runtime_info=dict(type='RuntimeInfoHook'), # The hook to update runtime information into the message hub
|
|
timer=dict(type='IterTimerHook'), # The hook used to record the time spent during each iteration
|
|
logger=dict(
|
|
type='LoggerHook', # The logger used to record logs during training/validation/testing phase
|
|
interval=20, # Interval to print the log
|
|
ignore_last=False), # Ignore the log of last iterations in each epoch
|
|
param_scheduler=dict(type='ParamSchedulerHook'), # The hook to update some hyper-parameters in optimizer
|
|
checkpoint=dict(
|
|
type='CheckpointHook', # The hook to save checkpoints periodically
|
|
interval=3, # The saving period
|
|
save_best='auto', # Specified metric to measure the best checkpoint during evaluation
|
|
max_keep_ckpts=3), # The maximum checkpoints to keep
|
|
sampler_seed=dict(type='DistSamplerSeedHook'), # Set the seed of the data-loading sampler for distributed training
|
|
sync_buffers=dict(type='SyncBuffersHook')) # Synchronize model buffers at the end of each epoch
|
|
env_cfg = dict( # Dict for setting environment
|
|
cudnn_benchmark=False, # Whether to enable cudnn benchmark
|
|
mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0), # Parameters to setup multiprocessing
|
|
dist_cfg=dict(backend='nccl')) # Parameters to setup distributed environment, the port can also be set
|
|
|
|
log_processor = dict(
|
|
type='LogProcessor', # Log processor used to format log information
|
|
window_size=20, # Default smooth interval
|
|
by_epoch=True) # Whether to format logs with epoch type
|
|
vis_backends = [ # List of visualization backends
|
|
dict(type='LocalVisBackend')] # Local visualization backend
|
|
visualizer = dict( # Config of visualizer
|
|
type='ActionVisualizer', # Name of visualizer
|
|
vis_backends=vis_backends)
|
|
log_level = 'INFO' # The level of logging
|
|
load_from = None # Load model checkpoint as a pre-trained model from a given path. This will not resume training.
|
|
resume = False # Whether to resume from the checkpoint defined in `load_from`. If `load_from` is None, it will resume the latest checkpoint in the `work_dir`.
|
|
```
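You can also load and inspect this config programmatically, which is equivalent to running `tools/analysis_tools/print_config.py`. The snippet below is a minimal sketch, assuming `mmengine` is installed and the MMAction2 root is the working directory.

```python
from mmengine.config import Config

cfg = Config.fromfile(
    'configs/recognition/tsn/'
    'tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_kinetics400-rgb.py')

print(cfg.model.cls_head.num_classes)   # nested fields support attribute access, e.g. 400
print(cfg.train_dataloader.batch_size)  # e.g. 32
print(cfg.pretty_text)                  # the fully merged config, as shown by print_config.py
```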
|
|
|
|
### Config System for Spatio-Temporal Action Detection
|
|
|
|
We incorporate modular design into our config system, which is convenient for conducting various experiments.
|
|
|
|
- An Example of FastRCNN
|
|
|
|
To help the users have a basic idea of a complete config structure and the modules in a spatio-temporal action detection system,
|
|
we make brief comments on the config of FastRCNN as follows.
|
|
For more detailed usage and alternatives for each parameter in each module, please refer to the API documentation.
|
|
|
|
```python
|
|
# model setting
|
|
model = dict( # Config of the model
|
|
type='FastRCNN', # Class name of the detector
|
|
_scope_='mmdet', # The scope of current config
|
|
backbone=dict( # Dict for backbone
|
|
type='ResNet3dSlowOnly', # Name of the backbone
|
|
depth=50, # Depth of ResNet model
|
|
pretrained=None, # The url/site of the pretrained model
|
|
pretrained2d=False, # If the pretrained model is 2D
|
|
lateral=False, # If the backbone is with lateral connections
|
|
num_stages=4, # Stages of ResNet model
|
|
conv1_kernel=(1, 7, 7), # Conv1 kernel size
|
|
conv1_stride_t=1, # Conv1 temporal stride
|
|
pool1_stride_t=1, # Pool1 temporal stride
|
|
spatial_strides=(1, 2, 2, 1)), # The spatial stride for each ResNet stage
|
|
roi_head=dict( # Dict for roi_head
|
|
type='AVARoIHead', # Name of the roi_head
|
|
bbox_roi_extractor=dict( # Dict for bbox_roi_extractor
|
|
type='SingleRoIExtractor3D', # Name of the bbox_roi_extractor
|
|
roi_layer_type='RoIAlign', # Type of the RoI op
|
|
output_size=8, # Output feature size of the RoI op
|
|
with_temporal_pool=True), # If temporal dim is pooled
|
|
bbox_head=dict( # Dict for bbox_head
|
|
type='BBoxHeadAVA', # Name of the bbox_head
|
|
in_channels=2048, # Number of channels of the input feature
|
|
num_classes=81, # Number of action classes + 1
|
|
multilabel=True, # If the dataset is multilabel
|
|
dropout_ratio=0.5), # The dropout ratio used
|
|
data_preprocessor=dict( # Dict for data preprocessor
|
|
type='ActionDataPreprocessor', # Name of data preprocessor
|
|
mean=[123.675, 116.28, 103.53], # Mean values of different channels to normalize
|
|
std=[58.395, 57.12, 57.375], # Std values of different channels to normalize
|
|
format_shape='NCHW')), # Final image shape format
|
|
# model training and testing settings
|
|
train_cfg=dict( # Training config of FastRCNN
|
|
rcnn=dict( # Dict for rcnn training config
|
|
assigner=dict( # Dict for assigner
|
|
type='MaxIoUAssignerAVA', # Name of the assigner
|
|
pos_iou_thr=0.9, # IoU threshold for positive examples, > pos_iou_thr -> positive
|
|
neg_iou_thr=0.9, # IoU threshold for negative examples, < neg_iou_thr -> negative
|
|
min_pos_iou=0.9), # Minimum acceptable IoU for positive examples
|
|
sampler=dict( # Dict for the sampler
|
|
type='RandomSampler', # Name of the sampler
|
|
num=32, # Batch Size of the sampler
|
|
pos_fraction=1, # Positive bbox fraction of the sampler
|
|
neg_pos_ub=-1, # Upper bound of the ratio of num negative to num positive
|
|
add_gt_as_proposals=True), # Add gt bboxes as proposals
|
|
pos_weight=1.0)), # Loss weight of positive examples
|
|
test_cfg=dict(rcnn=None)) # Testing config of FastRCNN
|
|
|
|
# dataset settings
|
|
dataset_type = 'AVADataset' # Type of dataset for training, validation and testing
|
|
data_root = 'data/ava/rawframes' # Root path to data
|
|
anno_root = 'data/ava/annotations' # Root path to annotations
|
|
|
|
ann_file_train = f'{anno_root}/ava_train_v2.1.csv' # Path to the annotation file for training
|
|
ann_file_val = f'{anno_root}/ava_val_v2.1.csv' # Path to the annotation file for validation
|
|
|
|
exclude_file_train = f'{anno_root}/ava_train_excluded_timestamps_v2.1.csv' # Path to the exclude annotation file for training
|
|
exclude_file_val = f'{anno_root}/ava_val_excluded_timestamps_v2.1.csv' # Path to the exclude annotation file for validation
|
|
|
|
label_file = f'{anno_root}/ava_action_list_v2.1_for_activitynet_2018.pbtxt' # Path to the label file
|
|
|
|
proposal_file_train = f'{anno_root}/ava_dense_proposals_train.FAIR.recall_93.9.pkl' # Path to the human detection proposals for training examples
|
|
proposal_file_val = f'{anno_root}/ava_dense_proposals_val.FAIR.recall_93.9.pkl' # Path to the human detection proposals for validation examples
|
|
|
|
train_pipeline = [ # Training data processing pipeline
|
|
dict( # Config of SampleFrames
|
|
type='AVASampleFrames', # Sample frames pipeline, sampling frames from video
|
|
clip_len=4, # Frames of each sampled output clip
|
|
frame_interval=16), # Temporal interval of adjacent sampled frames
|
|
dict( # Config of RawFrameDecode
|
|
type='RawFrameDecode'), # Load and decode Frames pipeline, picking raw frames with given indices
|
|
dict( # Config of RandomRescale
|
|
type='RandomRescale', # Randomly rescale the short edge within a given range
|
|
scale_range=(256, 320)), # The short-edge size range of RandomRescale
|
|
dict( # Config of RandomCrop
|
|
type='RandomCrop', # Randomly crop a patch with the given size
|
|
size=256), # The size of the cropped patch
|
|
dict( # Config of Flip
|
|
type='Flip', # Flip Pipeline
|
|
flip_ratio=0.5), # Probability of implementing flip
|
|
dict( # Config of FormatShape
|
|
type='FormatShape', # Format shape pipeline, format the final image shape to the given input_format
|
|
input_format='NCTHW', # Final image shape format
|
|
collapse=True), # Collapse the dim N if N == 1
|
|
dict(type='PackActionInputs') # Pack input data
|
|
]
|
|
|
|
val_pipeline = [ # Validation data processing pipeline
|
|
dict( # Config of SampleFrames
|
|
type='AVASampleFrames', # Sample frames pipeline, sampling frames from video
|
|
clip_len=4, # Frames of each sampled output clip
|
|
frame_interval=16), # Temporal interval of adjacent sampled frames
|
|
dict( # Config of RawFrameDecode
|
|
type='RawFrameDecode'), # Load and decode Frames pipeline, picking raw frames with given indices
|
|
dict( # Config of Resize
|
|
type='Resize', # Resize pipeline
|
|
scale=(-1, 256)), # The scale to resize images
|
|
dict( # Config of FormatShape
|
|
type='FormatShape', # Format shape pipeline, format the final image shape to the given input_format
|
|
input_format='NCTHW', # Final image shape format
|
|
collapse=True), # Collapse the dim N if N == 1
|
|
dict(type='PackActionInputs') # Pack input data
|
|
]
|
|
|
|
train_dataloader = dict( # Config of train dataloader
|
|
batch_size=32, # Batch size of each single GPU during training
|
|
num_workers=8, # Workers to pre-fetch data for each single GPU during training
|
|
persistent_workers=True, # If `True`, the dataloader will not shut down the worker processes after an epoch end, which can accelerate training speed
|
|
sampler=dict(
|
|
type='DefaultSampler', # DefaultSampler which supports both distributed and non-distributed training. Refer to https://github.com/open-mmlab/mmengine/blob/main/mmengine/dataset/sampler.py
|
|
shuffle=True), # Randomly shuffle the training data in each epoch
|
|
dataset=dict( # Config of train dataset
|
|
type=dataset_type,
|
|
ann_file=ann_file_train, # Path of annotation file
|
|
exclude_file=exclude_file_train, # Path of exclude annotation file
|
|
label_file=label_file, # Path of label file
|
|
data_prefix=dict(img=data_root), # Prefix of frame path
|
|
proposal_file=proposal_file_train, # Path of human detection proposals
|
|
pipeline=train_pipeline))
|
|
val_dataloader = dict( # Config of validation dataloader
|
|
batch_size=1, # Batch size of each single GPU during evaluation
|
|
num_workers=8, # Workers to pre-fetch data for each single GPU during evaluation
|
|
persistent_workers=True, # If `True`, the dataloader will not shut down the worker processes after an epoch end
|
|
sampler=dict(
|
|
type='DefaultSampler',
|
|
shuffle=False), # Not shuffle during validation and testing
|
|
dataset=dict( # Config of validation dataset
|
|
type=dataset_type,
|
|
ann_file=ann_file_val, # Path of annotation file
|
|
exclude_file=exclude_file_val, # Path of exclude annotation file
|
|
label_file=label_file, # Path of label file
|
|
data_prefix=dict(img=data_root), # Prefix of frame path
|
|
proposal_file=proposal_file_val, # Path of human detection proposals
|
|
pipeline=val_pipeline,
|
|
test_mode=True))
|
|
test_dataloader = val_dataloader # Config of testing dataloader
|
|
|
|
# evaluation settings
|
|
val_evaluator = dict( # Config of validation evaluator
|
|
type='AVAMetric',
|
|
ann_file=ann_file_val,
|
|
label_file=label_file,
|
|
exclude_file=exclude_file_val)
|
|
test_evaluator = val_evaluator # Config of testing evaluator
|
|
|
|
train_cfg = dict( # Config of training loop
|
|
type='EpochBasedTrainLoop', # Name of training loop
|
|
max_epochs=20, # Total training epochs
|
|
val_begin=1, # The epoch that begins validating
|
|
val_interval=1) # Validation interval
|
|
val_cfg = dict( # Config of validation loop
|
|
type='ValLoop') # Name of validation loop
|
|
test_cfg = dict( # Config of testing loop
|
|
type='TestLoop') # Name of testing loop
|
|
|
|
# learning policy
|
|
param_scheduler = [ # Parameter scheduler for updating optimizer parameters, support dict or list
|
|
dict(type='LinearLR', # Decays the learning rate of each parameter group by linearly changing small multiplicative factor
|
|
start_factor=0.1, # The number we multiply learning rate in the first epoch
|
|
by_epoch=True, # Whether the scheduled learning rate is updated by epochs
|
|
begin=0, # Step at which to start updating the learning rate
|
|
end=5), # Step at which to stop updating the learning rate
|
|
dict(type='MultiStepLR', # Decays the learning rate once the number of epoch reaches one of the milestones
|
|
begin=0, # Step at which to start updating the learning rate
|
|
end=20, # Step at which to stop updating the learning rate
|
|
by_epoch=True, # Whether the scheduled learning rate is updated by epochs
|
|
milestones=[10, 15], # Steps to decay the learning rate
|
|
gamma=0.1)] # Multiplicative factor of learning rate decay
|
|
|
|
# optimizer
|
|
optim_wrapper = dict( # Config of optimizer wrapper
|
|
type='OptimWrapper', # Name of optimizer wrapper, switch to AmpOptimWrapper to enable mixed precision training
|
|
optimizer=dict( # Config of optimizer. Support all kinds of optimizers in PyTorch. Refer to https://pytorch.org/docs/stable/optim.html#algorithms
|
|
type='SGD', # Name of optimizer
|
|
lr=0.2, # Learning rate
|
|
momentum=0.9, # Momentum factor
|
|
weight_decay=0.0001), # Weight decay
|
|
clip_grad=dict(max_norm=40, norm_type=2)) # Config of gradient clip
|
|
|
|
# runtime settings
|
|
default_scope = 'mmaction' # The default registry scope to find modules. Refer to https://mmengine.readthedocs.io/en/latest/tutorials/registry.html
|
|
default_hooks = dict( # Hooks to execute default actions like updating model parameters and saving checkpoints.
|
|
runtime_info=dict(type='RuntimeInfoHook'), # The hook to update runtime information into the message hub
|
|
timer=dict(type='IterTimerHook'), # The hook used to record the time spent during each iteration
|
|
logger=dict(
|
|
type='LoggerHook', # The logger used to record logs during training/validation/testing phase
|
|
interval=20, # Interval to print the log
|
|
ignore_last=False), # Ignore the log of last iterations in each epoch
|
|
param_scheduler=dict(type='ParamSchedulerHook'), # The hook to update some hyper-parameters in optimizer
|
|
checkpoint=dict(
|
|
type='CheckpointHook', # The hook to save checkpoints periodically
|
|
interval=3, # The saving period
|
|
save_best='auto', # Specified metric to measure the best checkpoint during evaluation
|
|
max_keep_ckpts=3), # The maximum checkpoints to keep
|
|
sampler_seed=dict(type='DistSamplerSeedHook'), # Set the seed of the data-loading sampler for distributed training
|
|
sync_buffers=dict(type='SyncBuffersHook')) # Synchronize model buffers at the end of each epoch
|
|
env_cfg = dict( # Dict for setting environment
|
|
cudnn_benchmark=False, # Whether to enable cudnn benchmark
|
|
mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0), # Parameters to setup multiprocessing
|
|
dist_cfg=dict(backend='nccl')) # Parameters to setup distributed environment, the port can also be set
|
|
|
|
log_processor = dict(
|
|
type='LogProcessor', # Log processor used to format log information
|
|
window_size=20, # Default smooth interval
|
|
by_epoch=True) # Whether to format logs with epoch type
|
|
vis_backends = [ # List of visualization backends
|
|
dict(type='LocalVisBackend')] # Local visualization backend
|
|
visualizer = dict( # Config of visualizer
|
|
type='ActionVisualizer', # Name of visualizer
|
|
vis_backends=vis_backends)
|
|
log_level = 'INFO' # The level of logging
|
|
load_from = ('https://download.openmmlab.com/mmaction/v1.0/recognition/slowonly/'
|
|
'slowonly_imagenet-pretrained-r50_8xb16-4x16x1-steplr-150e_kinetics400-rgb/'
|
|
'slowonly_imagenet-pretrained-r50_8xb16-4x16x1-steplr-150e_kinetics400-rgb_20220901-e7b65fad.pth') # Load model checkpoint as a pre-trained model from a given path. This will not resume training.
|
|
resume = False # Whether to resume from the checkpoint defined in `load_from`. If `load_from` is None, it will resume the latest checkpoint in the `work_dir`.
|
|
```
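To adapt such a detection config to your own data, a common pattern is to inherit it and override only the nested fields that change. The sketch below is hypothetical: the `_base_` path and the class count are placeholders for your own setup.

```python
# Hypothetical fine-tuning config for an AVA-style dataset with 20 action classes.
# The `_base_` path is a placeholder; point it at the detection config you start from.
_base_ = ['./slowonly_kinetics400-pretrained-r50_8xb16-4x16x1-20e_ava21-rgb.py']

# Only the leaf key is replaced; the sibling keys of bbox_head (in_channels,
# multilabel, dropout_ratio) are inherited from the base config unchanged.
model = dict(roi_head=dict(bbox_head=dict(num_classes=21)))  # 20 actions + 1
```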
|
|
|
|
### Config System for Action Localization
|
|
|
|
We incorporate modular design into our config system,
|
|
which is convenient for conducting various experiments.
|
|
|
|
- An Example of BMN
|
|
|
|
To help the users have a basic idea of a complete config structure and the modules in an action localization system,
|
|
we make brief comments on the config of BMN as follows.
|
|
For more detailed usage and alternatives for each parameter in each module, please refer to the [API documentation](https://mmaction2.readthedocs.io/en/latest/api.html).
|
|
|
|
```python
|
|
# model settings
|
|
model = dict( # Config of the model
|
|
type='BMN', # Class name of the localizer
|
|
temporal_dim=100, # Total frames selected for each video
|
|
boundary_ratio=0.5, # Ratio for determining video boundaries
|
|
num_samples=32, # Number of samples for each proposal
|
|
num_samples_per_bin=3, # Number of bin samples for each sample
|
|
feat_dim=400, # Dimension of feature
|
|
soft_nms_alpha=0.4, # Soft NMS alpha
|
|
soft_nms_low_threshold=0.5, # Soft NMS low threshold
|
|
soft_nms_high_threshold=0.9, # Soft NMS high threshold
|
|
post_process_top_k=100) # Top k proposals in post process
|
|
|
|
# dataset settings
|
|
dataset_type = 'ActivityNetDataset' # Type of dataset for training, validation and testing
|
|
data_root = 'data/activitynet_feature_cuhk/csv_mean_100/' # Root path to data for training
|
|
data_root_val = 'data/activitynet_feature_cuhk/csv_mean_100/' # Root path to data for validation and testing
|
|
ann_file_train = 'data/ActivityNet/anet_anno_train.json' # Path to the annotation file for training
|
|
ann_file_val = 'data/ActivityNet/anet_anno_val.json' # Path to the annotation file for validation
|
|
ann_file_test = 'data/ActivityNet/anet_anno_test.json' # Path to the annotation file for testing
|
|
|
|
train_pipeline = [ # Training data processing pipeline
|
|
dict(type='LoadLocalizationFeature'), # Load localization feature pipeline
|
|
dict(type='GenerateLocalizationLabels'), # Generate localization labels pipeline
|
|
dict(
|
|
type='PackLocalizationInputs', # Pack localization data
|
|
keys=('gt_bbox', ), # Keys of input
|
|
meta_keys=('video_name', ))] # Meta keys of input
|
|
val_pipeline = [ # Validation data processing pipeline
|
|
dict(type='LoadLocalizationFeature'), # Load localization feature pipeline
|
|
dict(type='GenerateLocalizationLabels'), # Generate localization labels pipeline
|
|
dict(
|
|
type='PackLocalizationInputs', # Pack localization data
|
|
keys=('gt_bbox', ), # Keys of input
|
|
meta_keys=('video_name', 'duration_second', 'duration_frame',
|
|
'annotations', 'feature_frame'))] # Meta keys of input
|
|
test_pipeline = [ # Testing data processing pipeline
|
|
dict(type='LoadLocalizationFeature'), # Load localization feature pipeline
|
|
dict(
|
|
type='PackLocalizationInputs', # Pack localization data
|
|
keys=('gt_bbox', ), # Keys of input
|
|
meta_keys=('video_name', 'duration_second', 'duration_frame',
|
|
'annotations', 'feature_frame'))] # Meta keys of input
|
|
train_dataloader = dict( # Config of train dataloader
|
|
batch_size=8, # Batch size of each single GPU during training
|
|
num_workers=8, # Workers to pre-fetch data for each single GPU during training
|
|
persistent_workers=True, # If `True`, the dataloader will not shut down the worker processes after an epoch end, which can accelerate training speed
|
|
sampler=dict(
|
|
type='DefaultSampler', # DefaultSampler which supports both distributed and non-distributed training. Refer to https://github.com/open-mmlab/mmengine/blob/main/mmengine/dataset/sampler.py
|
|
shuffle=True), # Randomly shuffle the training data in each epoch
|
|
dataset=dict( # Config of train dataset
|
|
type=dataset_type,
|
|
ann_file=ann_file_train, # Path of annotation file
|
|
data_prefix=dict(video=data_root), # Prefix of video path
|
|
pipeline=train_pipeline))
|
|
val_dataloader = dict( # Config of validation dataloader
|
|
batch_size=1, # Batch size of each single GPU during evaluation
|
|
num_workers=8, # Workers to pre-fetch data for each single GPU during evaluation
|
|
persistent_workers=True, # If `True`, the dataloader will not shut down the worker processes after an epoch end
|
|
sampler=dict(
|
|
type='DefaultSampler',
|
|
shuffle=False), # Not shuffle during validation and testing
|
|
dataset=dict( # Config of validation dataset
|
|
type=dataset_type,
|
|
ann_file=ann_file_val, # Path of annotation file
|
|
data_prefix=dict(video=data_root_val), # Prefix of video path
|
|
pipeline=val_pipeline,
|
|
test_mode=True))
|
|
test_dataloader = dict( # Config of test dataloader
|
|
batch_size=1, # Batch size of each single GPU during testing
|
|
num_workers=8, # Workers to pre-fetch data for each single GPU during testing
|
|
persistent_workers=True, # If `True`, the dataloader will not shut down the worker processes after an epoch end
|
|
sampler=dict(
|
|
type='DefaultSampler',
|
|
shuffle=False), # Not shuffle during validation and testing
|
|
dataset=dict( # Config of test dataset
|
|
type=dataset_type,
|
|
ann_file=ann_file_val, # Path of annotation file
|
|
data_prefix=dict(video=data_root_val), # Prefix of video path
|
|
pipeline=test_pipeline,
|
|
test_mode=True))
|
|
|
|
# evaluation settings
|
|
work_dir = './work_dirs/bmn_400x100_2x8_9e_activitynet_feature/' # Directory to save the model checkpoints and logs for the current experiments
|
|
val_evaluator = dict(
|
|
type='ANetMetric',
|
|
metric_type='AR@AN',
|
|
dump_config=dict( # Config of localization output
|
|
out=f'{work_dir}/results.json', # Path to the output file
|
|
output_format='json')) # File format of the output file
|
|
test_evaluator = val_evaluator # Set test_evaluator as val_evaluator
|
|
|
|
max_epochs = 9 # Total epochs to train the model
|
|
train_cfg = dict( # Config of training loop
|
|
type='EpochBasedTrainLoop', # Name of training loop
|
|
max_epochs=max_epochs, # Total training epochs
|
|
val_begin=1, # The epoch that begins validating
|
|
val_interval=1) # Validation interval
|
|
val_cfg = dict( # Config of validation loop
|
|
type='ValLoop') # Name of validating loop
|
|
test_cfg = dict( # Config of testing loop
|
|
type='TestLoop') # Name of testing loop
|
|
|
|
# learning policy
|
|
param_scheduler = [ # Parameter scheduler for updating optimizer parameters, support dict or list
|
|
dict(type='MultiStepLR', # Decays the learning rate once the number of epoch reaches one of the milestones
|
|
begin=0, # Step at which to start updating the learning rate
|
|
end=max_epochs, # Step at which to stop updating the learning rate
|
|
by_epoch=True, # Whether the scheduled learning rate is updated by epochs
|
|
milestones=[7, ], # Steps to decay the learning rate
|
|
gamma=0.1)] # Multiplicative factor of parameter value decay
|
|
|
|
# optimizer
|
|
optim_wrapper = dict( # Config of optimizer wrapper
|
|
type='OptimWrapper', # Name of optimizer wrapper, switch to AmpOptimWrapper to enable mixed precision training
|
|
optimizer=dict( # Config of optimizer. Support all kinds of optimizers in PyTorch. Refer to https://pytorch.org/docs/stable/optim.html#algorithms
|
|
type='Adam', # Name of optimizer
|
|
lr=0.001, # Learning rate
|
|
weight_decay=0.0001), # Weight decay
|
|
clip_grad=dict(max_norm=40, norm_type=2)) # Config of gradient clip
|
|
|
|
# runtime settings
|
|
default_scope = 'mmaction' # The default registry scope to find modules. Refer to https://mmengine.readthedocs.io/en/latest/tutorials/registry.html
|
|
default_hooks = dict( # Hooks to execute default actions like updating model parameters and saving checkpoints.
|
|
runtime_info=dict(type='RuntimeInfoHook'), # The hook to update runtime information into the message hub
|
|
timer=dict(type='IterTimerHook'), # The hook used to record the time spent during each iteration
|
|
logger=dict(
|
|
type='LoggerHook', # The logger used to record logs during training/validation/testing phase
|
|
interval=20, # Interval to print the log
|
|
ignore_last=False), # Ignore the log of last iterations in each epoch
|
|
param_scheduler=dict(type='ParamSchedulerHook'), # The hook to update some hyper-parameters in optimizer
|
|
checkpoint=dict(
|
|
type='CheckpointHook', # The hook to save checkpoints periodically
|
|
interval=3, # The saving period
|
|
save_best='auto', # Specified metric to measure the best checkpoint during evaluation
|
|
max_keep_ckpts=3), # The maximum checkpoints to keep
|
|
sampler_seed=dict(type='DistSamplerSeedHook'), # Set the seed of the data-loading sampler for distributed training
|
|
sync_buffers=dict(type='SyncBuffersHook')) # Synchronize model buffers at the end of each epoch
|
|
env_cfg = dict( # Dict for setting environment
|
|
cudnn_benchmark=False, # Whether to enable cudnn benchmark
|
|
mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0), # Parameters to setup multiprocessing
|
|
dist_cfg=dict(backend='nccl')) # Parameters to setup distributed environment, the port can also be set
|
|
|
|
log_processor = dict(
|
|
type='LogProcessor', # Log processor used to format log information
|
|
window_size=20, # Default smooth interval
|
|
by_epoch=True) # Whether to format logs with epoch type
|
|
vis_backends = [ # List of visualization backends
|
|
dict(type='LocalVisBackend')] # Local visualization backend
|
|
visualizer = dict( # Config of visualizer
|
|
type='ActionVisualizer', # Name of visualizer
|
|
vis_backends=vis_backends)
|
|
log_level = 'INFO' # The level of logging
|
|
load_from = None # Load model checkpoint as a pre-trained model from a given path. This will not resume training.
|
|
resume = False # Whether to resume from the checkpoint defined in `load_from`. If `load_from` is None, it will resume the latest checkpoint in the `work_dir`.
|
|
```
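To see how a model dict like the one above is turned into an actual module: mmengine-style registries look up the class named in `type` and pass the remaining keys as keyword arguments. The following is a minimal sketch, assuming MMAction2 1.x is installed (the registry import path is taken from that version).

```python
from mmaction.registry import MODELS

model_cfg = dict(
    type='BMN',
    temporal_dim=100,
    boundary_ratio=0.5,
    num_samples=32,
    num_samples_per_bin=3,
    feat_dim=400,
    soft_nms_alpha=0.4,
    soft_nms_low_threshold=0.5,
    soft_nms_high_threshold=0.9,
    post_process_top_k=100)

# Roughly equivalent to calling BMN(temporal_dim=100, ..., post_process_top_k=100).
localizer = MODELS.build(model_cfg)
print(type(localizer).__name__)  # 'BMN'
```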
|
|
|