Visualization
Before reading this tutorial, it is recommended to read MMEngine's {external+mmengine:doc}`MMEngine: Visualization <advanced_tutorials/visualization>` documentation to get a first glimpse of the Visualizer definition and usage.
In brief, the Visualizer is implemented in MMEngine to meet everyday visualization needs, and provides three main functions:
- Implement common drawing APIs, such as draw_bboxes for drawing bounding boxes and draw_lines for drawing lines.
- Support writing visualization results, learning rate curves, loss curves, and validation accuracy curves to various backends, including local disks and common deep learning training logging tools such as TensorBoard and Wandb.
- Support calling anywhere in the code to visualize or record intermediate states of the model during training or testing, such as feature maps and validation results.
Based on MMEngine's Visualizer, MMOCR comes with a variety of pre-built visualization tools that users can enable simply by modifying the configuration file.
- The tools/analysis_tools/browse_dataset.py script provides a dataset visualization function that draws images and their corresponding annotations after Data Transforms, as described in browse_dataset.py.
- MMEngine implements LoggerHook, which uses the Visualizer to write the learning rate, loss, and evaluation results to the backend set by the Visualizer. By changing the Visualizer backend in the configuration file, for example to TensorboardVisBackend or WandbVisBackend, you can log to common training logging tools such as TensorBoard or Wandb and use them to analyze and monitor the training process.
- MMOCR implements VisualizationHook, which uses the Visualizer to visualize or store the prediction results of the validation or test phase in the backend set by the Visualizer. By changing the Visualizer backend in the configuration file, for example to TensorboardVisBackend or WandbVisBackend, you can store the predicted images in TensorBoard or Wandb.
Configuration
Thanks to the registration mechanism, in MMOCR we can set the behavior of the Visualizer by modifying the configuration file. Usually, we define the default configuration for the visualizer in task/_base_/default_runtime.py. See the configuration tutorial for details.
vis_backends = [dict(type='LocalVisBackend')]
visualizer = dict(
    type='TextxxxLocalVisualizer',  # use different visualizers for different tasks
    vis_backends=vis_backends,
    name='visualizer')
Based on the above example, we can see that the configuration of Visualizer consists of two main parts, namely, the type of Visualizer and the visualization backend vis_backends it uses.
- For different OCR tasks, various visualizers are pre-configured in MMOCR, including TextDetLocalVisualizer, TextRecogLocalVisualizer, TextSpottingLocalVisualizer and KIELocalVisualizer. These visualizers extend the basic Visualizer API according to the characteristics of their tasks and implement the corresponding annotation interface add_datasample. For example, users can directly use TextDetLocalVisualizer to visualize labels or predictions for text detection tasks.
- MMOCR sets the visualization backend vis_backends to the local visualization backend LocalVisBackend by default, saving all visualization results and other training information in a local folder.
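As a concrete instance of the template above, a text detection config could fill in the visualizer type as sketched here. This is a minimal illustration that only uses names mentioned in this tutorial; the actual default_runtime.py of a given task may contain additional settings.

```python
# Sketch of a visualizer config for a text detection task.
# Only the `type` differs from the generic template shown earlier.
vis_backends = [dict(type='LocalVisBackend')]  # default: save results locally
visualizer = dict(
    type='TextDetLocalVisualizer',  # task-specific visualizer for text detection
    vis_backends=vis_backends,      # backends that receive the drawn results
    name='visualizer')
```

For the other tasks, the same structure would be kept and only the type swapped for TextRecogLocalVisualizer, TextSpottingLocalVisualizer or KIELocalVisualizer.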
Storage
MMOCR uses the local visualization backend LocalVisBackend by default. The information recorded by VisualizationHook and LoggerHook, including the loss, learning rate, evaluation accuracy, and visualization results, is saved to the {work_dir}/{config_name}/{time}/{vis_data} folder by default. MMOCR also supports other common visualization backends, such as TensorboardVisBackend and WandbVisBackend; you only need to change the vis_backends type in the configuration file to the corresponding backend. For example, you can store data to TensorBoard and Wandb by inserting the following code block into the configuration file.
_base_.visualizer.vis_backends = [
    dict(type='LocalVisBackend'),
    dict(type='TensorboardVisBackend'),
    dict(type='WandbVisBackend'),
]
Plot
Plot the prediction results
MMOCR mainly uses VisualizationHook to plot the prediction results of validation and testing. VisualizationHook is off by default, and its default configuration is as follows.
visualization=dict(  # used for visualization of validation and test results
    type='VisualizationHook',
    enable=False,
    interval=1,
    show=False,
    draw_gt=False,
    draw_pred=False)
The following table shows the parameters supported by VisualizationHook.
| Parameters | Description |
|---|---|
| enable | Turns VisualizationHook on or off. It is off (False) by default. |
| interval | When VisualizationHook is enabled, the interval, in iterations, at which val or test results are stored or displayed. |
| show | Whether to display the val or test results. |
| draw_gt | Whether to draw the ground-truth annotations on the val or test results. |
| draw_pred | Whether to draw the predictions on the val or test results. |
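The way these parameters interact can be sketched in plain Python. This is a simplified model of the behavior described in the table, not MMOCR's actual implementation; the helper names should_draw and layers_to_draw are hypothetical, and counting steps from 0 is an assumption.

```python
def should_draw(enable: bool, interval: int, step: int) -> bool:
    """Return True if the results at `step` should be stored or displayed."""
    if not enable:  # a disabled hook never draws anything
        return False
    return step % interval == 0  # draw once every `interval` iterations


def layers_to_draw(draw_gt: bool, draw_pred: bool) -> list:
    """List which layers would be drawn on each visualized image."""
    layers = []
    if draw_gt:
        layers.append('gt')    # ground-truth annotations
    if draw_pred:
        layers.append('pred')  # model predictions
    return layers


# With interval=2, every second iteration is visualized.
steps = [s for s in range(6) if should_draw(True, 2, s)]
print(steps)                        # [0, 2, 4]
print(layers_to_draw(False, True))  # ['pred']
```

Note that with the default enable=False nothing is drawn regardless of the other settings, which is why the examples in this section always set enable=True first.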
To enable VisualizationHook during training or testing, you only need to modify the configuration. Taking dbnet_resnet18_fpnc_1200e_icdar2015.py as an example, to draw annotations and predictions at the same time and display the images, the configuration can be modified as follows.
visualization = _base_.default_hooks.visualization
visualization.update(
    dict(enable=True, show=True, draw_gt=True, draw_pred=True))
If you only want to see the predictions, set draw_pred=True and leave draw_gt=False.
visualization = _base_.default_hooks.visualization
visualization.update(
    dict(enable=True, show=True, draw_gt=False, draw_pred=True))
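To see what the update call does, the snippet below models the inherited settings with plain dictionaries. This is only a sketch: real configs use MMEngine's config objects and the _base_ inheritance mechanism, for which the default_hooks dict here is an assumed stand-in, with values copied from the default configuration shown earlier.

```python
# Stand-in for the inherited `_base_.default_hooks` (assumption: plain dicts).
default_hooks = dict(
    visualization=dict(
        type='VisualizationHook',
        enable=False,
        interval=1,
        show=False,
        draw_gt=False,
        draw_pred=False))

# `visualization` refers to the same object as the inherited entry, not a copy,
# so updating it changes the inherited hook settings in place.
visualization = default_hooks['visualization']
visualization.update(
    dict(enable=True, show=True, draw_gt=False, draw_pred=True))

print(default_hooks['visualization']['enable'])    # True
print(default_hooks['visualization']['interval'])  # 1 (untouched keys keep their defaults)
```

Only the keys passed to update are overwritten, so type and interval keep the values inherited from the base configuration.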
To simplify this further, test.py provides the --show and --show-dir parameters, which visualize the annotation and prediction results during testing without modifying the configuration.
# Show test results
python tools/test.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py dbnet_r18_fpnc_1200e_icdar2015/epoch_400.pth --show
# Specify where to store the prediction results
python tools/test.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py dbnet_r18_fpnc_1200e_icdar2015/epoch_400.pth --show-dir imgs/