[Docs] vis doc (#1353)

* vis doc

* fix comment

* fix link
liukuikun 2022-08-31 21:28:29 +08:00 committed by GitHub
parent c91b028772
commit dbb346afed
2 changed files with 212 additions and 0 deletions


@ -1 +1,107 @@
# Visualization
Before reading this tutorial, it is recommended to read MMEngine's [Visualization](https://github.com/open-mmlab/mmengine/blob/main/docs/en/tutorials/visualization.md) documentation to get a first glimpse of the `Visualizer` definition and usage.
In brief, MMEngine implements the [`Visualizer`](mmengine.visualization.Visualizer) to meet everyday visualization needs. It provides three main features:
- Implement common drawing APIs, such as [`draw_bboxes`](mmengine.visualization.Visualizer.draw_bboxes) for drawing bounding boxes and [`draw_lines`](mmengine.visualization.Visualizer.draw_lines) for drawing lines.
- Support writing visualization results, learning rate curves, loss curves, and validation accuracy curves to various backends, including the local disk and common deep learning training logging tools such as [TensorBoard](https://www.tensorflow.org/tensorboard) and [WandB](https://wandb.ai/site).
- Support calling anywhere in the code to visualize or record intermediate states of the model during training or testing, such as feature maps and validation results.
Based on MMEngine's Visualizer, MMOCR comes with a variety of pre-built visualization tools that users can enable by simply modifying the configuration file:
- The `tools/analysis_tools/browse_dataset.py` script provides a dataset visualization function that draws images and corresponding annotations after Data Transforms, as described in [`browse_dataset.py`](useful_tools.md).
- MMEngine implements `LoggerHook`, which uses the `Visualizer` to write the learning rate, loss, and evaluation results to the backends configured for the `Visualizer`. By changing the `Visualizer` backend in the configuration file, for example to `TensorboardVisBackend` or `WandbVisBackend`, you can send these logs to common training logging tools such as TensorBoard or WandB and use them to analyze and monitor the training process.
- MMOCR implements `VisualizationHook`, which uses the `Visualizer` to display the prediction results of the validation or test phase or to store them in the backends configured for the `Visualizer`. By changing the `Visualizer` backend in the configuration file, for example to `TensorboardVisBackend` or `WandbVisBackend`, you can store the predicted images in TensorBoard or WandB.
## Configuration
Thanks to the registry mechanism, the behavior of the `Visualizer` in MMOCR can be set by modifying the configuration file. The default visualizer configuration is usually defined in `task/_base_/default_runtime.py`; see the [configuration tutorial](config.md) for details.
```Python
vis_backends = [dict(type='LocalVisBackend')]
visualizer = dict(
    type='TextxxxLocalVisualizer',  # use different visualizers for different tasks
    vis_backends=vis_backends,
    name='visualizer')
```
Based on the above example, we can see that the configuration of `Visualizer` consists of two main parts, namely, the type of `Visualizer` and the visualization backend `vis_backends` it uses.
- For different OCR tasks, various visualizers are pre-configured in MMOCR, including [`TextDetLocalVisualizer`](mmocr.visualization.TextDetLocalVisualizer), [`TextRecogLocalVisualizer`](mmocr.visualization.TextRecogLocalVisualizer), [`TextSpottingLocalVisualizer`](mmocr.visualization.TextSpottingLocalVisualizer) and [`KIELocalVisualizer`](mmocr.visualization.KIELocalVisualizer). These visualizers extend the basic Visualizer API according to the characteristics of their tasks and implement the corresponding annotation interface `add_datasamples`. For example, users can directly use `TextDetLocalVisualizer` to visualize labels or predictions for text detection tasks.
- MMOCR sets the visualization backend `vis_backends` to the local visualization backend `LocalVisBackend` by default, which saves all visualization results and other training information in a local folder.
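The way a `type` string in a config turns into an object can be illustrated with a toy registry. This is a minimal sketch of the general idea only, not MMEngine's implementation: `REGISTRY`, `register`, `build`, `ToyLocalBackend`, and `ToyVisualizer` are all hypothetical names invented for this example.

```Python
# Toy registry: an illustration of the idea, NOT MMEngine's implementation.
REGISTRY = {}


def register(cls):
    """Record a class under its own name so a config can refer to it by string."""
    REGISTRY[cls.__name__] = cls
    return cls


def build(cfg):
    """Instantiate the class named by cfg['type'], passing the other keys as kwargs."""
    kwargs = dict(cfg)  # copy so the caller's config is not mutated
    cls = REGISTRY[kwargs.pop('type')]
    return cls(**kwargs)


@register
class ToyLocalBackend:  # hypothetical stand-in for a visualization backend
    def __init__(self, save_dir='vis_data'):
        self.save_dir = save_dir


@register
class ToyVisualizer:  # hypothetical stand-in for a task-specific visualizer
    def __init__(self, vis_backends, name='visualizer'):
        self.name = name
        # Nested backend configs are built the same way.
        self.vis_backends = [build(backend) for backend in vis_backends]


vis_backends = [dict(type='ToyLocalBackend')]
visualizer = build(dict(
    type='ToyVisualizer',
    vis_backends=vis_backends,
    name='visualizer'))
print(type(visualizer).__name__,
      [type(b).__name__ for b in visualizer.vis_backends])
```

Because the class is looked up by name, switching visualizers or backends only requires changing the `type` strings in the config, which is the property the configuration above relies on.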
## Storage
MMOCR uses the local visualization backend [`LocalVisBackend`](mmengine.visualization.LocalVisBackend) by default. The information stored by `VisualizationHook` and `LoggerHook`, including the loss, learning rate, evaluation accuracy, and visualization results, is saved to the `{work_dir}/{config_name}/{time}/{vis_data}` folder by default. MMOCR also supports other common visualization backends, such as `TensorboardVisBackend` and `WandbVisBackend`; you only need to change the `vis_backends` type in the configuration file to the corresponding backend. For example, inserting the following code block into the configuration file stores the data to TensorBoard and WandB in addition to the local folder.
```Python
_base_.visualizer.vis_backends = [
    dict(type='LocalVisBackend'),
    dict(type='TensorboardVisBackend'),
    dict(type='WandbVisBackend'),
]
```
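Listing several backends means that each record is written to every one of them. The fan-out can be sketched with toy classes; `ToyBackend` and `ToyVisualizer` are hypothetical stand-ins written for this example, and the real backends write to disk or to external services rather than to an in-memory list.

```Python
class ToyBackend:
    """Hypothetical backend that keeps records in memory instead of writing them out."""

    def __init__(self, name):
        self.name = name
        self.records = []

    def add_scalar(self, tag, value, step):
        self.records.append((tag, value, step))


class ToyVisualizer:
    """Hypothetical visualizer that forwards every record to all of its backends."""

    def __init__(self, backends):
        self.backends = backends

    def add_scalar(self, tag, value, step=0):
        for backend in self.backends:
            backend.add_scalar(tag, value, step)


backends = [ToyBackend('local'), ToyBackend('tensorboard'), ToyBackend('wandb')]
toy_visualizer = ToyVisualizer(backends)
toy_visualizer.add_scalar('loss', 0.5, step=1)

for backend in backends:
    print(backend.name, backend.records)
```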
## Plot
### Plot the prediction results
MMOCR mainly uses [`VisualizationHook`](mmocr.engine.hooks.VisualizationHook) to plot the prediction results of validation and testing. `VisualizationHook` is disabled by default, and its default configuration is as follows.
```Python
visualization = dict(  # visualize validation and test results
    type='VisualizationHook',
    enable=False,
    interval=1,
    show=False,
    draw_gt=False,
    draw_pred=False)
```
The following table shows the parameters supported by `VisualizationHook`.
| Parameter | Description |
| :-------: | :------------------------------------------------------------------------------------------------------------------: |
| enable | Whether to enable `VisualizationHook`. Disabled (`False`) by default. |
| interval | When `VisualizationHook` is enabled, the number of iterations between storing or displaying val or test results. |
| show | Whether to display the val or test results. |
| draw_gt | Whether to draw the annotations (ground truth) on the val or test results. |
| draw_pred | Whether to draw the predictions on the val or test results. |
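How `enable` and `interval` interact can be sketched as a small predicate. This illustrates the behavior described in the table and is not MMOCR's actual hook code: `should_visualize` is a hypothetical helper, and counting iterations from 0 is an assumption made for the example.

```Python
def should_visualize(iteration, enable=False, interval=1):
    """Return True if results of this iteration should be stored or displayed.

    Hypothetical helper: a disabled hook never visualizes; an enabled hook
    visualizes once every `interval` iterations (counted from 0 here).
    """
    if not enable:
        return False
    return iteration % interval == 0


# With the hook enabled and interval=4, every fourth iteration is visualized.
selected = [i for i in range(10) if should_visualize(i, enable=True, interval=4)]
print(selected)

# With the default configuration (enable=False), nothing is visualized.
print(any(should_visualize(i) for i in range(10)))
```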
To enable `VisualizationHook` during training or testing, you only need to modify the configuration. Taking `dbnet_resnet18_fpnc_1200e_icdar2015.py` as an example, the following change draws both annotations and predictions and displays the images:
```Python
visualization = _base_.default_hooks.visualization
visualization.update(
    dict(enable=True, show=True, draw_gt=True, draw_pred=True))
```
<div align=center>
<img src="https://user-images.githubusercontent.com/24622904/187426573-8448c827-1336-4416-aebc-e7fccce362cd.png" height="200"/>
</div>
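The override above relies on ordinary dictionary semantics: `update` replaces only the keys it is given and leaves the rest of the default hook configuration untouched. A minimal sketch, using a plain `dict` as a stand-in for `_base_.default_hooks.visualization` (an assumption made so the snippet runs without MMOCR installed):

```Python
# Plain dict standing in for the inherited default hook configuration
# (assumption: the real config object behaves like a dict for this purpose).
visualization = dict(
    type='VisualizationHook',
    enable=False,
    interval=1,
    show=False,
    draw_gt=False,
    draw_pred=False)

# Override only the keys we care about.
visualization.update(
    dict(enable=True, show=True, draw_gt=True, draw_pred=True))

# Keys that were not mentioned keep their default values.
print(visualization['type'], visualization['interval'])
```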
If you only want to see the predictions, set `draw_pred=True` and leave `draw_gt=False`:
```Python
visualization = _base_.default_hooks.visualization
visualization.update(
    dict(enable=True, show=True, draw_gt=False, draw_pred=True))
```
<div align=center>
<img src="https://user-images.githubusercontent.com/24622904/187428385-e6a23120-6445-4c55-a265-c550da692087.png" height="300"/>
</div>
`tools/test.py` simplifies this further by providing the `--show` and `--show-dir` arguments, which visualize the annotations and predictions during testing without modifying the configuration.
```Shell
# Show test results
python tools/test.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py dbnet_r18_fpnc_1200e_icdar2015/epoch_400.pth --show
# Specify where to store the prediction results
python tools/test.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py dbnet_r18_fpnc_1200e_icdar2015/epoch_400.pth --show-dir imgs/
```
<div align=center>
<img src="https://user-images.githubusercontent.com/24622904/187426573-8448c827-1336-4416-aebc-e7fccce362cd.png" height="200"/>
</div>
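One way command-line flags like these could be mapped onto the hook configuration is sketched below. This is an illustration under stated assumptions, not the code in `tools/test.py`: `parse_args` and `apply_cli_overrides` are hypothetical helpers, plain dicts stand in for the config objects, and the `save_dir` key on the visualizer config is an assumption.

```Python
import argparse


def parse_args(argv):
    """Parse only the two visualization flags discussed above."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--show', action='store_true',
                        help='display the results in a window')
    parser.add_argument('--show-dir', default=None,
                        help='directory in which to store the drawn images')
    return parser.parse_args(argv)


def apply_cli_overrides(hook_cfg, visualizer_cfg, args):
    """Hypothetical helper: switch the hook on when either flag is given."""
    if args.show or args.show_dir:
        hook_cfg.update(enable=True, draw_gt=True, draw_pred=True)
    if args.show:
        hook_cfg['show'] = True
    if args.show_dir:
        # Assumption: the visualizer config accepts a `save_dir` key.
        visualizer_cfg['save_dir'] = args.show_dir
    return hook_cfg, visualizer_cfg


default_hook = dict(enable=False, show=False, draw_gt=False, draw_pred=False)
hook_cfg, visualizer_cfg = apply_cli_overrides(
    dict(default_hook), dict(), parse_args(['--show-dir', 'imgs/']))
print(hook_cfg)
print(visualizer_cfg)
```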


@ -1 +1,107 @@
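# Visualization
Before reading this tutorial, it is recommended to read MMEngine's [Visualization](https://github.com/open-mmlab/mmengine/blob/main/docs/zh_cn/advanced_tutorials/visualization.md) documentation to get a first understanding of the `Visualizer` definition and usage.
In brief, MMEngine implements the [`Visualizer`](mmengine.visualization.Visualizer) to meet everyday visualization needs. It provides three main features:
- Implement common drawing APIs, such as [`draw_bboxes`](mmengine.visualization.Visualizer.draw_bboxes) for drawing bounding boxes and [`draw_lines`](mmengine.visualization.Visualizer.draw_lines) for drawing lines.
- Support writing visualization results, learning rate curves, loss curves, and validation accuracy curves to various backends, including the local disk and common deep learning training logging tools such as [TensorBoard](https://www.tensorflow.org/tensorboard) and [WandB](https://wandb.ai/site).
- Support calling anywhere in the code, for example to visualize or record intermediate states of the model during training or testing, such as feature maps and validation results.
Based on MMEngine's Visualizer, MMOCR comes with a variety of pre-built visualization tools that users can enable by simply modifying the configuration file:
- The `tools/analysis_tools/browse_dataset.py` script provides a dataset visualization function that draws images and their annotations after the data transforms have been applied, as described in [`browse_dataset.py`](useful_tools.md).
- MMEngine implements `LoggerHook`, which uses the `Visualizer` to write the learning rate, loss, and evaluation results to the backends configured for the `Visualizer`. By changing the `Visualizer` backend in the configuration file, for example to `TensorboardVisBackend` or `WandbVisBackend`, you can send these logs to common training logging tools such as TensorBoard or WandB and use them to analyze and monitor the training process.
- MMOCR implements `VisualizationHook`, which uses the `Visualizer` to display the prediction results of the validation or test phase or to store them in the backends configured for the `Visualizer`. By changing the `Visualizer` backend in the configuration file, for example to `TensorboardVisBackend` or `WandbVisBackend`, you can store the predicted images in TensorBoard or WandB.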
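## Configuration
Thanks to the registry mechanism, the behavior of the `Visualizer` in MMOCR can be set by modifying the configuration file. The default visualization configuration is usually defined in `task/_base_/default_runtime.py`; see the [configuration tutorial](config.md) for details.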
```Python
vis_backends = [dict(type='LocalVisBackend')]
visualizer = dict(
    type='TextxxxLocalVisualizer',  # use different visualizers for different tasks
    vis_backends=vis_backends,
    name='visualizer')
```
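Based on the above example, we can see that the configuration of the `Visualizer` consists of two main parts: the type of the `Visualizer` and the visualization backends `vis_backends` it uses.
- For different OCR tasks, various visualizers are pre-configured in MMOCR, including [`TextDetLocalVisualizer`](mmocr.visualization.TextDetLocalVisualizer), [`TextRecogLocalVisualizer`](mmocr.visualization.TextRecogLocalVisualizer), [`TextSpottingLocalVisualizer`](mmocr.visualization.TextSpottingLocalVisualizer) and [`KIELocalVisualizer`](mmocr.visualization.KIELocalVisualizer). These visualizers extend the basic Visualizer API according to the characteristics of their tasks and implement the corresponding annotation interface `add_datasamples`. For example, users can directly use `TextDetLocalVisualizer` to visualize labels or predictions for text detection tasks.
- MMOCR sets the visualization backend `vis_backends` to the local visualization backend `LocalVisBackend` by default, which saves all visualization results and other training information in a local folder.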
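## Storage
MMOCR uses the local visualization backend [`LocalVisBackend`](mmengine.visualization.LocalVisBackend) by default. The information stored by `VisualizationHook` and `LoggerHook`, including the model loss, learning rate, evaluation accuracy, and visualization results, is saved to the `{work_dir}/{config_name}/{time}/{vis_data}` folder by default. MMOCR also supports other common visualization backends, such as `TensorboardVisBackend` and `WandbVisBackend`; users only need to change the `vis_backends` type in the configuration file to the corresponding backend. For example, inserting the following code block into the configuration file stores the data to TensorBoard and WandB.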
```Python
_base_.visualizer.vis_backends = [
    dict(type='LocalVisBackend'),
    dict(type='TensorboardVisBackend'),
    dict(type='WandbVisBackend'),
]
```
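## Plot
### Plot the prediction results
MMOCR mainly uses [`VisualizationHook`](mmocr.engine.hooks.VisualizationHook) to plot the prediction results of validation and testing. `VisualizationHook` is disabled by default, and its default configuration is as follows: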
```Python
visualization = dict(  # visualize validation and test results
    type='VisualizationHook',
    enable=False,
    interval=1,
    show=False,
    draw_gt=False,
    draw_pred=False)
```
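The following table shows the parameters supported by `VisualizationHook`:
| Parameter | Description |
| :-------: | :------------------------------------------------------------------------------------------------------------------: |
| enable | Whether to enable `VisualizationHook`. Disabled (`False`) by default. |
| interval | When `VisualizationHook` is enabled, the number of iterations between storing or displaying val or test results. |
| show | Whether to display the val or test results. |
| draw_gt | Whether to draw the annotations (ground truth) on the val or test results. |
| draw_pred | Whether to draw the predictions on the val or test results. |
To enable `VisualizationHook` during training or testing, you only need to modify the configuration. Taking `dbnet_resnet18_fpnc_1200e_icdar2015.py` as an example, the following change draws both annotations and predictions and displays the images: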
```Python
visualization = _base_.default_hooks.visualization
visualization.update(
    dict(enable=True, show=True, draw_gt=True, draw_pred=True))
```
<div align=center>
<img src="https://user-images.githubusercontent.com/24622904/187426573-8448c827-1336-4416-aebc-e7fccce362cd.png" height="200"/>
</div>
If you only want to see the predictions, set `draw_pred=True` and leave `draw_gt=False`:
```Python
visualization = _base_.default_hooks.visualization
visualization.update(
    dict(enable=True, show=True, draw_gt=False, draw_pred=True))
```
<div align=center>
<img src="https://user-images.githubusercontent.com/24622904/187428385-e6a23120-6445-4c55-a265-c550da692087.png" height="300"/>
</div>
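`test.py` simplifies this further by providing the `--show` and `--show-dir` arguments, which visualize the annotations and predictions during testing without modifying the configuration.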
```Shell
# Show test results
python tools/test.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py dbnet_r18_fpnc_1200e_icdar2015/epoch_400.pth --show
# Specify where to store the prediction results
python tools/test.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py dbnet_r18_fpnc_1200e_icdar2015/epoch_400.pth --show-dir imgs/
```
<div align=center>
<img src="https://user-images.githubusercontent.com/24622904/187426573-8448c827-1336-4416-aebc-e7fccce362cd.png" height="200"/>
</div>