[Refactor] Fix visualization tools. (#1045)

* update browse dataset
* update images
* update tools
* update grad_cam
* update docs
* update tools
* update docs
2022-09-20 18:09:05 +08:00 · 789884bf08
parent b35ee5d139
commit 789884bf08
9 changed files with 183 additions and 193 deletions
--- a/docs/en/_static/image/tools/visualization/lr_schedule1.png
+++ b/docs/en/_static/image/tools/visualization/lr_schedule1.png
--- a/docs/en/_static/image/tools/visualization/lr_schedule2.png
+++ b/docs/en/_static/image/tools/visualization/lr_schedule2.png
--- a/docs/en/user_guides/visualization.md
+++ b/docs/en/user_guides/visualization.md
@@ -1,134 +1,138 @@
-# Visualization Tools (TODO)
+# Visualization Tools

 <!-- TOC -->

- [Pipeline Visualization](#pipeline-visualization)
- [Learning Rate Schedule Visualization](#learning-rate-schedule-visualization)
+- [Browse Dataset](#browse-dataset)
+- [Parameter Schedule Visualization](#parameter-schedule-visualization)
 - [Class Activation Map Visualization](#class-activation-map-visualization)
 - [FAQs](#faqs)

 <!-- TOC -->

-## Pipeline Visualization
+## Browse Dataset

 ```bash
-python tools/visualizations/vis_pipeline.py \
+python tools/visualizations/browse_dataset.py \
    ${CONFIG_FILE} \
-    [--output-dir ${OUTPUT_DIR}] \
-    [--phase ${DATASET_PHASE}] \
-    [--number ${BUNBER_IMAGES_DISPLAY}] \
-    [--skip-type ${SKIP_TRANSFORM_TYPE}] \
-    [--mode ${DISPLAY_MODE}] \
-    [--show] \
-    [--adaptive] \
-    [--min-edge-length ${MIN_EDGE_LENGTH}] \
-    [--max-edge-length ${MAX_EDGE_LENGTH}] \
-    [--bgr2rgb] \
-    [--window-size ${WINDOW_SIZE}] \
+    [-o, --output-dir ${OUTPUT_DIR}] \
+    [-p, --phase ${DATASET_PHASE}] \
+    [-n, --show-number ${NUMBER_IMAGES_DISPLAY}] \
+    [-i, --show-interval ${SHOW_INTERVAL}] \
+    [-m, --mode ${DISPLAY_MODE}] \
+    [-r, --rescale-factor ${RESCALE_FACTOR}] \
+    [-c, --channel-order ${CHANNEL_ORDER}] \
    [--cfg-options ${CFG_OPTIONS}]
 ```

 **Description of all arguments**：

 - `config` : The path of a model config file.
- `--output-dir`: The output path for visualized images. If not specified, it will be set to `''`, which means not to save.
- `--phase`: Phase of visualizing dataset，must be one of `[train, val, test]`. If not specified, it will be set to `train`.
- `--number`: The number of samples to visualized. If not specified, display all images in the dataset.
- `--skip-type`: The pipelines to be skipped. If not specified, it will be set to `['ToTensor', 'Normalize', 'ImageToTensor', 'Collect']`.
- `--mode`: The display mode, can be one of `[original, pipeline, concat]`. If not specified, it will be set to `concat`.
- `--show`: If set, display pictures in pop-up windows.
- `--adaptive`: If set, adaptively resize images for better visualization.
- `--min-edge-length`: The minimum edge length, used when `--adaptive` is set. When any side of the picture is smaller than `${MIN_EDGE_LENGTH}`, the picture will be enlarged while keeping the aspect ratio unchanged, and the short side will be aligned to `${MIN_EDGE_LENGTH}`. If not specified, it will be set to 200.
- `--max-edge-length`: The maximum edge length, used when `--adaptive` is set. When any side of the picture is larger than `${MAX_EDGE_LENGTH}`, the picture will be reduced while keeping the aspect ratio unchanged, and the long side will be aligned to `${MAX_EDGE_LENGTH}`. If not specified, it will be set to 1000.
- `--bgr2rgb`: If set, flip the color channel order of images.
- `--window-size`: The shape of the display window. If not specified, it will be set to `12*7`. If used, it must be in the format `'W*H'`.
- `--cfg-options` : Modifications to the configuration file, refer to [Tutorial 1: Learn about Configs](https://mmclassification.readthedocs.io/en/latest/tutorials/config.html).
+- `-o, --output-dir`: The output path for visualized images. If not specified, it will be set to `''`, which means the images are not saved.
+- **`-p, --phase`**: The dataset phase to visualize, must be one of `['train', 'val', 'test']`. If not specified, it will be set to `'train'`.
+- **`-n, --show-number`**: The number of samples to visualize. If not specified, all images in the dataset are displayed.
+- `-i, --show-interval`: The interval between displayed images, in seconds.
+- **`-m, --mode`**: The display mode, must be one of `['original', 'transformed', 'concat', 'pipeline']`. If not specified, it will be set to `'transformed'`.
+- **`-r, --rescale-factor`**: The image rescale factor, which is useful if the output is too large or too small.
+- `-c, --channel-order`: The channel order of the displayed images, either `'BGR'` or `'RGB'`. If not specified, it will be set to `'BGR'`.
+- `--cfg-options` : Modifications to the configuration file, refer to [Learn about Configs](./config.md).

 ```{note}
+1. The `-m, --mode` option sets the display mode, which shows original images, transformed images, or a comparison of both:
+- "original" shows the images as loaded from disk;
+- "transformed" shows the images after the transforms are applied;
+- "concat" shows images stitched together from the "original" and "transformed" images;
+- "pipeline" shows all the intermediate images throughout the pipeline.

-1. If the `--mode` is not specified, it will be set to `concat` as default, get the pictures stitched together by original pictures and transformed pictures; if the `--mode` is set to `original`, get the original pictures; if the `--mode` is set to `transformed`, get the transformed pictures; if the `--mode` is set to `pipeline`, get all the intermediate images through the pipeline.
-
-2. When `--adaptive` option is set, images that are too large or too small will be automatically adjusted, you can use `--min-edge-length` and `--max-edge-length` to set the adjust size.
+2. Set the `-r, --rescale-factor` option when the label information is too large or too small relative to the picture. For example, when visualizing the CIFAR dataset, the image resolution is very small, so `--rescale-factor` can be set to 10.
 ```

 **Examples**：

-1. In **'original'** mode, visualize 100 original pictures in the `CIFAR100` validation set, then display and save them in the `./tmp` folder：
+1. In **'original'** mode:

 ```shell
-python ./tools/visualizations/vis_pipeline.py configs/resnet/resnet50_8xb16_cifar100.py --phase val --output-dir tmp --mode original --number 100  --show --adaptive --bgr2rgb
+python ./tools/visualizations/browse_dataset.py ./configs/resnet/resnet101_8xb16_cifar10.py --phase val --output-dir tmp --mode original --show-number 100 --rescale-factor 10 --channel-order RGB
 ```

-<div align=center><img src="https://user-images.githubusercontent.com/18586273/146117528-1ec2d918-57f8-4ae4-8ca3-a8d31b602f64.jpg" style=" width: auto; height: 40%; "></div>
+- `--phase val`: Visualize the validation set, can be simplified to `-p val`;
+- `--output-dir tmp`: Save the visualization results in the "tmp" folder, can be simplified to `-o tmp`;
+- `--mode original`: Visualize the original images, can be simplified to `-m original`;
+- `--show-number 100`: Visualize 100 images, can be simplified to `-n 100`;
+- `--rescale-factor 10`: Enlarge the images by 10 times, can be simplified to `-r 10`;
+- `--channel-order RGB`: Use "RGB" as the channel order of the visualized images, can be simplified to `-c RGB`.

-2. In **'transformed'** mode, visualize all the transformed pictures of the `ImageNet` training set and display them in pop-up windows：
+<div align=center><img src="https://user-images.githubusercontent.com/18586273/190993839-216a7a1e-590e-47b9-92ae-08f87a7d58df.jpg" style=" width: auto; height: 40%; "></div>
+
+2. In **'transformed'** mode:

 ```shell
-python ./tools/visualizations/vis_pipeline.py ./configs/resnet/resnet50_8xb32_in1k.py --show --mode transformed
+python ./tools/visualizations/browse_dataset.py ./configs/resnet/resnet50_8xb32_in1k.py -n 100 -r 2
 ```

-<div align=center><img src="https://user-images.githubusercontent.com/18586273/146117553-8006a4ba-e2fa-4f53-99bc-42a4b06e413f.jpg" style=" width: auto; height: 40%; "></div>
+<div align=center><img src="https://user-images.githubusercontent.com/18586273/190994696-737b09d9-d0fb-4593-94a2-4487121e0286.JPEG" style=" width: auto; height: 40%; "></div>

-3. In **'concat'** mode, visualize 10 pairs of origin and transformed images for comparison in the `ImageNet` train set and save them in the `./tmp` folder：
+3. In **'concat'** mode:

 ```shell
-python ./tools/visualizations/vis_pipeline.py configs/swin_transformer/swin_base_224_b16x64_300e_imagenet.py --phase train --output-dir tmp --number 10 --adaptive
+python ./tools/visualizations/browse_dataset.py configs/swin_transformer/swin-small_16xb64_in1k.py -n 10 -m concat
 ```

-<div align=center><img src="https://user-images.githubusercontent.com/18586273/146128259-0a369991-7716-411d-8c27-c6863e6d76ea.JPEG" style=" width: auto; height: 40%; "></div>
+<div align=center><img src="https://user-images.githubusercontent.com/18586273/190995078-3872feb2-d4e2-4727-a21b-7062d52f7d3e.JPEG" style=" width: auto; height: 40%; "></div>

-4. In **'pipeline'** mode, visualize all the intermediate pictures in the `ImageNet` train set through the pipeline：
+4. In **'pipeline'** mode:

 ```shell
-python ./tools/visualizations/vis_pipeline.py configs/swin_transformer/swin_base_224_b16x64_300e_imagenet.py --phase train --adaptive --mode pipeline --show
+python ./tools/visualizations/browse_dataset.py configs/swin_transformer/swin-small_16xb64_in1k.py -m pipeline
 ```

-<div align=center><img src="https://user-images.githubusercontent.com/18586273/146128201-eb97c2aa-a615-4a81-a649-38db1c315d0e.JPEG" style=" width: auto; height: 40%; "></div>
+<div align=center><img src="https://user-images.githubusercontent.com/18586273/190995525-fac0220f-6630-4013-b94a-bc6de4fdff7a.JPEG" style=" width: auto; height: 40%; "></div>

-## Learning Rate Schedule Visualization
+## Parameter Schedule Visualization

 ```bash
-python tools/visualizations/vis_lr.py \
+python tools/visualizations/vis_scheduler.py \
    ${CONFIG_FILE} \
-    --dataset-size ${DATASET_SIZE} \
-    --ngpus ${NUM_GPUs}
-    --save-path ${SAVE_PATH} \
-    --title ${TITLE} \
-    --style ${STYLE} \
-    --window-size ${WINDOW_SIZE}
-    --cfg-options
+    [-p, --parameter ${PARAMETER_NAME}] \
+    [-d, --dataset-size ${DATASET_SIZE}] \
+    [-n, --ngpus ${NUM_GPUs}] \
+    [-s, --save-path ${SAVE_PATH}] \
+    [--title ${TITLE}] \
+    [--style ${STYLE}] \
+    [--window-size ${WINDOW_SIZE}] \
+    [--cfg-options]
 ```

 **Description of all arguments**：

- `config` :  The path of a model config file.
- `dataset-size` : The size of the datasets. If set，`build_dataset` will be skipped and `${DATASET_SIZE}` will be used as the size. Default to use the function `build_dataset`.
- `ngpus` : The number of GPUs used in training, default to be 1.
- `save-path` : The learning rate curve plot save path, default not to save.
- `title` : Title of figure. If not set, default to be config file name.
- `style` : Style of plt. If not set, default to be `whitegrid`.
- `window-size`: The shape of the display window. If not specified, it will be set to `12*7`. If used, it must be in the format `'W*H'`.
- `cfg-options` : Modifications to the configuration file, refer to [Tutorial 1: Learn about Configs](https://mmclassification.readthedocs.io/en/latest/tutorials/config.html).
+- `config`: The path of a model config file.
+- **`-p, --parameter`**: The parameter whose schedule curve is visualized, chosen from "lr" and "momentum". Defaults to "lr".
+- **`-d, --dataset-size`**: The size of the dataset. If set, `build_dataset` will be skipped and `${DATASET_SIZE}` will be used as the size. By default, the dataset is built with `build_dataset` and its size is used.
+- **`-n, --ngpus`**: The number of GPUs used in training. Defaults to 1.
+- **`-s, --save-path`**: The path for saving the plotted curve. By default, the plot is not saved.
+- `--title`: Title of figure. If not set, default to be config file name.
+- `--style`: Style of plt. If not set, default to be `whitegrid`.
+- `--window-size`: The shape of the display window. If not specified, it will be set to `12*7`. If used, it must be in the format `'W*H'`.
+- `--cfg-options`: Modifications to the configuration file, refer to [Learn about Configs](./config.md).

 ```{note}
-Loading annotations maybe consume much time, you can directly specify the size of the dataset with `dataset-size` to save time.
+Loading annotations may take a long time, so you can directly specify the size of the dataset with `-d, --dataset-size` to save time.
 ```

 **Examples**：

 ```bash
-python tools/visualizations/vis_lr.py configs/resnet/resnet50_b16x8_cifar100.py
+python tools/visualizations/vis_scheduler.py configs/resnet/resnet50_b16x8_cifar100.py
 ```

-<div align=center><img src="../_static/image/tools/visualization/lr_schedule1.png" style=" width: auto; height: 40%; "></div>
+<div align=center><img src="https://user-images.githubusercontent.com/18586273/191006713-023f065d-d366-4165-a52e-36176367506e.png" style=" width: auto; height: 40%; "></div>

 When using ImageNet, directly specify the size of ImageNet, as below:

 ```bash
-python tools/visualizations/vis_lr.py configs/repvgg/repvgg-B3g4_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py --dataset-size 1281167 --ngpus 4 --save-path ./repvgg-B3g4_4xb64-lr.jpg
+python tools/visualizations/vis_scheduler.py configs/repvgg/repvgg-B3g4_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py --dataset-size 1281167 --ngpus 4 --save-path ./repvgg-B3g4_4xb64-lr.jpg
 ```

-<div align=center><img src="../_static/image/tools/visualization/lr_schedule2.png" style=" width: auto; height: 40%; "></div>
+<div align=center><img src="https://user-images.githubusercontent.com/18586273/191006721-0f680e07-355e-4cd6-889c-86c0cad9acb7.png" style=" width: auto; height: 40%; "></div>
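The `--window-size` format (`'W*H'`) and the role of `--dataset-size` and `--ngpus` described above can be illustrated with a short sketch. This is not the code of `vis_scheduler.py`: the helper names `parse_window_size` and `iters_per_epoch` are hypothetical, and the assumption that one epoch takes `ceil(dataset_size / (batch_size_per_gpu * ngpus))` iterations (no dropped last batch) is ours. The per-GPU batch size of 64 is read off the `4xb64` part of the config name.

```python
import math


def parse_window_size(text):
    """Parse a 'W*H' string such as '12*7' into a (width, height) tuple."""
    width, sep, height = text.partition('*')
    if not sep or not width.isdigit() or not height.isdigit():
        raise ValueError(f"window size must be in the format 'W*H', got {text!r}")
    return int(width), int(height)


def iters_per_epoch(dataset_size, batch_size_per_gpu, ngpus=1):
    """Assumed relation: one iteration consumes batch_size_per_gpu * ngpus samples."""
    return math.ceil(dataset_size / (batch_size_per_gpu * ngpus))


# Default window size from the argument description.
window = parse_window_size('12*7')

# ImageNet example from the command above: 1281167 images, 4 GPUs, 64 per GPU.
imagenet_iters = iters_per_epoch(1281167, batch_size_per_gpu=64, ngpus=4)
```

Passing the dataset size on the command line means a count like this can be derived without loading any annotations, which is where the time saving mentioned in the note comes from.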

 ## Class Activation Map Visualization

@@ -180,7 +184,7 @@ python tools/visualizations/vis_cam.py \
 - `--aug_smooth` : Whether to use TTA(Test Time Augment) to get CAM.
 - `--eigen_smooth` : Whether to use the principal component to reduce noise.
 - `--device` : The computing device used. Default to 'cpu'.
- `--cfg-options` : Modifications to the configuration file, refer to [Tutorial 1: Learn about Configs](https://mmclassification.readthedocs.io/en/latest/tutorials/config.html).
+- `--cfg-options` : Modifications to the configuration file, refer to [Learn about Configs](./config.md).

 ```{note}
 The argument `--preview-model` can view all network layers names in the given model. It will be helpful if you know nothing about the model layers when setting `--target-layers`.
@@ -237,7 +241,7 @@ For example, the `backbone.layer4[-1]` is the same as `backbone.layer4.2` since
   ```shell
   python tools/visualizations/vis_cam.py \
       demo/dog.jpg  \
-       configs/mobilenet_v3/mobilenet-v3-large_8xb32_in1k.py \
+       configs/mobilenet_v3/mobilenet-v3-large_8xb128_in1k.py \
       https://download.openmmlab.com/mmclassification/v0/mobilenet_v3/convert/mobilenet_v3_large-3ea3c186.pth \
       --target-layers 'backbone.layer16' \
       --method LayerCAM \
--- a/docs/zh_CN/_static/image/tools/visualization/lr_schedule1.png
+++ b/docs/zh_CN/_static/image/tools/visualization/lr_schedule1.png
--- a/docs/zh_CN/_static/image/tools/visualization/lr_schedule2.png
+++ b/docs/zh_CN/_static/image/tools/visualization/lr_schedule2.png
--- a/docs/zh_CN/user_guides/visualization.md
+++ b/docs/zh_CN/user_guides/visualization.md
@@ -1,136 +1,141 @@
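-# Visualization Tools (To Be Updated)
+# Visualization Tools

 <!-- TOC -->

- [Data Pipeline Visualization](#data-pipeline-visualization)
- [Learning Rate Schedule Visualization](#learning-rate-schedule-visualization)
+- [Browse Dataset](#browse-dataset)
+- [Optimizer Parameter Schedule Visualization](#optimizer-parameter-schedule-visualization)
 - [Class Activation Map Visualization](#class-activation-map-visualization)
 - [FAQs](#faqs)

 <!-- TOC -->

-## Data Pipeline Visualization
+## Browse Dataset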

 ```bash
-python tools/visualizations/vis_pipeline.py \
+python tools/visualizations/browse_dataset.py \
    ${CONFIG_FILE} \
-    [--output-dir ${OUTPUT_DIR}] \
-    [--phase ${DATASET_PHASE}] \
-    [--number ${BUNBER_IMAGES_DISPLAY}] \
-    [--skip-type ${SKIP_TRANSFORM_TYPE}] \
-    [--mode ${DISPLAY_MODE}] \
-    [--show] \
-    [--adaptive] \
-    [--min-edge-length ${MIN_EDGE_LENGTH}] \
-    [--max-edge-length ${MAX_EDGE_LENGTH}] \
-    [--bgr2rgb] \
-    [--window-size ${WINDOW_SIZE}] \
+    [-o, --output-dir ${OUTPUT_DIR}] \
+    [-p, --phase ${DATASET_PHASE}] \
+    [-n, --show-number ${NUMBER_IMAGES_DISPLAY}] \
+    [-i, --show-interval ${SHOW_INTERVAL}] \
+    [-m, --mode ${DISPLAY_MODE}] \
+    [-r, --rescale-factor ${RESCALE_FACTOR}] \
+    [-c, --channel-order ${CHANNEL_ORDER}] \
    [--cfg-options ${CFG_OPTIONS}]
 ```

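 **Description of all arguments**:

 - `config` : The path of a model config file.
- `--output-dir`: The folder for saving images. If not specified, it defaults to `''`, which means the images are not saved.
- `--phase`: The dataset phase to visualize, must be one of `[train, val, test]`. Defaults to `train`.
- `--number`: The number of samples to visualize. If not specified, all images in the dataset are displayed.
- `--skip-type`: The data pipeline steps to skip. If not specified, it defaults to `['ToTensor', 'Normalize', 'ImageToTensor', 'Collect']`.
- `--mode`: The visualization mode, must be one of `[original, transformed, concat, pipeline]`. If not specified, it defaults to `concat`.
- `--show`: Display the visualized images in pop-up windows.
- `--adaptive`: Automatically adjust the size of the visualized images.
- `--min-edge-length`: The minimum edge length, effective when `--adaptive` is used. When any side of an image is smaller than `${MIN_EDGE_LENGTH}`, the image is enlarged with its aspect ratio unchanged and the short side aligned to `${MIN_EDGE_LENGTH}`. Defaults to 200.
- `--max-edge-length`: The maximum edge length, effective when `--adaptive` is used. When any side of an image is larger than `${MAX_EDGE_LENGTH}`, the image is shrunk with its aspect ratio unchanged and the short side aligned to `${MAX_EDGE_LENGTH}`. Defaults to 1000.
- `--bgr2rgb`: Flip the color channels of the images.
- `--window-size`: The size of the visualization window. If not specified, it defaults to `12*7`. If specified, it must follow the format `'W*H'`.
- `--cfg-options` : Modifications to the configuration file, refer to [Tutorial 1: Learn about Configs](https://mmclassification.readthedocs.io/zh_CN/latest/tutorials/config.html).
+- `-o, --output-dir`: The folder for saving images. If not specified, it defaults to `''`, which means the images are not saved.
+- **`-p, --phase`**: The dataset phase to visualize, must be one of `['train', 'val', 'test']`. Defaults to `'train'`.
+- **`-n, --show-number`**: The number of samples to visualize. If not specified, all images in the dataset are displayed.
+- `-i, --show-interval`: The time each image stays on screen while browsing, in seconds.
+- **`-m, --mode`**: The visualization mode, must be one of `['original', 'transformed', 'concat', 'pipeline']`. Defaults to `'transformed'`.
+- **`-r, --rescale-factor`**: The scaling factor applied to the visualized images; set it when the images are too large or too small.
+- `-c, --channel-order`: The channel order of the images, must be one of `['BGR', 'RGB']`. Defaults to `'BGR'`.
+- `--cfg-options` : Modifications to the configuration file, refer to [Learn about Configs](./config.md).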

 ```{note}

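-1. If `--mode` is not specified, it defaults to `concat`, which produces images stitched together from the original and preprocessed images; if `--mode` is set to `original`, the original images are produced; if `--mode` is set to `transformed`, the preprocessed images are produced; if `--mode` is set to `pipeline`, all intermediate images of the data pipeline are produced.
+1. `-m, --mode` sets the visualization mode and defaults to 'transformed'.
+- If `--mode` is set to 'original', the original images are shown;
+- If `--mode` is set to 'transformed', the preprocessed images are shown;
+- If `--mode` is set to 'concat', images stitched together from the original and preprocessed images are shown;
+- If `--mode` is set to 'pipeline', all intermediate images of the data pipeline are shown.

-2. When the `--adaptive` option is specified, images that are too large or too small are resized automatically; you can set `--min-edge-length` and `--max-edge-length` to control the adjusted size.
+2. Set `-r, --rescale-factor` when the resolution of the images in the dataset is too large or too small. For example, when visualizing the CIFAR dataset, the image resolution is very small, so `-r, --rescale-factor` can be set to 10.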
 ```

 **Examples**:

-1. **'original'** mode: visualize 100 original images from the `CIFAR100` validation set, display them, and save them in the `./tmp` folder:
+1. **'original'** mode:

 ```shell
-python ./tools/visualizations/vis_pipeline.py configs/resnet/resnet50_8xb16_cifar100.py --phase val --output-dir tmp --mode original --number 100 --show --adaptive --bgr2rgb
+python ./tools/visualizations/browse_dataset.py ./configs/resnet/resnet101_8xb16_cifar10.py --phase val --output-dir tmp --mode original --show-number 100 --rescale-factor 10 --channel-order RGB
 ```

-<div align=center><img src="https://user-images.githubusercontent.com/18586273/146117528-1ec2d918-57f8-4ae4-8ca3-a8d31b602f64.jpg" style=" width: auto; height: 40%; "></div>
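+- `--phase val`: Visualize the validation set, can be simplified to `-p val`;
+- `--output-dir tmp`: Save the visualization results in the "tmp" folder, can be simplified to `-o tmp`;
+- `--mode original`: Visualize the original images, can be simplified to `-m original`;
+- `--show-number 100`: Visualize 100 images, can be simplified to `-n 100`;
+- `--rescale-factor 10`: Enlarge the images by 10 times, can be simplified to `-r 10`;
+- `--channel-order RGB`: Use "RGB" as the channel order of the visualized images, can be simplified to `-c RGB`.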

-2. **'transformed'** mode: visualize all preprocessed images of the `ImageNet` training set and display them in pop-up windows:
+<div align=center><img src="https://user-images.githubusercontent.com/18586273/190993839-216a7a1e-590e-47b9-92ae-08f87a7d58df.jpg" style=" width: auto; height: 40%; "></div>
+
+2. **'transformed'** mode:

 ```shell
-python ./tools/visualizations/vis_pipeline.py ./configs/resnet/resnet50_8xb32_in1k.py --show --mode transformed
+python ./tools/visualizations/browse_dataset.py ./configs/resnet/resnet50_8xb32_in1k.py -n 100 -r 2
 ```

-<div align=center><img src="https://user-images.githubusercontent.com/18586273/146117553-8006a4ba-e2fa-4f53-99bc-42a4b06e413f.jpg" style=" width: auto; height: 40%; "></div>
+<div align=center><img src="https://user-images.githubusercontent.com/18586273/190994696-737b09d9-d0fb-4593-94a2-4487121e0286.JPEG" style=" width: auto; height: 40%; "></div>

-3. **'concat'** mode: visualize 10 comparison images of original and preprocessed pictures from the `ImageNet` training set and save them in the `./tmp` folder:
+3. **'concat'** mode:

 ```shell
-python ./tools/visualizations/vis_pipeline.py configs/swin_transformer/swin_base_224_b16x64_300e_imagenet.py --phase train --output-dir tmp --number 10 --adaptive
+python ./tools/visualizations/browse_dataset.py configs/swin_transformer/swin-small_16xb64_in1k.py -n 10 -m concat
 ```

-<div align=center><img src="https://user-images.githubusercontent.com/18586273/146128259-0a369991-7716-411d-8c27-c6863e6d76ea.JPEG" style=" width: auto; height: 40%; "></div>
+<div align=center><img src="https://user-images.githubusercontent.com/18586273/190995078-3872feb2-d4e2-4727-a21b-7062d52f7d3e.JPEG" style=" width: auto; height: 40%; "></div>

-4. **'pipeline'** mode: visualize the intermediate images of the `ImageNet` training set as they pass through the data pipeline:
+4. **'pipeline'** mode:

 ```shell
-python ./tools/visualizations/vis_pipeline.py configs/swin_transformer/swin_base_224_b16x64_300e_imagenet.py --phase train --adaptive --mode pipeline --show
+python ./tools/visualizations/browse_dataset.py configs/swin_transformer/swin-small_16xb64_in1k.py -m pipeline
 ```

-<div align=center><img src="https://user-images.githubusercontent.com/18586273/146128201-eb97c2aa-a615-4a81-a649-38db1c315d0e.JPEG" style=" width: auto; height: 40%; "></div>
+<div align=center><img src="https://user-images.githubusercontent.com/18586273/190995525-fac0220f-6630-4013-b94a-bc6de4fdff7a.JPEG" style=" width: auto; height: 40%; "></div>

-## Learning Rate Schedule Visualization
+## Optimizer Parameter Schedule Visualization

 ```bash
-python tools/visualizations/vis_lr.py \
+python tools/visualizations/vis_scheduler.py \
    ${CONFIG_FILE} \
-    [--dataset-size ${Dataset_Size}] \
-    [--ngpus ${NUM_GPUs}] \
-    [--save-path ${SAVE_PATH}] \
+    [-p, --parameter ${PARAMETER_NAME}] \
+    [-d, --dataset-size ${DATASET_SIZE}] \
+    [-n, --ngpus ${NUM_GPUs}] \
+    [-s, --save-path ${SAVE_PATH}] \
    [--title ${TITLE}] \
    [--style ${STYLE}] \
    [--window-size ${WINDOW_SIZE}] \
-    [--cfg-options ${CFG_OPTIONS}] \
+    [--cfg-options]
 ```

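 **Description of all arguments**:

 - `config` : The path of a model config file.
- `--dataset-size` : The size of the dataset. If specified, `build_dataset` is skipped and this value is used as the dataset size; by default, the size of the dataset built by `build_dataset` is used.
- `--ngpus` : The number of GPUs used.
- `--save-path` : The path for saving the visualized figure; by default, it is not saved.
- `--title` : The title of the figure; defaults to the config file name.
- `--style` : The style of the figure; defaults to `whitegrid`.
+- **`-p, --parameter`**: The name of the parameter to visualize, must be one of `["lr", "momentum"]`. Defaults to `"lr"`.
+- **`-d, --dataset-size`**: The size of the dataset. If specified, `build_dataset` is skipped and this value is used as the dataset size; by default, the size of the dataset built by `build_dataset` is used.
+- **`-n, --ngpus`**: The number of GPUs used. Defaults to 1.
+- **`-s, --save-path`**: The path for saving the visualized figure; by default, it is not saved.
+- `--title`: The title of the figure; defaults to the config file name.
+- `--style`: The style of the figure; defaults to `whitegrid`.
 - `--window-size`: The size of the visualization window. If not specified, it defaults to `12*7`. If specified, it must follow the format `'W*H'`.
- `--cfg-options` : Modifications to the configuration file, refer to [Tutorial 1: Learn about Configs](https://mmclassification.readthedocs.io/zh_CN/latest/tutorials/config.html).
+- `--cfg-options`: Modifications to the configuration file, refer to [Learn about Configs](./config.md).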

 ```{note}

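-For some datasets, parsing the annotations is time-consuming; you can specify the dataset size directly with `dataset-size` to save time.
+For some datasets, parsing the annotations is time-consuming; you can specify the dataset size directly with `-d, --dataset-size` to save time.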

 ```

 **Examples**:

 ```bash
-python tools/visualizations/vis_lr.py configs/resnet/resnet50_b16x8_cifar100.py
+python tools/visualizations/vis_scheduler.py configs/resnet/resnet50_b16x8_cifar100.py
 ```

-<div align=center><img src="../_static/image/tools/visualization/lr_schedule1.png" style=" width: auto; height: 40%; "></div>
+<div align=center><img src="https://user-images.githubusercontent.com/18586273/191006713-023f065d-d366-4165-a52e-36176367506e.png" style=" width: auto; height: 40%; "></div>

 When the dataset is ImageNet, specify the dataset size directly to save time, and save the figure:

 ```bash
-python tools/visualizations/vis_lr.py configs/repvgg/repvgg-B3g4_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py --dataset-size 1281167 --ngpus 4 --save-path ./repvgg-B3g4_4xb64-lr.jpg
+python tools/visualizations/vis_scheduler.py configs/repvgg/repvgg-B3g4_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py --dataset-size 1281167 --ngpus 4 --save-path ./repvgg-B3g4_4xb64-lr.jpg
 ```

-<div align=center><img src="../_static/image/tools/visualization/lr_schedule2.png" style=" width: auto; height: 40%; "></div>
+<div align=center><img src="https://user-images.githubusercontent.com/18586273/191006721-0f680e07-355e-4cd6-889c-86c0cad9acb7.png" style=" width: auto; height: 40%; "></div>

 ## Class Activation Map Visualization

@@ -182,7 +187,7 @@ python tools/visualizations/vis_cam.py \
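 - `--num-extra-tokens`: The number of extra token channels in `ViT`-like networks; defaults to the `num_extra_tokens` of the backbone.
 - `--aug-smooth`: Whether to use test time augmentation.
 - `--device`: The computing device used. If not set, it defaults to 'cpu'.
- `--cfg-options`: Modifications to the configuration file, refer to [Tutorial 1: Learn about Configs](https://mmclassification.readthedocs.io/zh_CN/latest/tutorials/config.html).
+- `--cfg-options`: Modifications to the configuration file, refer to [Learn about Configs](./config.md).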

 ```{note}
 When specifying `--target-layers`, if you do not know which layers the model has, add `--preview-model` on the command line to view all layer names;
@@ -237,7 +242,7 @@ python tools/visualizations/vis_cam.py \
   ```shell
   python tools/visualizations/vis_cam.py \
       demo/dog.jpg  \
-       configs/mobilenet_v3/mobilenet-v3-large_8xb32_in1k.py \
+       configs/mobilenet_v3/mobilenet-v3-large_8xb128_in1k.py \
       https://download.openmmlab.com/mmclassification/v0/mobilenet_v3/convert/mobilenet_v3_large-3ea3c186.pth \
       --target-layers 'backbone.layer16' \
       --method LayerCAM \
--- a/tools/visualizations/browse_dataset.py
+++ b/tools/visualizations/browse_dataset.py
@@ -8,10 +8,10 @@ import mmcv
 import numpy as np
 from mmengine.config import Config, DictAction
 from mmengine.dataset import Compose
+from mmengine.utils import ProgressBar
 from mmengine.visualization import Visualizer

 from mmcls.datasets.builder import build_dataset
-from mmcls.registry import VISUALIZERS
 from mmcls.utils import register_all_modules
 from mmcls.visualization import ClsVisualizer
 from mmcls.visualization.cls_visualizer import _get_adaptive_scale
@@ -22,12 +22,14 @@ def parse_args():
    parser.add_argument('config', help='train config file path')
    parser.add_argument(
        '--output-dir',
+        '-o',
        default=None,
        type=str,
        help='If there is no display interface, you can save it.')
    parser.add_argument('--not-show', default=False, action='store_true')
    parser.add_argument(
        '--phase',
+        '-p',
        default='train',
        type=str,
        choices=['train', 'test', 'val'],
@@ -35,6 +37,7 @@ def parse_args():
        ' Defaults to "train".')
    parser.add_argument(
        '--show-number',
+        '-n',
        type=int,
        default=sys.maxsize,
        help='number of images selected to visualize, must bigger than 0. if '
@@ -42,11 +45,13 @@ def parse_args():
        'dataset; default "sys.maxsize", show all images in dataset')
    parser.add_argument(
        '--show-interval',
+        '-i',
        type=float,
        default=2,
        help='the interval of show (s)')
    parser.add_argument(
        '--mode',
+        '-m',
        default='transformed',
        type=str,
        choices=['original', 'transformed', 'concat', 'pipeline'],
@@ -58,9 +63,17 @@ def parse_args():
        'Defaults to "transformed".')
    parser.add_argument(
        '--rescale-factor',
+        '-r',
        type=float,
        help='image rescale factor, which is useful if the output is too '
        'large or too small.')
+    parser.add_argument(
+        '--channel-order',
+        '-c',
+        default='BGR',
+        choices=['BGR', 'RGB'],
+        help='The channel order of the showing images, could be "BGR" '
+        'or "RGB", Defaults to "BGR".')
    parser.add_argument(
        '--cfg-options',
        nargs='+',
@@ -168,12 +181,13 @@ def main():
                                      intermediate_imgs)

    # init visualizer
-    visualizer: ClsVisualizer = VISUALIZERS.build(cfg.visualizer)
+    cfg.visualizer.pop('type')
+    visualizer = ClsVisualizer(**cfg.visualizer)
    visualizer.dataset_meta = dataset.metainfo

    # init visualization image number
    display_number = min(args.show_number, len(dataset))
-    progress_bar = mmcv.ProgressBar(display_number)
+    progress_bar = ProgressBar(display_number)

    for i, item in zip(range(display_number), dataset):
        rescale_factor = args.rescale_factor
@@ -195,11 +209,11 @@ def main():

        intermediate_imgs.clear()

-        data_sample = item['data_sample'].numpy()
+        data_sample = item['data_samples'].numpy()

        # get filename from dataset or just use index as filename
-        if hasattr(item['data_sample'], 'img_path'):
-            filename = osp.basename(item['data_sample'].img_path)
+        if hasattr(item['data_samples'], 'img_path'):
+            filename = osp.basename(item['data_samples'].img_path)
        else:
            # some dataset have not image path
            filename = f'{i}.jpg'
@@ -209,7 +223,7 @@ def main():

        visualizer.add_datasample(
            filename,
-            image[..., ::-1],
+            image if args.channel_order == 'RGB' else image[..., ::-1],
            data_sample,
            rescale_factor=rescale_factor,
            show=not args.not_show,
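The `browse_dataset.py` changes above add a short alias to each option and replace the unconditional `image[..., ::-1]` flip with one that depends on `--channel-order`. The sketch below mirrors both ideas using only the standard library. It is an illustration, not the tool itself: `build_parser` is a trimmed-down, hypothetical parser with three of the options, and `to_display_order` is a hypothetical helper that applies the same conditional to a nested-list image instead of a NumPy array.

```python
import argparse


def build_parser():
    """Hypothetical parser with a subset of the options shown in the diff."""
    parser = argparse.ArgumentParser(description='browse dataset (sketch)')
    parser.add_argument(
        '--phase', '-p', default='train', choices=['train', 'test', 'val'])
    parser.add_argument('--show-number', '-n', type=int, default=None)
    parser.add_argument(
        '--channel-order', '-c', default='BGR', choices=['BGR', 'RGB'])
    return parser


def to_display_order(image, channel_order):
    """Same conditional as the diff: keep the image for 'RGB', else reverse
    the last (channel) axis. `image` is rows of pixels, each a 3-item list."""
    if channel_order == 'RGB':
        return image
    return [[pixel[::-1] for pixel in row] for row in image]


# Short and long spellings parse to the same namespace, because argparse
# derives the attribute name from the first long option string.
short_args = build_parser().parse_args(['-p', 'val', '-n', '100', '-c', 'RGB'])
long_args = build_parser().parse_args(
    ['--phase', 'val', '--show-number', '100', '--channel-order', 'RGB'])

one_pixel = [[[10, 20, 30]]]
flipped = to_display_order(one_pixel, 'BGR')
unchanged = to_display_order(one_pixel, 'RGB')
```

Listing the long option first keeps the attribute names (`args.show_number`, `args.channel_order`) unchanged, so adding the aliases does not affect the rest of the script.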
--- a/tools/visualizations/vis_cam.py
+++ b/tools/visualizations/vis_cam.py
@@ -8,13 +8,14 @@ from pathlib import Path

 import mmcv
 import numpy as np
-from mmcv import Config, DictAction
-from mmcv.utils import to_2tuple
+from mmcv.transforms import Compose
+from mmengine.config import Config, DictAction
+from mmengine.utils import to_2tuple
 from torch.nn import BatchNorm1d, BatchNorm2d, GroupNorm, LayerNorm

 from mmcls import digit_version
 from mmcls.apis import init_model
-from mmcls.datasets.pipelines import Compose
+from mmcls.utils import register_all_modules

 try:
    from pytorch_grad_cam import (EigenCAM, EigenGradCAM, GradCAM,
@@ -26,9 +27,6 @@ except ImportError:
    raise ImportError('Please run `pip install "grad-cam>=1.3.6"` to install '
                      '3rd party package pytorch_grad_cam.')

-# set of transforms, which just change data format, not change the pictures
-FORMAT_TRANSFORMS_SET = {'ToTensor', 'Normalize', 'ImageToTensor', 'Collect'}
-
 # Supported grad-cam type map
 METHOD_MAP = {
    'gradcam': GradCAM,
@@ -159,56 +157,16 @@ def build_reshape_transform(model, args):
    return _reshape_transform


-def apply_transforms(img_path, pipeline_cfg):
-    """Apply transforms pipeline and get both formatted data and the image
-    without formatting."""
-    data = dict(img_info=dict(filename=img_path), img_prefix=None)
-
-    def split_pipeline_cfg(pipeline_cfg):
-        """to split the transfoms into image_transforms and
-        format_transforms."""
-        image_transforms_cfg, format_transforms_cfg = [], []
-        if pipeline_cfg[0]['type'] != 'LoadImageFromFile':
-            pipeline_cfg.insert(0, dict(type='LoadImageFromFile'))
-        for transform in pipeline_cfg:
-            if transform['type'] in FORMAT_TRANSFORMS_SET:
-                format_transforms_cfg.append(transform)
-            else:
-                image_transforms_cfg.append(transform)
-        return image_transforms_cfg, format_transforms_cfg
-
-    image_transforms, format_transforms = split_pipeline_cfg(pipeline_cfg)
-    image_transforms = Compose(image_transforms)
-    format_transforms = Compose(format_transforms)
-
-    intermediate_data = image_transforms(data)
-    inference_img = copy.deepcopy(intermediate_data['img'])
-    format_data = format_transforms(intermediate_data)
-
-    return format_data, inference_img
-
-
-class MMActivationsAndGradients(ActivationsAndGradients):
-    """Activations and gradients manager for mmcls models."""
-
-    def __call__(self, x):
-        self.gradients = []
-        self.activations = []
-        return self.model(
-            x, return_loss=False, softmax=False, post_process=False)
-
-
 def init_cam(method, model, target_layers, use_cuda, reshape_transform):
    """Construct the CAM object once. To be compatible with mmcls, here we
    modify the ActivationsAndGradients object."""
-
    GradCAM_Class = METHOD_MAP[method.lower()]
    cam = GradCAM_Class(
        model=model, target_layers=target_layers, use_cuda=use_cuda)
    # Release the original hooks in ActivationsAndGradients to use
-    # MMActivationsAndGradients.
+    # ActivationsAndGradients.
    cam.activations_and_grads.release()
-    cam.activations_and_grads = MMActivationsAndGradients(
+    cam.activations_and_grads = ActivationsAndGradients(
        cam.model, cam.target_layers, reshape_transform)

    return cam
@ -306,6 +264,7 @@ def main():
    if args.cfg_options is not None:
        cfg.merge_from_dict(args.cfg_options)

+    register_all_modules()
    # build the model from a config file and a checkpoint file
    model = init_model(cfg, args.checkpoint, device=args.device)
    if args.preview_model:
@ -314,7 +273,10 @@ def main():
        return

    # apply transform and prepare data
-    data, src_img = apply_transforms(args.img, cfg.data.test.pipeline)
+    transforms = Compose(cfg.test_dataloader.dataset.pipeline)
+    data = transforms({'img_path': args.img})
+    src_img = copy.deepcopy(data['inputs']).numpy().transpose(1, 2, 0)
+    data = model.data_preprocessor(data, False)

    # build target layers
    if args.target_layers:
@ -344,7 +306,7 @@ def main():

    # calculate cam grads and show|save the visualization image
    grayscale_cam = cam(
-        data['img'].unsqueeze(0),
+        data['inputs'].unsqueeze(0),
        targets,
        eigen_smooth=args.eigen_smooth,
        aug_smooth=args.aug_smooth)
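
The data-preparation hunk above takes the pipeline output in channel-first layout and transposes it to channel-last before using it as the source image. A minimal sketch of that axis permutation, assuming a random NumPy array as a stand-in for `data['inputs']` (the shape and dtype are illustrative assumptions, not values from the tool):

```python
import numpy as np

# Hypothetical stand-in for data['inputs']: a (C, H, W) uint8 image.
chw = np.random.randint(0, 256, size=(3, 224, 320), dtype=np.uint8)

# Same permutation as in the diff: (C, H, W) -> (H, W, C).
# The copy keeps the original array untouched, mirroring copy.deepcopy.
hwc = chw.copy().transpose(1, 2, 0)

print(hwc.shape)
```

Each channel plane of the original array ends up as the last-axis slice of the result, which is the layout image-drawing utilities usually expect.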
--- a/tools/visualizations/vis_scheduler.py
+++ b/tools/visualizations/vis_scheduler.py
@ -76,22 +76,30 @@ def parse_args():
        description='Visualize a Dataset Pipeline')
    parser.add_argument('config', help='config file path')
    parser.add_argument(
-        '--param',
+        '-p',
+        '--parameter',
        type=str,
        default='lr',
        choices=['lr', 'momentum'],
-        help='The param to visualize its change curve, choose from'
+        help='The parameter to visualize its change curve, choose from'
        '"lr" and "momentum". Defaults to "lr".')
    parser.add_argument(
+        '-d',
        '--dataset-size',
        type=int,
        help='The size of the dataset. If specified, `build_dataset` will '
        'be skipped and this size will be used as the dataset size.')
    parser.add_argument(
+        '-n',
        '--ngpus',
        type=int,
        default=1,
        help='The number of GPUs used in training.')
+    parser.add_argument(
+        '-s',
+        '--save-path',
+        type=Path,
+        help='The learning rate curve plot save path')
    parser.add_argument(
        '--log-level',
        default='WARNING',
@ -100,10 +108,6 @@ def parse_args():
    parser.add_argument('--title', type=str, help='title of figure')
    parser.add_argument(
        '--style', type=str, default='whitegrid', help='style of plt')
-    parser.add_argument(
-        '--save-path',
-        type=Path,
-        help='The learning rate curve plot save path')
    parser.add_argument('--not-show', default=False, action='store_true')
    parser.add_argument(
        '--window-size',
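
The short aliases added in this hunk rely on standard `argparse` behaviour: when several option strings are passed to `add_argument`, the attribute name is derived from the first long option. A reduced sketch with only two of the options (this cut-down parser is an illustration, not the tool's full `parse_args`):

```python
import argparse

# Cut-down parser for illustration; only two options from the hunk are kept.
parser = argparse.ArgumentParser(description='short/long option sketch')
parser.add_argument(
    '-p',
    '--parameter',
    type=str,
    default='lr',
    choices=['lr', 'momentum'])
parser.add_argument('-n', '--ngpus', type=int, default=1)

# The short and long forms populate the same attributes.
args = parser.parse_args(['-p', 'momentum', '-n', '4'])
print(args.parameter, args.ngpus)
```

Because the destination is now `parameter` rather than `param`, any code reading `args.param` has to be updated as well, which is what the later `main()` hunk does.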
@ -166,6 +170,7 @@ def simulate_train(data_loader, cfg, by_epoch):
    param_record_hook = ParamRecordHook(by_epoch=by_epoch)
    default_hooks = dict(
        param_scheduler=cfg.default_hooks['param_scheduler'],
+        runtime_info=None,
        timer=None,
        logger=None,
        checkpoint=None,
@ -246,12 +251,12 @@ def main():

    # simulation training process
    lr_list, momentum_list = simulate_train(data_loader, cfg, by_epoch)
-    if args.param == 'lr':
+    if args.parameter == 'lr':
        param_list = lr_list
    else:
        param_list = momentum_list

-    param_name = 'Learning Rate' if args.param == 'lr' else 'Momentum'
+    param_name = 'Learning Rate' if args.parameter == 'lr' else 'Momentum'
    plot_curve(param_list, args, param_name, len(data_loader), by_epoch)

    if args.save_path: