[Docs] Add t-SNE visualization doc (#1555)

* 2023-05-08 add t-sne docs
* 2023-05-08 add t-sne docs
* 2023-05-10 add t-sne docs CN
* 2023-05-25 rebase dev
* add docs for running t-sne on mae models, and fix a bug in vis_tsne.py
* rewrite t-sne docs to correct some mistakes
2023-06-01 10:04:06 +08:00 · 2023-06-01 10:04:06 +08:00 · 795607cfeb
parent 5bd088ef43
commit 795607cfeb
5 changed files with 175 additions and 0 deletions
--- a/docs/en/index.rst
+++ b/docs/en/index.rst
@@ -92,6 +92,7 @@ We always welcome *PRs* and *Issues* for the betterment of MMPretrain.
   useful_tools/dataset_visualization.md
   useful_tools/scheduler_visualization.md
   useful_tools/cam_visualization.md
+   useful_tools/t-sne_visualization.md

 .. _Analysis:
 .. toctree::
--- a/docs/en/useful_tools/t-sne_visualization.md
+++ b/docs/en/useful_tools/t-sne_visualization.md
@@ -0,0 +1,85 @@
+# t-Distributed Stochastic Neighbor Embedding (t-SNE) Visualization
+
+## Introduction to the t-SNE visualization tool
+
+MMPretrain provides the `tools/visualization/vis_tsne.py` tool to visualize the feature embeddings of images with t-SNE. The tool uses scikit-learn (`sklearn`) to compute t-SNE; install it with `pip install scikit-learn`.
+
+**Command**:
+
+```bash
+python tools/visualization/vis_tsne.py \
+    CONFIG \
+    [--checkpoint CHECKPOINT] \
+    [--work-dir WORK_DIR] \
+    [--test-cfg TEST_CFG] \
+    [--vis-stage {backbone,neck,pre_logits}] \
+    [--class-idx CLASS_IDX [CLASS_IDX ...]] \
+    [--max-num-class MAX_NUM_CLASS] \
+    [--max-num-samples MAX_NUM_SAMPLES] \
+    [--cfg-options CFG_OPTIONS [CFG_OPTIONS ...]] \
+    [--device DEVICE] \
+    [--legend] \
+    [--show] \
+    [--n-components N_COMPONENTS] \
+    [--perplexity PERPLEXITY] \
+    [--early-exaggeration EARLY_EXAGGERATION] \
+    [--learning-rate LEARNING_RATE] \
+    [--n-iter N_ITER] \
+    [--n-iter-without-progress N_ITER_WITHOUT_PROGRESS] \
+    [--init INIT]
+```
+
+**Description of all arguments**:
+
+- `CONFIG`: The path to the t-SNE config file.
+- `--checkpoint CHECKPOINT`: The path to the checkpoint file.
+- `--work-dir WORK_DIR`: The directory in which to save logs and visualization images.
+- `--test-cfg TEST_CFG`: The path to a t-SNE config file from which to load the test dataloader config.
+- `--vis-stage {backbone,neck,pre_logits}`: The stage of the model whose features are visualized.
+- `--class-idx CLASS_IDX [CLASS_IDX ...]`: The categories used to calculate t-SNE.
+- `--max-num-class MAX_NUM_CLASS`: Apply t-SNE only to the first N categories. Defaults to 20.
+- `--max-num-samples MAX_NUM_SAMPLES`: The maximum number of samples per category. A higher number takes longer to compute. Defaults to 100.
+- `--cfg-options CFG_OPTIONS [CFG_OPTIONS ...]`: Override some settings in the used config. Key-value pairs in `xxx=yyy` format are merged into the config file. If the value to be overwritten is a list, it should look like `key="[a,b]"` or `key=a,b`. Nested list/tuple values are also allowed, e.g. `key="[(a,b),(c,d)]"`. Note that the quotation marks are necessary and that no white space is allowed.
+- `--device DEVICE`: The device used for inference.
+- `--legend`: Show the legend of all categories.
+- `--show`: Display the result in a graphical window.
+- `--n-components N_COMPONENTS`: The number of dimensions of the result.
+- `--perplexity PERPLEXITY`: The perplexity is related to the number of nearest neighbors that are used in other manifold learning algorithms.
+- `--early-exaggeration EARLY_EXAGGERATION`: Controls how tight natural clusters in the original space are in the embedded space and how much space will be between them.
+- `--learning-rate LEARNING_RATE`: The learning rate for t-SNE is usually in the range [10.0, 1000.0]. If the learning rate is too high, the data may look like a ball with any point approximately equidistant from its nearest neighbours. If the learning rate is too low, most points may look compressed in a dense cloud with few outliers.
+- `--n-iter N_ITER`: Maximum number of iterations for the optimization. Should be at least 250.
+- `--n-iter-without-progress N_ITER_WITHOUT_PROGRESS`: Maximum number of iterations without progress before the optimization is aborted.
+- `--init INIT`: The initialization method.
+
+## How to visualize the t-SNE of an image classifier (such as ResNet)
+
+Here are two examples of running t-SNE visualization on ResNet-18 and ResNet-50 models trained on the CIFAR-10 dataset:
+
+```shell
+python tools/visualization/vis_tsne.py \
+    configs/resnet/resnet18_8xb16_cifar10.py \
+    --checkpoint https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_b16x8_cifar10_20210528-bd6371c8.pth
+
+python tools/visualization/vis_tsne.py \
+    configs/resnet/resnet50_8xb16_cifar10.py \
+    --checkpoint https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_b16x8_cifar10_20210528-f54bfad9.pth
+```
+
+| ResNet-18                                                                                            | ResNet-50                                                                                            |
+| ---------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- |
+| <div align=center><img src='https://user-images.githubusercontent.com/42371271/236410521-c4d087da-d16f-48ad-b951-c74d10c68f33.png' height="auto" width="auto" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/42371271/236411844-c97dc514-dad0-401e-ba8f-307d0a385b4e.png' height="auto" width="auto" ></div> |
+
+## How to visualize the t-SNE of a self-supervised model (such as MAE)
+
+Here is an example of running t-SNE visualization on the MAE-ViT-base model, trained on the ImageNet dataset. The input data comes from the ImageNet validation set. MAE and some other self-supervised pre-training algorithms have no `test_dataloader` information in their configs. When analyzing such algorithms, either add `test_dataloader` information to the config or use the `--test-cfg` argument to specify a config file that provides it.
+
+```shell
+python tools/visualization/vis_tsne.py \
+    configs/mae/mae_vit-base-p16_8xb512-amp-coslr-800e_in1k.py \
+    --checkpoint https://download.openmmlab.com/mmselfsup/1.x/mae/mae_vit-base-p16_8xb512-fp16-coslr-800e_in1k/mae_vit-base-p16_8xb512-coslr-800e-fp16_in1k_20220825-5d81fbc4.pth \
+    --test-cfg configs/_base_/datasets/imagenet_bs32.py
+```
+
+| MAE-ViT-base                                                                                                                                                  |
+| ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| <div align=center><img src='https://github.com/open-mmlab/mmpretrain/assets/42371271/ee576c0c-abef-43d1-8866-24a5f5fd0cf6' height="auto" width="auto" ></div> |
--- a/docs/zh_CN/index.rst
+++ b/docs/zh_CN/index.rst
@@ -78,6 +78,7 @@ MMPretrain 上手路线
   useful_tools/dataset_visualization.md
   useful_tools/scheduler_visualization.md
   useful_tools/cam_visualization.md
+   useful_tools/t-sne_visualization.md

 .. _分析工具:
 .. toctree::
--- a/docs/zh_CN/useful_tools/t-sne_visualization.md
+++ b/docs/zh_CN/useful_tools/t-sne_visualization.md
@@ -0,0 +1,85 @@
+# t-分布随机邻域嵌入（t-SNE）可视化
+
+## t-分布随机邻域嵌入可视化工具介绍
+
+MMPretrain 提供 `tools/visualization/vis_tsne.py` 工具来用t-SNE可视化图像的特征嵌入。请使用 `pip install scikit-learn` 安装 `sklearn` 来计算t-SNE。
+
+**命令**：
+
+```bash
+python tools/visualization/vis_tsne.py \
+    CONFIG \
+    [--checkpoint CHECKPOINT] \
+    [--work-dir WORK_DIR] \
+    [--test-cfg TEST_CFG] \
+    [--vis-stage {backbone,neck,pre_logits}] \
+    [--class-idx CLASS_IDX [CLASS_IDX ...]] \
+    [--max-num-class MAX_NUM_CLASS] \
+    [--max-num-samples MAX_NUM_SAMPLES] \
+    [--cfg-options CFG_OPTIONS [CFG_OPTIONS ...]] \
+    [--device DEVICE] \
+    [--legend] \
+    [--show] \
+    [--n-components N_COMPONENTS] \
+    [--perplexity PERPLEXITY] \
+    [--early-exaggeration EARLY_EXAGGERATION] \
+    [--learning-rate LEARNING_RATE] \
+    [--n-iter N_ITER] \
+    [--n-iter-without-progress N_ITER_WITHOUT_PROGRESS] \
+    [--init INIT]
+```
+
+**所有参数的说明**：
+
+- `CONFIG`: t-SNE 配置文件的路径。
+- `--checkpoint CHECKPOINT`: 模型权重文件的路径。
+- `--work-dir WORK_DIR`: 保存日志和可视化图像的目录。
+- `--test-cfg TEST_CFG`: 用来加载 test_dataloader 配置的 t-SNE 配置文件的路径。
+- `--vis-stage {backbone,neck,pre_logits}`: 模型可视化的阶段。
+- `--class-idx CLASS_IDX [CLASS_IDX ...]`: 用来计算 t-SNE 的类别。
+- `--max-num-class MAX_NUM_CLASS`: 前 N 个被应用 t-SNE 算法的类别，默认为20。
+- `--max-num-samples MAX_NUM_SAMPLES`: 每个类别中最大的样本数，值越高需要的计算时间越长，默认为100。
+- `--cfg-options CFG_OPTIONS [CFG_OPTIONS ...]`: 覆盖被使用的配置中的一些设定，形如 xxx=yyy 格式的关键字-值对会被合并到配置文件中。如果被覆盖的值是一个列表，它应该形如 key="[a,b]" 或者 key=a,b 。它还允许嵌套的列表/元组值，例如 key="[(a,b),(c,d)]" 。注意引号是必需的，而且不允许有空格。
+- `--device DEVICE`: 用于推理的设备。
+- `--legend`: 显示所有类别的图例。
+- `--show`: 在图形窗口中显示结果。
+- `--n-components N_COMPONENTS`: 结果的维数。
+- `--perplexity PERPLEXITY`: 复杂度与其他流形学习算法中使用的最近邻的数量有关。
+- `--early-exaggeration EARLY_EXAGGERATION`: 控制原空间中的自然聚类在嵌入空间中的紧密程度以及它们之间的空间大小。
+- `--learning-rate LEARNING_RATE`: t-SNE 的学习率通常在[10.0, 1000.0]的范围内。如果学习率太高，数据可能看起来像一个球，其中任何一点与它最近的邻居近似等距。如果学习率太低，大多数点可能看起来被压缩在一个几乎没有异常值的密集点云中。
+- `--n-iter N_ITER`: 优化的最大迭代次数。应该至少为250。
+- `--n-iter-without-progress N_ITER_WITHOUT_PROGRESS`: 在我们中止优化之前，最大的没有进展的迭代次数。
+- `--init INIT`: 初始化方法。
+
+## 如何可视化分类模型的t-SNE（如 ResNet）
+
+以下是在CIFAR-10数据集上训练的 ResNet-18 和 ResNet-50 模型上运行 t-SNE 可视化的两个样例：
+
+```shell
+python tools/visualization/vis_tsne.py \
+    configs/resnet/resnet18_8xb16_cifar10.py \
+    --checkpoint https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_b16x8_cifar10_20210528-bd6371c8.pth
+
+python tools/visualization/vis_tsne.py \
+    configs/resnet/resnet50_8xb16_cifar10.py \
+    --checkpoint https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_b16x8_cifar10_20210528-f54bfad9.pth
+```
+
+| ResNet-18                                                                                            | ResNet-50                                                                                            |
+| ---------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- |
+| <div align=center><img src='https://user-images.githubusercontent.com/42371271/236410521-c4d087da-d16f-48ad-b951-c74d10c68f33.png' height="auto" width="auto" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/42371271/236411844-c97dc514-dad0-401e-ba8f-307d0a385b4e.png' height="auto" width="auto" ></div> |
+
+## 如何可视化自监督模型的t-SNE（如 MAE）
+
+以下是在ImageNet数据集上训练的 MAE-ViT-base 模型上运行 t-SNE 可视化的一个样例。输入数据来自 ImageNet 验证集。MAE和一些自监督预训练算法配置中没有 test_dataloader 信息。在分析这些自监督算法时，你需要在配置中添加 test_dataloader 信息，或者使用 `--test-cfg` 字段来指定一个配置文件。
+
+```shell
+python tools/visualization/vis_tsne.py \
+    configs/mae/mae_vit-base-p16_8xb512-amp-coslr-800e_in1k.py \
+    --checkpoint https://download.openmmlab.com/mmselfsup/1.x/mae/mae_vit-base-p16_8xb512-fp16-coslr-800e_in1k/mae_vit-base-p16_8xb512-coslr-800e-fp16_in1k_20220825-5d81fbc4.pth \
+    --test-cfg configs/_base_/datasets/imagenet_bs32.py
+```
+
+| MAE-ViT-base                                                                                                                                                  |
+| ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| <div align=center><img src='https://github.com/open-mmlab/mmpretrain/assets/42371271/ee576c0c-abef-43d1-8866-24a5f5fd0cf6' height="auto" width="auto" ></div> |
--- a/tools/visualization/vis_tsne.py
+++ b/tools/visualization/vis_tsne.py
@@ -212,6 +212,9 @@ def main():
                    F.adaptive_avg_pool2d(inputs, 1).squeeze()
                    for inputs in batch_features
                ]
+            elif batch_features[0].ndim == 3:
+                # For (N, L, C) feature
+                batch_features = [inputs.mean(1) for inputs in batch_features]

        # save batch features
        features.append(batch_features)
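
Note on the `vis_tsne.py` change: the new branch handles features of shape (N, L, C), such as ViT token sequences, by averaging over the token axis so that each sample becomes one C-dimensional vector. The following is a dependency-free sketch of that reduction, using nested Python lists in place of tensors. The helper name `mean_over_tokens` and the sample values are hypothetical and not part of MMPretrain; the actual tool calls `inputs.mean(1)` on each tensor.

```python
def mean_over_tokens(feature):
    """Average an (N, L, C) nested list over its token axis L, giving (N, C).

    Illustrates what `inputs.mean(1)` computes for a tensor of that shape.
    """
    pooled = []
    for sample in feature:  # each sample has shape (L, C)
        num_tokens = len(sample)
        # zip(*sample) groups the L values belonging to each channel.
        pooled.append([sum(channel) / num_tokens for channel in zip(*sample)])
    return pooled


# Hypothetical input: 2 samples, 3 tokens per sample, 2 channels per token.
feature = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [[0.0, 0.0], [2.0, 4.0], [4.0, 8.0]],
]
pooled = mean_over_tokens(feature)
print(pooled)  # [[3.0, 4.0], [2.0, 4.0]]
```

Each sample collapses from L token vectors to a single vector, the same per-sample shape that the existing 4-D branch produces with `F.adaptive_avg_pool2d(inputs, 1).squeeze()`.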