[Refactor] Inference and train tutorials for 1.x (#1952)

* inference, train_test tutorial
* fix
* add --work-dir
2025-06-03 22:03:48 +08:00 · 2022-08-29 19:47:03 +08:00 · 2022-08-29 19:47:03 +08:00 · ae5c13e927
commit ae5c13e927
parent 99a8f59b70
2 changed files with 208 additions and 200 deletions
--- a/docs/en/user_guides/3_inference.md
+++ b/docs/en/user_guides/3_inference.md
@@ -1,131 +1,25 @@
-## Inference with pretrained models
+# Tutorial 3: Inference with existing models

-We provide testing scripts to evaluate a whole dataset (Cityscapes, PASCAL VOC, ADE20k, etc.),
-and also some high-level apis for easier integration to other projects.
+MMSegmentation provides pre-trained models for semantic segmentation in the [Model Zoo](../model_zoo.md), and supports multiple standard datasets, including Cityscapes, ADE20K, etc.
+This note shows how to use existing models to run inference on given images.
+To test existing models on standard datasets, please see this [guide](./4_train_test.md#Test-models-on-standard-datasets).

-### Test a dataset
+## Inference on given images

- single GPU
- CPU
- single node multiple GPU
- multiple node
+MMSegmentation provides high-level Python APIs for inference on images. Here is an example of building the model and running inference on a given image.
+Please download the [pre-trained model](https://download.openmmlab.com/mmsegmentation/v0.5/pspnet/pspnet_r50-d8_512x1024_80k_cityscapes/pspnet_r50-d8_512x1024_80k_cityscapes_20200606_112131-2376f12b.pth) to the path specified by `checkpoint_file` first.

-You can use the following commands to test a dataset.
-
-```shell
-# single-gpu testing
-python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] [--show]
-
-# CPU: If GPU unavailable, directly running single-gpu testing command above
-# CPU: If GPU available, disable GPUs and run single-gpu testing script
-export CUDA_VISIBLE_DEVICES=-1
-python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] [--show]
-
-# multi-gpu testing
-./tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]
+```python
+from mmseg.apis import init_model, inference_model
+from mmseg.utils import register_all_modules
+# Specify the path to model config and checkpoint file
+config_file = 'configs/pspnet/pspnet_r50-d8_512x1024_80k_cityscapes.py'
+checkpoint_file = 'checkpoints/pspnet_r50-d8_512x1024_80k_cityscapes_20200606_112131-2376f12b.pth'
+# register all modules in mmseg into the registries
+register_all_modules()
+# build the model from a config file and a checkpoint file
+model = init_model(config_file, checkpoint_file, device='cuda:0')
+# run inference on a single image
+img = 'demo/demo.png'
+result = inference_model(model, img)
 ```
-
-Optional arguments:
-
- `RESULT_FILE`: Filename of the output results in pickle format. If not specified, the results will not be saved to a file. (After mmseg v0.17, the output results become pre-evaluation results or format result paths)
- `EVAL_METRICS`: Items to be evaluated on the results. Allowed values depend on the dataset, e.g., `mIoU` is available for all dataset. Cityscapes could be evaluated by `cityscapes` as well as standard `mIoU` metrics.
- `--show`: If specified, segmentation results will be plotted on the images and shown in a new window. It is only applicable to single GPU testing and used for debugging and visualization. Please make sure that GUI is available in your environment, otherwise you may encounter the error like `cannot connect to X server`.
- `--show-dir`: If specified, segmentation results will be plotted on the images and saved to the specified directory. It is only applicable to single GPU testing and used for debugging and visualization. You do NOT need a GUI available in your environment for using this option.
- `--eval-options`: Optional parameters for `dataset.format_results` and `dataset.evaluate` during evaluation. When `efficient_test=True`, it will save intermediate results to local files to save CPU memory. Make sure that you have enough local storage space (more than 20GB). (`efficient_test` argument does not have effect after mmseg v0.17, we use a progressive mode to evaluation and format results which can largely save memory cost and evaluation time.)
-
-Examples:
-
-Assume that you have already downloaded the checkpoints to the directory `checkpoints/`.
-
-1. Test PSPNet and visualize the results. Press any key for the next image.
-
-   ```shell
-   python tools/test.py configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py \
-       checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth \
-       --show
-   ```
-
-2. Test PSPNet and save the painted images for latter visualization.
-
-   ```shell
-   python tools/test.py configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py \
-       checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth \
-       --show-dir psp_r50_512x1024_40ki_cityscapes_results
-   ```
-
-3. Test PSPNet on PASCAL VOC (without saving the test results) and evaluate the mIoU.
-
-   ```shell
-   python tools/test.py configs/pspnet/pspnet_r50-d8_512x1024_20k_voc12aug.py \
-       checkpoints/pspnet_r50-d8_512x1024_20k_voc12aug_20200605_003338-c57ef100.pth \
-       --eval mAP
-   ```
-
-4. Test PSPNet with 4 GPUs, and evaluate the standard mIoU and cityscapes metric.
-
-   ```shell
-   ./tools/dist_test.sh configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py \
-       checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth \
-       4 --out results.pkl --eval mIoU cityscapes
-   ```
-
-:::{note}
-There is some gap (~0.1%) between cityscapes mIoU and our mIoU. The reason is that cityscapes average each class with class size by default.
-We use the simple version without average for all datasets.
-:::
-
-5. Test PSPNet on cityscapes test split with 4 GPUs, and generate the png files to be submit to the official evaluation server.
-
-   First, add following to config file `configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py`,
-
-   ```python
-   data = dict(
-       test=dict(
-           img_dir='leftImg8bit/test',
-           ann_dir='gtFine/test'))
-   ```
-
-   Then run test.
-
-   ```shell
-   ./tools/dist_test.sh configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py \
-       checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth \
-       4 --format-only --eval-options "imgfile_prefix=./pspnet_test_results"
-   ```
-
-   You will get png files under `./pspnet_test_results` directory.
-   You may run `zip -r results.zip pspnet_test_results/` and submit the zip file to [evaluation server](https://www.cityscapes-dataset.com/submit/).
-
-6. CPU memory efficient test DeeplabV3+ on Cityscapes (without saving the test results) and evaluate the mIoU.
-
-   ```shell
-   python tools/test.py \
-   configs/deeplabv3plus/deeplabv3plus_r18-d8_512x1024_80k_cityscapes.py \
-   deeplabv3plus_r18-d8_512x1024_80k_cityscapes_20201226_080942-cff257fe.pth \
-   --eval-options efficient_test=True \
-   --eval mIoU
-   ```
-
-   Using `pmap` to view CPU memory footprint, it used 2.25GB CPU memory with `efficient_test=True` and 11.06GB CPU memory with `efficient_test=False` . This optional parameter can save a lot of memory. (After mmseg v0.17, efficient_test has not effect and we use a progressive mode to evaluation and format results efficiently by default.)
-
-7. Test PSPNet on LoveDA test split with 1 GPU, and generate the png files to be submit to the official evaluation server.
-
-   First, add following to config file `configs/pspnet/pspnet_r50-d8_512x512_80k_loveda.py`,
-
-   ```python
-   data = dict(
-       test=dict(
-           img_dir='img_dir/test',
-           ann_dir='ann_dir/test'))
-   ```
-
-   Then run test.
-
-   ```shell
-   python ./tools/test.py configs/pspnet/pspnet_r50-d8_512x512_80k_loveda.py \
-       checkpoints/pspnet_r50-d8_512x512_80k_loveda_20211104_155728-88610f9f.pth \
-       --format-only --eval-options "imgfile_prefix=./pspnet_test_results"
-   ```
-
-   You will get png files under `./pspnet_test_results` directory.
-   You may run `zip -r -j Results.zip pspnet_test_results/` and submit the zip file to [evaluation server](https://codalab.lisn.upsaclay.fr/competitions/421).
--- a/docs/en/user_guides/4_train_test.md
+++ b/docs/en/user_guides/4_train_test.md
@@ -1,41 +1,73 @@
-## Train a model
+# Tutorial 4: Train and test with existing models

-MMSegmentation implements distributed training and non-distributed training,
-which uses `MMDistributedDataParallel` and `MMDataParallel` respectively.
+This tutorial shows how to use the models provided in the [Model Zoo](../model_zoo.md) on other datasets to obtain better performance.
+MMSegmentation also provides out-of-the-box tools for training models.
+This section will show how to train and test models on standard datasets.

-All outputs (log files and checkpoints) will be saved to the working directory,
-which is specified by `work_dir` in the config file.
+## Train models on standard datasets

-By default we evaluate the model on the validation set after some iterations, you can change the evaluation interval by adding the interval argument in the training config.
+### Modify training schedule
+
+Modify the following configuration to customize the training.

 ```python
-evaluation = dict(interval=4000)  # This evaluate the model per 4000 iterations.
+# training schedule for 40k
+train_cfg = dict(type='IterBasedTrainLoop', max_iters=40000, val_interval=4000)
+val_cfg = dict(type='ValLoop')
+test_cfg = dict(type='TestLoop')
+# optimizer
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
+optim_wrapper = dict(type='OptimWrapper', optimizer=optimizer, clip_grad=None)
+# learning policy
+param_scheduler = [
+    dict(
+        type='PolyLR',
+        eta_min=1e-4,
+        power=0.9,
+        begin=0,
+        end=40000,
+        by_epoch=False)
+]
+# basic hooks
+default_hooks = dict(
+    timer=dict(type='IterTimerHook'),
+    logger=dict(type='LoggerHook', interval=50, log_metric_by_epoch=False),
+    param_scheduler=dict(type='ParamSchedulerHook'),
+    checkpoint=dict(type='CheckpointHook', by_epoch=False, interval=4000),
+    sampler_seed=dict(type='DistSamplerSeedHook'))
 ```

-**\*Important\***: The default learning rate in config files is for 4 GPUs and 2 img/gpu (batch size = 4x2 = 8).
-Equivalently, you may also use 8 GPUs and 1 imgs/gpu since all models using cross-GPU SyncBN.
+### Use pre-trained model

-To trade speed with GPU memory, you may pass in `--cfg-options model.backbone.with_cp=True` to enable checkpoint in backbone.
+Users can load a pre-trained model by setting the `load_from` field of the config to the model's path or link.
+Users may want to download the model weights before training to avoid spending time on the download during training.

-### Train on a single machine
-
-#### Train with a single GPU
-
-official support:
-
-```shell
-sh tools/dist_train.sh ${CONFIG_FILE} 1 [optional arguments]
+```python
+# use the pre-trained model for the whole PSPNet
+load_from = 'https://download.openmmlab.com/mmsegmentation/v0.5/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'  # model path can be found in model zoo
 ```

-experimental support (Convert SyncBN to BN):
+### Training on a single GPU

-```shell
-python tools/train.py ${CONFIG_FILE} [optional arguments]
-```
+We provide `tools/train.py` to launch training jobs on a single GPU.
+The basic usage is `python tools/train.py ${CONFIG_FILE} [optional arguments]`.
+This tool accepts several optional arguments, including:

-If you want to specify the working directory in the command, you can add an argument `--work-dir ${YOUR_WORK_DIR}`.
+- `--work-dir ${WORK_DIR}`: Override the working directory.
+- `--amp`: Use auto mixed precision training.
+- `--resume ${CHECKPOINT_FILE}`: Resume from a previous checkpoint file.
+- `--cfg-options ${OVERRIDE_CONFIGS}`: Override some settings in the used config. Key-value pairs in `xxx=yyy` format will be merged into the config file.
+  For example, `--cfg-options model.encoder.in_channels=6`. Please see this [guide](./1_config.md#Modify-config-through-script-arguments) for more details.
+
+Below are the optional arguments for multi-GPU training:
+
+- `--launcher`: The job launcher used for distributed initialization. Allowed choices are `none`, `pytorch`, `slurm`, `mpi`. If set to `none`, training runs in non-distributed mode.
+- `--local_rank`: ID for local rank. If not specified, it will be set to 0.
+
+**Note**:
+Difference between `--resume` and `load_from`:
+`--resume` loads both the model weights and the optimizer state, and the iteration count is also inherited from the specified checkpoint.
+It is usually used to resume a training process that was interrupted accidentally.

-#### Train with CPU
+`load_from` only loads the model weights, and training starts from iteration 0. It is usually used for fine-tuning.
+
+### Training on CPU

 Training on the CPU follows the same process as single GPU training if the machine does not have a GPU. If the machine has GPUs but you do not want to use them, disable the GPUs before starting the training process.

@@ -43,37 +75,33 @@ The process of training on the CPU is consistent with single GPU training if mac
 export CUDA_VISIBLE_DEVICES=-1
 ```

-And then run the script [above](#train-with-a-single-gpu).
+And then run the script [above](#training-on-a-single-gpu).

 ```{warning}
 The process of training on the CPU is consistent with single GPU training. We just need to disable GPUs before the training process.
 ```

-#### Train with multiple GPUs
+### Training on multiple GPUs
+
+MMSegmentation implements **distributed** training with `MMDistributedDataParallel`.
+We provide `tools/dist_train.sh` to launch training on multiple GPUs.
+The basic usage is as follows.

 ```shell
-sh tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]
+sh tools/dist_train.sh \
+    ${CONFIG_FILE} \
+    ${GPU_NUM} \
+    [optional arguments]
 ```

-Optional arguments are:
-
- `--no-validate` (**not suggested**): By default, the codebase will perform evaluation at every k iterations during the training. To disable this behavior, use `--no-validate`.
- `--work-dir ${WORK_DIR}`: Override the working directory specified in the config file.
- `--resume-from ${CHECKPOINT_FILE}`: Resume from a previous checkpoint file (to continue the training process).
- `--load-from ${CHECKPOINT_FILE}`: Load weights from a checkpoint file (to start finetuning for another task).
- `--deterministic`: Switch on "deterministic" mode which slows down training but the results are reproducible.
-
-Difference between `resume-from` and `load-from`:
-
- `resume-from` loads both the model weights and optimizer state including the iteration number.
- `load-from` loads only the model weights, starts the training from iteration 0.
-
+The optional arguments are the same as stated [above](#training-on-a-single-gpu),
+with the additional positional argument `${GPU_NUM}` to specify the number of GPUs.
 An example:

 ```shell
 # checkpoints and logs saved in WORK_DIR=work_dirs/pspnet_r50-d8_512x512_80k_ade20k/
 # If work_dir is not set, it will be generated automatically.
-sh tools/dist_train.sh configs/pspnet/pspnet_r50-d8_512x512_80k_ade20k.py 8 --work_dir work_dirs/pspnet_r50-d8_512x512_80k_ade20k/ --deterministic
+sh tools/dist_train.sh configs/pspnet/pspnet_r50-d8_512x512_80k_ade20k.py 8 --work-dir work_dirs/pspnet_r50-d8_512x512_80k_ade20k
 ```

 **Note**: During training, checkpoints and logs are saved in the same folder structure as the config file under `work_dirs/`. Custom work directory is not recommended since evaluation scripts infer work directories from the config file name. If you want to save your weights somewhere else, please use symlink, for example:
@@ -85,7 +113,6 @@ ln -s ${YOUR_WORK_DIRS} ${MMSEG}/work_dirs
 #### Launch multiple jobs on a single machine

 If you launch multiple jobs on a single machine, e.g., 2 jobs of 4-GPU training on a machine with 8 GPUs, you need to specify different ports (29500 by default) for each job to avoid communication conflict. Otherwise, there will be error message saying `RuntimeError: Address already in use`.
-
 If you use `dist_train.sh` to launch training jobs, you can set the port in commands with environment variable `PORT`.

 ```shell
@@ -93,77 +120,164 @@ CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 sh tools/dist_train.sh ${CONFIG_FILE} 4
 CUDA_VISIBLE_DEVICES=4,5,6,7 PORT=29501 sh tools/dist_train.sh ${CONFIG_FILE} 4
 ```

-### Train with multiple machines
+### Training on multiple nodes
+
+MMSegmentation relies on `torch.distributed` package for distributed training.
+Thus, as a basic usage, one can launch distributed training via PyTorch's [launch utility](https://pytorch.org/docs/stable/distributed.html#launch-utility).
+
+#### Train with multiple machines

 If you launch training on multiple machines connected simply by Ethernet, you can run the following commands:
-
 On the first machine:

 ```shell
-NNODES=2 NODE_RANK=0 PORT=$MASTER_PORT MASTER_ADDR=$MASTER_ADDR sh tools/dist_train.sh $CONFIG $GPUS
+NNODES=2 NODE_RANK=0 PORT=${MASTER_PORT} MASTER_ADDR=${MASTER_ADDR} sh tools/dist_train.sh ${CONFIG_FILE} ${GPUS}
 ```

 On the second machine:

 ```shell
-NNODES=2 NODE_RANK=1 PORT=$MASTER_PORT MASTER_ADDR=$MASTER_ADDR sh tools/dist_train.sh $CONFIG $GPUS
+NNODES=2 NODE_RANK=1 PORT=${MASTER_PORT} MASTER_ADDR=${MASTER_ADDR} sh tools/dist_train.sh ${CONFIG_FILE} ${GPUS}
 ```

 Usually it is slow if you do not have high speed networking like InfiniBand.

-### Manage jobs with Slurm
+#### Manage jobs with Slurm

-Slurm is a good job scheduling system for computing clusters. On a cluster managed by Slurm, you can use slurm_train.sh to spawn training jobs. It supports both single-node and multi-node training.
-
-Train with multiple machines:
+[Slurm](https://slurm.schedmd.com/) is a good job scheduling system for computing clusters.
+On a cluster managed by Slurm, you can use `slurm_train.sh` to spawn training jobs. It supports both single-node and multi-node training.
+The basic usage is as follows.

 ```shell
 [GPUS=${GPUS}] sh tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} --work-dir ${WORK_DIR}
 ```

-Here is an example of using 16 GPUs to train PSPNet on the dev partition.
+Below is an example of using 4 GPUs to train PSPNet on a Slurm partition named _dev_, with the work directory set to a shared file system.

 ```shell
-GPUS=16 sh tools/slurm_train.sh dev pspr50 configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py work_dirs/pspnet_r50-d8_512x1024_40k_cityscapes/
+GPUS=4 sh tools/slurm_train.sh dev pspnet configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py --work-dir work_dir/pspnet
 ```

-When using 'slurm_train.sh' to start multiple tasks on a node, different ports need to be specified. Three settings are provided:
+You can check [the source code](../../../tools/slurm_train.sh) to review the full arguments and environment variables.
+When using Slurm, the port option needs to be set in one of the following ways:

-Option 1:
+1. Set the port through `--cfg-options`. This is the recommended way since it does not change the original configs.

-In `config1.py`:
+   ```shell
+   GPUS=4 GPUS_PER_NODE=4 sh tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py --work-dir ${WORK_DIR} --cfg-options env_cfg.dist_cfg.port=29500
+   GPUS=4 GPUS_PER_NODE=4 sh tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py --work-dir ${WORK_DIR} --cfg-options env_cfg.dist_cfg.port=29501
+   ```

-```python
-dist_params = dict(backend='nccl', port=29500)
-```
+2. Modify the config files to set different communication ports.
+   In `config1.py`:

-In `config2.py`:
+   ```python
+   env_cfg = dict(dist_cfg=dict(backend='nccl', port=29500))
+   ```

-```python
-dist_params = dict(backend='nccl', port=29501)
-```
+   In `config2.py`:

-Then you can launch two jobs with config1.py and config2.py.
+   ```python
+   env_cfg = dict(dist_cfg=dict(backend='nccl', port=29501))
+   ```
+
+   Then you can launch two jobs with config1.py and config2.py.
+
+   ```shell
+   CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 sh tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py --work-dir ${WORK_DIR}
+   CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 sh tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py --work-dir ${WORK_DIR}
+   ```
+
+3. Set the port in the command using the environment variable `MASTER_PORT`:

 ```shell
-CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 sh tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py tmp_work_dir_1
-CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 sh tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py tmp_work_dir_2
+CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 MASTER_PORT=29500 sh tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py --work-dir ${WORK_DIR}
+CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 MASTER_PORT=29501 sh tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py --work-dir ${WORK_DIR}
 ```

-Option 2:
+## Test models on standard datasets

-You can set different communication ports without the need to modify the configuration file, but have to set the `cfg-options` to overwrite the default port in configuration file.
+We provide testing scripts for evaluating an existing model on the whole dataset.
+The following testing environments are supported:
+
+- single GPU
+- CPU
+- single node multiple GPU
+- multiple node
+
+Choose the proper script to perform testing depending on the testing environment.

 ```shell
-CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 sh tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py tmp_work_dir_1 --cfg-options dist_params.port=29500
-CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 sh tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py tmp_work_dir_2 --cfg-options dist_params.port=29501
+# single-gpu testing
+python tools/test.py \
+    ${CONFIG_FILE} \
+    ${CHECKPOINT_FILE} \
+    [--work-dir ${WORK_DIR}] \
+    [--show ${SHOW_RESULTS}] \
+    [--show-dir ${VISUALIZATION_DIRECTORY}] \
+    [--wait-time ${SHOW_INTERVAL}] \
+    [--cfg-options ${OVERRIDE_CONFIGS}]
+# CPU testing
+export CUDA_VISIBLE_DEVICES=-1
+python tools/test.py \
+    ${CONFIG_FILE} \
+    ${CHECKPOINT_FILE} \
+    [--work-dir ${WORK_DIR}] \
+    [--show ${SHOW_RESULTS}] \
+    [--show-dir ${VISUALIZATION_DIRECTORY}] \
+    [--wait-time ${SHOW_INTERVAL}] \
+    [--cfg-options ${OVERRIDE_CONFIGS}]
+# multi-gpu testing
+bash tools/dist_test.sh \
+    ${CONFIG_FILE} \
+    ${CHECKPOINT_FILE} \
+    ${GPU_NUM} \
+    [--work-dir ${WORK_DIR}] \
+    [--cfg-options ${OVERRIDE_CONFIGS}]
 ```

-Option 3:
-
-You can set the port in the command using the environment variable 'MASTER_PORT':
+`tools/dist_test.sh` also supports multi-node testing, but relies on PyTorch's [launch utility](https://pytorch.org/docs/stable/distributed.html#launch-utility).
+[Slurm](https://slurm.schedmd.com/) is a good job scheduling system for computing clusters.
+On a cluster managed by Slurm, you can use `slurm_test.sh` to spawn testing jobs. It supports both single-node and multi-node testing.

 ```shell
-CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 MASTER_PORT=29500 sh tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py tmp_work_dir_1
-CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 MASTER_PORT=29501 sh tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py tmp_work_dir_2
+[GPUS=${GPUS}] ./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} \
+    ${CONFIG_FILE} ${CHECKPOINT_FILE} \
+    [--work-dir ${OUTPUT_DIRECTORY}] \
+    [--cfg-options ${OVERRIDE_CONFIGS}]
 ```
+
+Optional arguments:
+
+- `--work-dir`: If specified, results will be saved in this directory. If not specified, the results will be automatically saved to `work_dirs/{CONFIG_NAME}`.
+- `--show`: Show prediction results at runtime, available when `--show-dir` is not specified.
+- `--show-dir`: If specified, the visualized segmentation mask will be saved in the specified directory.
+- `--wait-time`: The display interval in seconds, which takes effect when `--show` is enabled. Defaults to 2.
+- `--cfg-options`: If specified, key-value pairs in `xxx=yyy` format will be merged into the config file.
+  For example, to trade speed for GPU memory, you may pass in `--cfg-options model.backbone.with_cp=True` to enable checkpointing in the backbone.
+
+Below are the optional arguments for multi-GPU testing:
+
+- `--launcher`: The job launcher used for distributed initialization. Allowed choices are `none`, `pytorch`, `slurm`, `mpi`. If set to `none`, testing runs in non-distributed mode.
+- `--local_rank`: ID for local rank. If not specified, it will be set to 0.
+
+Examples:
+
+Assume that you have already downloaded the checkpoints to the directory `checkpoints/`.
+
+1. Test PSPNet on PASCAL VOC (without saving the test results) and evaluate the mIoU.
+
+   ```shell
+   python tools/test.py configs/pspnet/pspnet_r50-d8_512x1024_20k_voc12aug.py \
+       checkpoints/pspnet_r50-d8_512x1024_20k_voc12aug_20200605_003338-c57ef100.pth
+   ```
+
+   Since `--work-dir` is not specified, the folder `work_dirs/pspnet_r50-d8_512x1024_20k_voc12aug` will be created automatically to save the evaluation results.
+
+2. Test PSPNet with 4 GPUs, and evaluate the standard mIoU and cityscapes metric.
+
+   ```shell
+   ./tools/dist_test.sh configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py \
+       checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth 4
+   ```
+
+:::{note}
+There is a small gap (~0.1%) between the Cityscapes mIoU and our mIoU. The reason is that Cityscapes weights each class by its class size when averaging by default.
+We use the simple version without this weighting for all datasets.
+:::
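
As a rough illustration of the note above, the sketch below contrasts a plain mean over per-class IoU with a class-size-weighted mean. It assumes that "average each class with class size" means weighting each class's IoU by its ground-truth pixel count. The function names and the toy confusion matrix are hypothetical and are not part of the MMSegmentation or Cityscapes APIs.

```python
def per_class_iou(conf):
    """Per-class IoU from a square confusion matrix.

    conf[i][j] counts pixels whose ground-truth class is i and predicted class is j.
    """
    n = len(conf)
    ious = []
    for c in range(n):
        tp = conf[c][c]
        gt = sum(conf[c])                        # ground-truth pixels of class c
        pred = sum(conf[r][c] for r in range(n)) # pixels predicted as class c
        union = gt + pred - tp
        ious.append(tp / union if union else 0.0)
    return ious


def mean_iou(conf):
    """Plain (unweighted) mean over classes."""
    ious = per_class_iou(conf)
    return sum(ious) / len(ious)


def size_weighted_iou(conf):
    """Mean over classes weighted by ground-truth class size (assumed reading of the note)."""
    ious = per_class_iou(conf)
    sizes = [sum(row) for row in conf]
    total = sum(sizes)
    return sum(iou * size for iou, size in zip(ious, sizes)) / total


# Toy 2-class example: a large class segmented well, a small class segmented poorly.
toy_conf = [[90, 10],
            [5, 5]]
print(mean_iou(toy_conf), size_weighted_iou(toy_conf))
```

On imbalanced data the two averages can differ noticeably, since the weighted variant is dominated by the large classes; the gap on a real benchmark depends on its class distribution.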