[Docs] How to specify specific GPU training and inference (#503)

* Chinese version of specifying GPUs for training
* remove unnecessary files
* typo
* rebase dev
* add test example
* add english version
* fix format
* english typo
* Update docs/zh_cn/advanced_guides/how_to.md

Co-authored-by: Range King <RangeKingHZ@gmail.com>

* Update docs/zh_cn/advanced_guides/how_to.md

Co-authored-by: Range King <RangeKingHZ@gmail.com>

---------

Co-authored-by: Range King <RangeKingHZ@gmail.com>
2023-02-06 10:12:04 +08:00 · 2023-02-06 10:12:04 +08:00 · 1dee9eed6e
parent 6acde82ec8
commit 1dee9eed6e
2 changed files with 48 additions and 0 deletions
--- a/docs/en/advanced_guides/how_to.md
+++ b/docs/en/advanced_guides/how_to.md
@@ -546,3 +546,27 @@ python ./tools/train.py \
 - `randomness.seed=2023`, set the random seed to 2023.
 - `randomness.diff_rank_seed=True`, set different seeds according to global rank. Defaults to False.
 - `randomness.deterministic=True`, set the deterministic option for cuDNN backend, i.e., set `torch.backends.cudnn.deterministic` to True and `torch.backends.cudnn.benchmark` to False. Defaults to False. See https://pytorch.org/docs/stable/notes/randomness.html for more details.
+
+## Specify GPUs for training or inference
+
+If your machine has multiple GPUs, for example 8 GPUs numbered `0, 1, 2, 3, 4, 5, 6, 7`, single-GPU training or inference uses GPU 0 by default. To run on a different GPU, set `CUDA_VISIBLE_DEVICES` before the command:
+
+```shell
+CUDA_VISIBLE_DEVICES=5 python ./tools/train.py ${CONFIG}  # train
+CUDA_VISIBLE_DEVICES=5 python ./tools/test.py ${CONFIG} ${CHECKPOINT_FILE}  # test
+```
+
+If you set `CUDA_VISIBLE_DEVICES` to -1 or to an index larger than the highest GPU index (8 in this example), no GPU is visible to the process and the CPU is used for training or inference.
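
The rule above can be sketched as a small shell check. This is only an illustration: `resolve_device` is a hypothetical helper that is not part of the repository, and it assumes you pass in the number of installed GPUs yourself instead of querying the driver.

```shell
# Hypothetical helper (not part of the repo): report whether a single
# CUDA_VISIBLE_DEVICES index would select a GPU or fall back to the CPU.
# $1 = requested index, $2 = number of installed GPUs (assumed known).
resolve_device() {
  if [ "$1" -ge 0 ] && [ "$1" -lt "$2" ]; then
    # Valid physical index: that GPU becomes the only visible device.
    echo "gpu:$1"
  else
    # -1 or an out-of-range index: no GPU is visible, so the CPU is used.
    echo "cpu"
  fi
}

resolve_device 5 8    # prints "gpu:5"
resolve_device -1 8   # prints "cpu"
resolve_device 8 8    # prints "cpu"
```

Such a check could run before launching `tools/train.py` to catch an accidental CPU run early.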
+
+To train on several of these GPUs in parallel, use the following command:
+
+```shell
+CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_train.sh ${CONFIG} ${GPU_NUM}
+```
+
+Here `GPU_NUM` is 4, matching the four devices listed in `CUDA_VISIBLE_DEVICES`. If several multi-GPU jobs run concurrently on one machine, each job needs a different `PORT` to avoid communication conflicts, for example:
+
+```shell
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh ${CONFIG} 4
+CUDA_VISIBLE_DEVICES=4,5,6,7 PORT=29501 ./tools/dist_train.sh ${CONFIG} 4
+```
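
The two rules in this section (`GPU_NUM` equals the number of listed devices, and each concurrent job needs its own `PORT`) can be sketched as a small wrapper. The helper names `count_gpus` and `launch_cmd` are hypothetical and not part of the repository, and assigning ports as 29500 plus a job index is an assumption made for illustration. The sketch only prints each command so it can be inspected before running.

```shell
# Hypothetical helpers, not part of the repo.

# Count the comma-separated entries in a device list such as "0,1,2,3".
count_gpus() {
  printf '%s\n' "$1" | tr ',' '\n' | grep -c .
}

# Print a dist_train.sh command for job index $1 using device list $2.
# Assumption: ports are assigned as 29500 + job index so that jobs differ.
launch_cmd() {
  job="$1"
  devices="$2"
  port=$((29500 + job))
  echo "CUDA_VISIBLE_DEVICES=$devices PORT=$port ./tools/dist_train.sh \${CONFIG} $(count_gpus "$devices")"
}

launch_cmd 0 0,1,2,3
launch_cmd 1 4,5,6,7
```

Deriving `GPU_NUM` from the device list keeps the two values from drifting apart when the list is edited.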
--- a/docs/zh_cn/advanced_guides/how_to.md
+++ b/docs/zh_cn/advanced_guides/how_to.md
@@ -552,3 +552,27 @@ python ./tools/train.py \
 - `randomness.diff_rank_seed=True`, set different seeds according to rank. `diff_rank_seed` defaults to False.

 - `randomness.deterministic=True`, set the deterministic option for the cuDNN backend, i.e., set `torch.backends.cudnn.deterministic` to True and `torch.backends.cudnn.benchmark` to False. `deterministic` defaults to False. See https://pytorch.org/docs/stable/notes/randomness.html for more details.
+
+## Specify GPUs for training or inference
+
+If you have multiple GPUs, for example 8 GPUs numbered `0, 1, 2, 3, 4, 5, 6, 7`, single-GPU training or inference uses GPU 0 by default. To run on a different GPU, use the following commands:
+
+```shell
+CUDA_VISIBLE_DEVICES=5 python ./tools/train.py ${CONFIG}  # train
+CUDA_VISIBLE_DEVICES=5 python ./tools/test.py ${CONFIG} ${CHECKPOINT_FILE}  # test
+```
+
+If `CUDA_VISIBLE_DEVICES` is set to -1 or to a number larger than the highest GPU index, such as 8, the CPU is used for training or inference.
+
+To train in parallel on several of these GPUs, use the following command:
+
+```shell
+CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_train.sh ${CONFIG} ${GPU_NUM}
+```
+
+Here `GPU_NUM` is 4. In addition, if several jobs run multi-GPU training at the same time on one machine, each job needs a different port, for example:
+
+```shell
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh ${CONFIG} 4
+CUDA_VISIBLE_DEVICES=4,5,6,7 PORT=29501 ./tools/dist_train.sh ${CONFIG} 4
+```