[Docs] Refactor fine-tune tutorial. (#991)

* [Docs] refactor fine-tune tutorial

* fix comment
# Fine-tune Models
In most scenarios, we want to apply a model to new datasets without training it from scratch, since training from scratch may introduce extra uncertainty about model convergence and is therefore time-consuming.
A common practice is to start from a model already trained on a large dataset, which usually provides a better prior than random initialization. Roughly speaking, this process is known as fine-tuning.
Classification models pre-trained on the ImageNet dataset have been demonstrated to be effective for other datasets and other downstream tasks.
Hence, this tutorial provides instructions on how to use the models provided in the [Model Zoo](../model_zoo.md) on other datasets to obtain better performance.
There are two steps to fine-tune a model on a new dataset.
- Add support for the new dataset following [Prepare Dataset](dataset_prepare.md).
- Modify the configs as will be discussed in this tutorial.
Assume we have a ResNet-50 model pre-trained on the ImageNet-2012 dataset and want
to fine-tune it on the CIFAR-10 dataset; we need to modify five parts in the config.
## Inherit base configs
At first, create a new config file
`configs/tutorial/resnet50_finetune_cifar.py` to store our fine-tune configs. Of course,
you can customize the path as you like.
To reuse the common parts among different base configs, we support inheriting
configs from multiple existing configs, including the following four parts:
- Model configs: To fine-tune a ResNet-50 model, the new config needs to inherit
  `configs/_base_/models/resnet50.py` to build the basic structure of the model.
- Dataset configs: To use the CIFAR-10 dataset, the new config can simply
  inherit `configs/_base_/datasets/cifar10_bs16.py`.
- Schedule configs: The new config can inherit `_base_/schedules/cifar10_bs128.py`
  for the CIFAR-10 dataset with a batch size of 128.
- Runtime configs: For runtime settings such as basic hooks, etc.,
  the new config needs to inherit `configs/_base_/default_runtime.py`.
To inherit all configs above, put the following code in the config file.
```python
_base_ = [
    '../_base_/models/resnet50.py',
    '../_base_/datasets/cifar10_bs16.py',
    '../_base_/schedules/cifar10_bs128.py',
    '../_base_/default_runtime.py',
]
```
Besides, you can also choose to write the whole contents rather than use inheritance.
Refer to [`configs/lenet/lenet5_mnist.py`](https://github.com/open-mmlab/mmclassification/blob/master/configs/lenet/lenet5_mnist.py) for more details.
## Modify model configs
When fine-tuning a model, usually we want to load the pre-trained backbone
weights and train a new classification head from scratch.
To load the pre-trained backbone, we need to change the initialization config
of the backbone and use the `Pretrained` initialization function. Besides, in the
`init_cfg`, we use `prefix='backbone'` to tell the initialization function
the prefix of the submodule to be loaded from the checkpoint.
For example, `backbone` here means to load the backbone submodule. Here we use
an online checkpoint; it will be downloaded automatically during training, or
you can download the model manually and use a local path.
Then we need to modify the head according to the number of classes of the new
dataset by just changing `num_classes` in the head.
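A minimal sketch of these two changes is shown below. The checkpoint URL is only an assumed example for illustration; replace it with the ResNet-50 checkpoint you actually want to load, or a local path.
```python
model = dict(
    backbone=dict(
        init_cfg=dict(
            type='Pretrained',
            # assumed URL for illustration; any ResNet-50 checkpoint or a local path works
            checkpoint='https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_8xb32_in1k_20210831-ea4938fc.pth',
            prefix='backbone',
        )),
    head=dict(num_classes=10),
)
```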
```{note}
Here we only need to set the part of configs we want to modify, because the
inherited configs will be merged to get the entire configs.
```
When the new dataset is small and shares a similar domain with the pre-trained dataset,
we may want to freeze the first several stages' parameters of the
backbone, which helps the network keep the ability to extract low-level
features learned from the pre-trained model. In MMClassification, you can simply
specify how many stages to freeze by the `frozen_stages` argument. For example, to
freeze the first two stages' parameters, just use the following configs:
```python
model = dict(
    backbone=dict(
        frozen_stages=2,
        # other backbone settings (e.g. `init_cfg`) stay the same as above
    ),
)
```
```{note}
Not all backbones support the `frozen_stages` argument currently. Please check
[the docs](https://mmclassification.readthedocs.io/en/1.x/api.html#module-mmcls.models.backbones)
to confirm if your backbone supports it.
```
## Modify dataset configs
When fine-tuning on a new dataset, usually we need to modify some dataset
configs. Here, we need to modify the pipeline to resize the image from 32 to
224 to fit the input size of the model pre-trained on ImageNet, and modify the
dataloaders correspondingly.
```python
# data pipeline settings
train_pipeline = [
    dict(type='RandomCrop', crop_size=32, padding=4),
    dict(type='RandomFlip', prob=0.5, direction='horizontal'),
    dict(type='Resize', scale=224),
    dict(type='PackClsInputs'),
]
test_pipeline = [
    dict(type='Resize', scale=224),
    dict(type='PackClsInputs'),
]
# dataloader settings
train_dataloader = dict(dataset=dict(pipeline=train_pipeline))
val_dataloader = dict(dataset=dict(pipeline=test_pipeline))
test_dataloader = val_dataloader
```
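The image normalization values are handled by the data preprocessor rather than the pipeline in the new-style configs, and the CIFAR-10 base config should already provide suitable values, so they normally do not need to be overridden. If you do want to override them, a sketch could look like the following (the field layout is an assumption based on the base config):
```python
# normalization settings (usually inherited from the CIFAR-10 base config)
data_preprocessor = dict(
    mean=[125.307, 122.961, 113.8575],
    std=[51.5865, 50.847, 51.255],
    to_rgb=False,
)
```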
## Modify training schedule configs
The fine-tuning hyper-parameters differ from the default schedule. Fine-tuning usually
requires a smaller learning rate and a learning rate schedule that decays faster.
```python
# lr is set for a batch size of 128
optim_wrapper = dict(
    optimizer=dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001))
# learning policy
param_scheduler = dict(
    type='MultiStepLR', by_epoch=True, milestones=[15], gamma=0.1)
```
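If you also want to shorten the training length, the loop settings can be overridden as well. Below is a minimal sketch assuming the MMEngine-style `train_cfg` used by the 1.x schedule configs; the epoch number is only an example:
```python
# train for fewer epochs than the base schedule (value chosen for illustration)
train_cfg = dict(by_epoch=True, max_epochs=50, val_interval=1)
```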
```{tip}
Refer to [Learn about Configs](config.md) for more detailed configurations.
```
## Start Training
Now, we have finished the fine-tuning config file, as follows:
```python
_base_ = [
    '../_base_/models/resnet50.py',
    '../_base_/datasets/cifar10_bs16.py',
    '../_base_/schedules/cifar10_bs128.py',
    '../_base_/default_runtime.py',
]
# Model config
model = dict(
    # the backbone `init_cfg` (with `prefix='backbone'`) and head `num_classes=10` settings from the section above
)
# Dataset config
# data pipeline settings
train_pipeline = [
    dict(type='RandomCrop', crop_size=32, padding=4),
    dict(type='RandomFlip', prob=0.5, direction='horizontal'),
    dict(type='Resize', scale=224),
    dict(type='PackClsInputs'),
]
test_pipeline = [
    dict(type='Resize', scale=224),
    dict(type='PackClsInputs'),
]
# dataloader settings
train_dataloader = dict(dataset=dict(pipeline=train_pipeline))
val_dataloader = dict(dataset=dict(pipeline=test_pipeline))
test_dataloader = val_dataloader
# Training schedule config
# lr is set for a batch size of 128
optim_wrapper = dict(
    optimizer=dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001))
# learning policy
param_scheduler = dict(
    type='MultiStepLR', by_epoch=True, milestones=[15], gamma=0.1)
```
Here we use 8 GPUs to train the model with the following command:
```shell
bash tools/dist_train.sh configs/tutorial/resnet50_finetune_cifar.py 8
```
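Or you can train the model with a single GPU, using the standard `tools/train.py` entry point:
```shell
python tools/train.py configs/tutorial/resnet50_finetune_cifar.py
```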
But wait, an important config needs to be changed when using only one GPU. We need to
change the dataset config as follows:
```python
train_dataloader = dict(
    batch_size=128,
    dataset=dict(pipeline=train_pipeline),
)
val_dataloader = dict(
    batch_size=128,
    dataset=dict(pipeline=test_pipeline),
)
test_dataloader = val_dataloader
```
It's because our training schedule is for a batch size of 128. If using 8 GPUs,
just use the `batch_size=16` setting in the base config file for every GPU, and the total batch
size will be 128. But if using one GPU, you need to change it to 128 manually to
match the training schedule.


@ -1,12 +1,14 @@
# How to Fine-tune Models
In many scenarios, we want to quickly apply a model to a new dataset, but training a model from scratch usually converges slowly, and this uncertainty costs extra time.
Usually, an existing model trained on a large dataset provides a more effective prior than random initialization; roughly speaking, learning on top of such a model is what we call fine-tuning.
Classification models pre-trained on the ImageNet dataset have been shown to work well on other datasets and downstream tasks.
Hence, this tutorial shows how to use the pre-trained models provided in the [Model Zoo](../model_zoo.md) on other datasets to obtain better performance.
Fine-tuning a model on a new dataset takes two steps:
- Add support for the new dataset following [Prepare Dataset](dataset_prepare.md).
- Modify the config files as discussed in this tutorial.
Assume we now have a ResNet-50 model trained on the ImageNet-2012 dataset and want to
fine-tune it on the CIFAR-10 dataset; we need to modify five parts of the config file.
## Inherit base configs
First, create a new config file `configs/tutorial/resnet50_finetune_cifar.py` to store our configs. Of course, you can choose the file name freely.
To reuse the common parts of different base configs, we support inheriting configs from multiple existing configs, including:
- Model configs: to fine-tune a ResNet-50 model, inherit `_base_/models/resnet50.py` to build the basic structure of the model.
- Dataset configs: to use the CIFAR-10 dataset, inherit `_base_/datasets/cifar10_bs16.py`.
- Schedule configs: inherit `_base_/schedules/cifar10_bs128.py`, the basic training config for the CIFAR-10 dataset with a batch size of 128.
- Runtime configs: to keep runtime-related settings, such as the default training hooks and environment configs, inherit `_base_/default_runtime.py`.
To inherit all the config files above, just put the following code at the beginning of our config file.
```python
_base_ = [
    '../_base_/models/resnet50.py',
    '../_base_/datasets/cifar10_bs16.py',
    '../_base_/schedules/cifar10_bs128.py',
    '../_base_/default_runtime.py',
]
```
## Modify model configs
When fine-tuning a model, we usually want to load the pre-trained weights in the backbone and then train a new classification head on our own dataset.
To load the pre-trained weights in the backbone, we need to modify the initialization settings of the backbone and use the `Pretrained` initialization function. Besides, in the initialization settings, we use `prefix='backbone'`
to tell the initialization function the prefix of the submodule to be loaded from the checkpoint; here `backbone` means loading the backbone submodule of the model.
For convenience, we use an online checkpoint link here, which
will be downloaded automatically before training; you can also download the model in advance and use a local path.
Next, the new config needs to modify the head according to the number of classes of the new dataset, which only requires changing `num_classes` in the head.
```{note}
Here we only need to set the part of configs we want to modify; the other configs will be inherited automatically from the parent config files.
```
Besides, when the new, small dataset has a data distribution similar to that of the large pre-training dataset, we may want to freeze
the parameters of the first several stages of the backbone during fine-tuning and only train the later stages and the classification head.
This helps the network keep the ability, learned from the pre-trained weights, to extract low-level features in the subsequent training. In MMClassification,
this can be achieved simply with the `frozen_stages` argument. For example, to freeze
the parameters of the first two stages, just add the following line to the configs above:
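A minimal sketch of that change, showing only the `frozen_stages` override (other inherited backbone settings are left untouched):
```python
model = dict(
    backbone=dict(
        frozen_stages=2,
        # other backbone settings (e.g. `init_cfg`) stay as configured above
    ),
)
```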
```{note}
Not all backbones support the `frozen_stages` argument yet. Before using it, please check
[the docs](https://mmclassification.readthedocs.io/zh_CN/1.x/api.html#module-mmcls.models.backbones)
to confirm whether your backbone supports it.
```
## Modify dataset configs
When fine-tuning on a new dataset, we usually need to modify some dataset-related configs. For example, here we need to
resize the images in the CIFAR-10 dataset from 32 to 224 to match the input size of the model pre-trained on ImageNet.
This can be done by modifying the dataset's preprocessing pipeline and overriding the dataloaders.
```python
# data pipeline settings
train_pipeline = [
    dict(type='RandomCrop', crop_size=32, padding=4),
    dict(type='RandomFlip', prob=0.5, direction='horizontal'),
    dict(type='Resize', scale=224),
    dict(type='PackClsInputs'),
]
test_pipeline = [
    dict(type='Resize', scale=224),
    dict(type='PackClsInputs'),
]
# dataloader settings
train_dataloader = dict(dataset=dict(pipeline=train_pipeline))
val_dataloader = dict(dataset=dict(pipeline=test_pipeline))
test_dataloader = val_dataloader
```
## Modify training schedule configs
The hyper-parameters for fine-tuning differ from the default configs; fine-tuning usually only requires a smaller learning rate and a faster-decaying learning rate schedule.
```python
# lr is set for a batch size of 128
optim_wrapper = dict(
    optimizer=dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001))
# learning rate decay policy
param_scheduler = dict(
    type='MultiStepLR', by_epoch=True, milestones=[15], gamma=0.1)
```
```{tip}
Refer to [Learn about Configs](config.md) for more details that can be modified.
```
## Start Training
Now we have finished the config file for fine-tuning, and the complete file is as follows:
```python
_base_ = [
    '../_base_/models/resnet50.py',
    '../_base_/datasets/cifar10_bs16.py',
    '../_base_/schedules/cifar10_bs128.py',
    '../_base_/default_runtime.py',
]
# Model config
model = dict(
    # the backbone `init_cfg` (with `prefix='backbone'`) and head `num_classes=10` settings from the section above
)
# Dataset config
# data pipeline settings
train_pipeline = [
    dict(type='RandomCrop', crop_size=32, padding=4),
    dict(type='RandomFlip', prob=0.5, direction='horizontal'),
    dict(type='Resize', scale=224),
    dict(type='PackClsInputs'),
]
test_pipeline = [
    dict(type='Resize', scale=224),
    dict(type='PackClsInputs'),
]
# dataloader settings
train_dataloader = dict(dataset=dict(pipeline=train_pipeline))
val_dataloader = dict(dataset=dict(pipeline=test_pipeline))
test_dataloader = val_dataloader
# Training schedule config
# lr is set for a batch size of 128
optim_wrapper = dict(
    optimizer=dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001))
# learning policy
param_scheduler = dict(
    type='MultiStepLR', by_epoch=True, milestones=[15], gamma=0.1)
```
Next, we use a machine with 8 GPUs to train the model, with the following command:
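```shell
bash tools/dist_train.sh configs/tutorial/resnet50_finetune_cifar.py 8
```
Of course, we can also train the model with a single GPU, using the following command:
```shell
python tools/train.py configs/tutorial/resnet50_finetune_cifar.py
```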
But if we train with a single GPU, the dataset settings need to be modified as follows:
```python
train_dataloader = dict(
    batch_size=128,
    dataset=dict(pipeline=train_pipeline),
)
val_dataloader = dict(
    batch_size=128,
    dataset=dict(pipeline=test_pipeline),
)
test_dataloader = val_dataloader
```
This is because our training schedule is designed for a batch size of 128. The parent config file sets `batch_size=16` per GPU;
if we use 8 GPUs, the total batch size is 128. But when using a single GPU, you have to
change it to `batch_size=128` manually to match the training schedule.