Merge pull request #2038 from cuicheng01/update_pulc_docs

update pulc docs
2022-06-13 17:35:53 +08:00
parent 8a6acfbdd4 c1530e1eea
commit 9b48fefd2e
4 changed files with 26 additions and 19 deletions
--- a/docs/zh_CN/PULC/PULC_language_classification.md
+++ b/docs/zh_CN/PULC/PULC_language_classification.md
@@ -112,7 +112,7 @@ print(next(result))

 ### 3.1 环境配置

- 安装：请先参考 [Paddle 安装教程](../installation/install_paddle.md) 以及 [PaddleClas 安装教程](../installation/install_paddleclas.md) 配置 PaddleClas 运行环境。
+* 安装：请先参考文档 [环境准备](../installation/install_paddleclas.md) 配置 PaddleClas 运行环境。

 <a name="3.2"></a>

@@ -142,7 +142,7 @@ print(next(result))

 如果想要制作自己的多语种数据集，可以按照需求收集并整理自己任务中需要语种的数据，此处提供了经过上述方法处理好的demo数据，可以直接下载得到。

-**备注：**语种分类任务中的图片数据需要将整图中的文字区域抠取出来，仅仅使用文本行部分作为图片数据。
+**备注：** 语种分类任务中的图片数据需要将整图中的文字区域抠取出来，仅仅使用文本行部分作为图片数据。

 进入 PaddleClas 目录。

@@ -178,7 +178,6 @@ cd ../

 -  这里的`label_list.txt`是4类语种分类模型对应的类别列表，如果自己构造的数据集语种类别发生变化，需要自行调整。
 -  如果想要自己构造训练集和验证集，可以参考[PaddleClas分类数据集格式说明](../data_preparation/classification_dataset.md#1-数据集格式说明) 。
-  当使用本文档中的demo数据集时，需要添加`-o Arch.class_num=4`来将模型的类别书指定为4。

 <a name="3.3"></a>

@@ -191,9 +190,12 @@ export CUDA_VISIBLE_DEVICES=0,1,2,3
 python3 -m paddle.distributed.launch \
    --gpus="0,1,2,3" \
    tools/train.py \
-        -c ./ppcls/configs/PULC/language_classification/PPLCNet_x1_0.yaml
+        -c ./ppcls/configs/PULC/language_classification/PPLCNet_x1_0.yaml \
+        -o Arch.class_num=4
 ```

+-  由于本文档中的demo数据集的类别数量为 4，所以需要添加`-o Arch.class_num=4`来将模型的类别数量指定为4。
+
 <a name="3.4"></a>

 ### 3.4 模型评估
@@ -203,7 +205,8 @@ python3 -m paddle.distributed.launch \
 ```bash
 python3 tools/eval.py \
    -c ./ppcls/configs/PULC/language_classification/PPLCNet_x1_0.yaml \
-    -o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
+    -o Global.pretrained_model="output/PPLCNet_x1_0/best_model" \
+    -o Arch.class_num=4
 ```

 其中 `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` 指定了当前最佳权重所在的路径，如果指定其他权重，只需替换对应的路径即可。
@@ -217,7 +220,8 @@ python3 tools/eval.py \
 ```bash
 python3 tools/infer.py \
    -c ./ppcls/configs/PULC/language_classification/PPLCNet_x1_0.yaml \
-    -o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
+    -o Global.pretrained_model="output/PPLCNet_x1_0/best_model" \
+    -o Arch.class_num=4
 ```

 输出结果如下：
@@ -253,8 +257,9 @@ export CUDA_VISIBLE_DEVICES=0,1,2,3
 python3 -m paddle.distributed.launch \
    --gpus="0,1,2,3" \
    tools/train.py \
-        -c ./ppcls/configs/PULC/language_classification/PPLCNet/PPLCNet_x1_0.yaml \
-        -o Arch.name=ResNet101_vd
+        -c ./ppcls/configs/PULC/language_classification/PPLCNet_x1_0.yaml \
+        -o Arch.name=ResNet101_vd \
+        -o Arch.class_num=4
 ```

 当前教师模型最好的权重保存在`output/ResNet101_vd/best_model.pdparams`。
@@ -273,7 +278,8 @@ python3 -m paddle.distributed.launch \
    --gpus="0,1,2,3" \
    tools/train.py \
        -c ./ppcls/configs/PULC/language_classification/PPLCNet_x1_0_distillation.yaml \
-        -o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
+        -o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model \
+        -o Arch.class_num=4
 ```

 当前模型最好的权重保存在`output/DistillationModel/best_model_student.pdparams`。
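The `-o Arch.class_num=4` flag added in the hunks above overrides one nested config key from the command line. As a rough illustration of the idea only, the sketch below applies a dotted `key=value` override to a nested dict. `apply_override`, `parse_value`, and the sample `base` config (including its `class_num` of 10) are hypothetical; this is not PaddleClas's actual option parser, and it treats every path segment as a dict key, so list indices such as the `0` in `Arch.models.0.Teacher.pretrained` are not handled.

```python
import ast
import copy


def parse_value(raw):
    """Interpret "4" as int 4, "False" as a bool; otherwise keep the raw string."""
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return raw


def apply_override(config, option):
    """Return a copy of `config` with one dotted `key=value` override applied."""
    dotted_key, sep, raw_value = option.partition("=")
    if not sep or not dotted_key:
        raise ValueError(f"expected key=value, got {option!r}")
    result = copy.deepcopy(config)  # leave the caller's config untouched
    node = result
    *parents, leaf = dotted_key.split(".")
    for name in parents:
        # Walk down the nested dicts, creating missing levels on the way.
        node = node.setdefault(name, {})
    node[leaf] = parse_value(raw_value)
    return result


# Hypothetical stand-in for the YAML config; the values are assumptions.
base = {"Arch": {"name": "PPLCNet_x1_0", "class_num": 10}}
patched = apply_override(base, "Arch.class_num=4")
print(patched["Arch"])
```

Returning a patched copy rather than mutating `base` keeps the defaults available, which is convenient when the same config is reused for training, evaluation, and inference with different overrides.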
--- a/docs/zh_CN/PULC/PULC_safety_helmet.md
+++ b/docs/zh_CN/PULC/PULC_safety_helmet.md
@@ -114,7 +114,7 @@ print(next(result))

 ### 3.1 环境配置

-* 安装：请先参考 [Paddle 安装教程](../installation/install_paddle.md) 以及 [PaddleClas 安装教程](../installation/install_paddleclas.md) 配置 PaddleClas 运行环境。
+* 安装：请先参考文档 [环境准备](../installation/install_paddleclas.md) 配置 PaddleClas 运行环境。

 <a name="3.2"></a>

@@ -349,7 +349,7 @@ cd ../

 ```shell
 # 使用下面的命令使用 GPU 进行预测
-python3.7 python/predict_cls.py -c configs/PULC/safety_helmet/inference_safety_helmet.yaml
+c
 # 使用下面的命令使用 CPU 进行预测
 python3.7 python/predict_cls.py -c configs/PULC/safety_helmet/inference_safety_helmet.yaml -o Global.use_gpu=False
 ```
--- a/docs/zh_CN/PULC/PULC_text_image_orientation.md
+++ b/docs/zh_CN/PULC/PULC_text_image_orientation.md
@@ -111,7 +111,7 @@ print(next(result))

 ### 3.1 环境配置

- 安装：请先参考 [Paddle 安装教程](../installation/install_paddle.md) 以及 [PaddleClas 安装教程](../installation/install_paddleclas.md) 配置 PaddleClas 运行环境。
+* 安装：请先参考文档 [环境准备](../installation/install_paddleclas.md) 配置 PaddleClas 运行环境。

 <a name="3.2"></a>

@@ -209,7 +209,7 @@ python3 -m paddle.distributed.launch \

 验证集的最佳指标在0.99左右。

-**备注**：本文档中提到的训练指标均为在大规模内部数据上的训练指标，使用demo数据训练时，由于数据集规模较小且分布与大规模内部数据不同，无法达到该指标。可以进一步扩充自己的数据并且使用本案例中介绍的优化方法进行调优，从而达到更高的精度。
+**备注**：本文档中提到的训练指标均为在大规模内部数据上的训练指标，使用 demo 数据训练时，由于数据集规模较小且分布与大规模内部数据不同，无法达到该指标。可以进一步扩充自己的数据并且使用本案例中介绍的优化方法进行调优，从而达到更高的精度。

 <a name="3.4"></a>

@@ -274,15 +274,15 @@ python3 -m paddle.distributed.launch \
        -o Arch.name=ResNet101_vd
 ```

-验证集的最佳指标为0.996左右，当前教师模型最好的权重保存在`output/ResNet101_vd/best_model.pdparams`。
+验证集的最佳指标为 0.996 左右，当前教师模型最好的权重保存在`output/ResNet101_vd/best_model.pdparams`。

-**备注：** 训练ResNet101_vd模型需要的显存较多，如果机器显存不够，可以将学习率和 batch size 同时缩小一定的倍数进行训练。
+**备注：** 训练 ResNet101_vd 模型需要的显存较多，如果机器显存不够，可以将学习率和 batch size 同时缩小一定的倍数进行训练。如在命令后添加以下参数 `-o DataLoader.Train.sampler.batch_size=64`, `Optimizer.lr.learning_rate=0.1`。

 <a name="4.1.2"></a>

 #### 4.1.2 蒸馏训练

-配置文件`ppcls/configs/PULC/text_image_orientation/PPLCNet_x1_0_distillation.yaml`提供了`SKL-UGI知识蒸馏策略`的配置。该配置将`ResNet101_vd`当作教师模型，`PPLCNet_x1_0`当作学生模型，使用[3.2.2节](#3.2.2)中介绍的蒸馏数据作为新增的无标签数据。训练脚本如下：
+配置文件`ppcls/configs/PULC/text_image_orientation/PPLCNet_x1_0_distillation.yaml`提供了`SKL-UGI 知识蒸馏策略`的配置。该配置将 `ResNet101_vd` 当作教师模型，`PPLCNet_x1_0` 当作学生模型，使用[3.2.2节](#3.2.2)中介绍的蒸馏数据作为新增的无标签数据。训练脚本如下：

 ```shell
 export CUDA_VISIBLE_DEVICES=0,1,2,3
--- a/ppcls/arch/backbone/legendary_models/resnet.py
+++ b/ppcls/arch/backbone/legendary_models/resnet.py
@@ -26,6 +26,7 @@ from paddle.nn.initializer import Uniform
 from paddle.regularizer import L2Decay
 import math

+from ppcls.utils import logger
 from ppcls.arch.backbone.base.theseus_layer import TheseusLayer
 from ppcls.utils.save_load import load_dygraph_pretrain, load_dygraph_pretrain_from_url

@@ -306,9 +307,9 @@ class ResNet(TheseusLayer):
            list, tuple
        )), "lr_mult_list should be in (list, tuple) but got {}".format(
            type(self.lr_mult_list))
-        assert len(self.lr_mult_list
-                   ) == 5, "lr_mult_list length should be 5 but got {}".format(
-                       len(self.lr_mult_list))
+        if len(self.lr_mult_list) != 5:
+            msg = "lr_mult_list length should be 5 but got {}, default lr_mult_list used".format(len(self.lr_mult_list))
+            logger.warning(msg)

        assert isinstance(self.stride_list, (
            list, tuple
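
The `resnet.py` hunk above replaces a hard `assert` on `len(self.lr_mult_list)` with a logged warning whose message says a default list is used. The sketch below shows that warn-and-fall-back pattern in isolation, using the standard `logging` module instead of `ppcls.utils.logger`. The helper name `normalize_lr_mult_list` and the all-ones `DEFAULT_LR_MULT_LIST` are assumptions: the diff shown here does not include the code that resets the list, so the real default may differ.

```python
import logging

logger = logging.getLogger("lr_mult_sketch")

# Assumed default (not taken from the diff): a multiplier of 1.0 for each of 5 entries.
DEFAULT_LR_MULT_LIST = [1.0, 1.0, 1.0, 1.0, 1.0]


def normalize_lr_mult_list(lr_mult_list):
    """Return a 5-element list, warning and falling back instead of asserting."""
    if not isinstance(lr_mult_list, (list, tuple)):
        raise TypeError(
            "lr_mult_list should be in (list, tuple) but got {}".format(
                type(lr_mult_list)))
    if len(lr_mult_list) != 5:
        # A wrong length is no longer fatal: log it and use the default.
        logger.warning(
            "lr_mult_list length should be 5 but got %d, default lr_mult_list used",
            len(lr_mult_list))
        return list(DEFAULT_LR_MULT_LIST)
    return list(lr_mult_list)


print(normalize_lr_mult_list([0.1, 0.2]))
print(normalize_lr_mult_list([0.1, 0.2, 0.3, 0.4, 0.5]))
```

The trade-off is that a mis-sized list no longer stops training, so the warning in the log becomes the only sign that the configured multipliers were ignored.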