Merge branch 'PaddlePaddle:develop' into develop

pull/1193/head
lilithzhou 2021-09-01 15:10:14 +08:00 committed by GitHub
commit 1210665a8b
30 changed files with 1414 additions and 879 deletions


@ -7,29 +7,30 @@ VisualDL, a visualization analysis tool of PaddlePaddle, provides a variety of c
PaddleClas now supports using VisualDL to visualize the changes of learning rate, loss and accuracy during training.
### Set config and start training
You only need to set the `vdl_dir` field in train config:
You only need to set the field `Global.use_visualdl` to `True` in train config:
```yaml
# config.yaml
vdl_dir: "./vdl.log"
Global:
...
use_visualdl: True
...
```
`vdl_dir`: Specify the directory where VisualDL stores logs.
Then normal start training:
PaddleClas saves the VisualDL logs to the `vdl/` subdirectory under the output directory specified by `Global.output_dir`. Then start training normally:
```shell
python3 tools/train.py -c config.yaml
```
### Start VisualDL
After starting the training program, you can start the VisualDL service in the new terminal session:
After starting the training program, you can start the VisualDL service in a new terminal session:
```shell
visualdl --logdir ./vdl.log
visualdl --logdir ./output/vdl/
```
In the above command, `--logdir` specify the logs directory. VisualDL will traverse and iterate to find the subdirectories of the specified directory to visualize all the experimental results. You can also use the following parameters to set the IP and port number of the VisualDL service:
In the above command, `--logdir` specifies the directory of the VisualDL logs produced during training. VisualDL will recursively traverse the subdirectories of the specified directory to visualize all experiment results. You can also use the following parameters to set the IP and port number of the VisualDL service:
* `--host`: specify the IP address, default is 127.0.0.1
* `--port`: specify the port, default is 8040


@ -23,7 +23,7 @@ Among them, `-c` is used to specify the path of the configuration file, `-o` is
`-o use_gpu=True` means to use GPU for training. If you want to use the CPU for training, you need to set `use_gpu` to `False`.
Of course, you can also directly modify the configuration file to update the configuration. For specific configuration parameters, please refer to [Configuration Document](config_en.md).
Of course, you can also directly modify the configuration file to update the configuration. For specific configuration parameters, please refer to [Configuration Document](config_description_en.md).
* The output log examples are as follows:
* If mixup or cutmix is used in training, top-1 and top-k (k defaults to 5) accuracy will not be printed in the log:


@ -7,15 +7,17 @@ VisualDL is PaddlePaddle's visualization and analysis tool, presenting changes of training parameters in rich charts
PaddleClas now supports using VisualDL during training to view the changes of learning rate, loss and accuracy.
### Set the config file and start training
To use VisualDL in PaddleClas, you only need to add the following field to the training config file:
To use VisualDL in PaddleClas, you only need to set the field `Global.use_visualdl` to `True` in the training config file:
```yaml
# config.yaml
vdl_dir: "./vdl.log"
Global:
...
use_visualdl: True
...
```
`vdl_dir` specifies the directory where VisualDL saves its logs.
Then start training normally:
PaddleClas will save the VisualDL logs to the `vdl/` subdirectory under the directory specified by `Global.output_dir`; then start training normally:
```shell
python3 tools/train.py -c config.yaml
@ -25,10 +27,10 @@ python3 tools/train.py -c config.yaml
After starting the training program, you can start the VisualDL service in a new terminal session:
```shell
visualdl --logdir ./vdl.log
visualdl --logdir ./output/vdl/
```
In the above command, `--logdir` specifies the log directory. VisualDL will recursively traverse the subdirectories of the specified directory to visualize all experiment results. You can also use the following parameters to set the IP and port of the VisualDL service:
In the above command, `--logdir` specifies the directory where the VisualDL logs are saved. VisualDL will recursively traverse the subdirectories of the specified directory to visualize all experiment results. You can also use the following parameters to set the IP and port of the VisualDL service:
* `--host`: specify the IP address, default is 127.0.0.1
* `--port`: specify the port, default is 8040


@ -30,7 +30,7 @@ python3 tools/train.py \
Here, `-c` specifies the path of the configuration file and `-o` specifies the parameters to be modified or added, where `-o Arch.pretrained=False` means not using a pretrained model and `-o Global.device=gpu` means training with GPU. To train with CPU, set `Global.device` to `cpu`.
For more detailed training configuration, you can also directly modify the configuration file of the model. For specific configuration parameters, refer to the [configuration document](config.md).
For more detailed training configuration, you can also directly modify the configuration file of the model. For specific configuration parameters, refer to the [configuration document](config_description.md).
Run the above command, and you can see the output log. An example is shown below:


@ -41,10 +41,15 @@ class _SysPathG(object):
self.path)
with _SysPathG(
os.path.join(
os.path.dirname(os.path.abspath(__file__)), 'ppcls', 'arch')):
import backbone
with _SysPathG(os.path.dirname(os.path.abspath(__file__)), ):
import ppcls
import ppcls.arch.backbone as backbone
def ppclas_init():
if ppcls.utils.logger._logger is None:
ppcls.utils.logger.init_logger()
ppclas_init()
def _load_pretrained_parameters(model, name):
url = 'https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/{}_pretrained.pdparams'.format(
@ -63,9 +68,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `AlexNet` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.AlexNet(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'AlexNet')
return model
@ -80,9 +84,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `VGG11` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.VGG11(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'VGG11')
return model
@ -97,9 +100,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `VGG13` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.VGG13(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'VGG13')
return model
@ -114,9 +116,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `VGG16` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.VGG16(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'VGG16')
return model
@ -131,9 +132,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `VGG19` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.VGG19(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'VGG19')
return model
@ -149,9 +149,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNet18` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNet18(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNet18')
return model
@ -167,9 +166,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNet34` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNet34(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNet34')
return model
@ -185,9 +183,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNet50` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNet50(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNet50')
return model
@ -203,9 +200,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNet101` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNet101(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNet101')
return model
@ -221,9 +217,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNet152` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNet152(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNet152')
return model
@ -237,9 +232,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `SqueezeNet1_0` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.SqueezeNet1_0(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'SqueezeNet1_0')
return model
@ -253,9 +247,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `SqueezeNet1_1` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.SqueezeNet1_1(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'SqueezeNet1_1')
return model
@ -271,9 +264,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `DenseNet121` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.DenseNet121(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'DenseNet121')
return model
@ -289,9 +281,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `DenseNet161` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.DenseNet161(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'DenseNet161')
return model
@ -307,9 +298,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `DenseNet169` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.DenseNet169(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'DenseNet169')
return model
@ -325,9 +315,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `DenseNet201` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.DenseNet201(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'DenseNet201')
return model
@ -343,9 +332,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `DenseNet264` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.DenseNet264(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'DenseNet264')
return model
@ -359,9 +347,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `InceptionV3` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.InceptionV3(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'InceptionV3')
return model
@ -375,9 +362,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `InceptionV4` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.InceptionV4(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'InceptionV4')
return model
@ -391,9 +377,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `GoogLeNet` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.GoogLeNet(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'GoogLeNet')
return model
@ -407,9 +392,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ShuffleNetV2_x0_25` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ShuffleNetV2_x0_25(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ShuffleNetV2_x0_25')
return model
@ -423,9 +407,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV1` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV1(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'MobileNetV1')
return model
@ -439,9 +422,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV1_x0_25` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV1_x0_25(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'MobileNetV1_x0_25')
return model
@ -455,9 +437,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV1_x0_5` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV1_x0_5(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'MobileNetV1_x0_5')
return model
@ -471,9 +452,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV1_x0_75` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV1_x0_75(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'MobileNetV1_x0_75')
return model
@ -487,9 +467,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV2_x0_25` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV2_x0_25(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'MobileNetV2_x0_25')
return model
@ -503,9 +482,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV2_x0_5` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV2_x0_5(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'MobileNetV2_x0_5')
return model
@ -519,9 +497,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV2_x0_75` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV2_x0_75(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'MobileNetV2_x0_75')
return model
@ -535,9 +512,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV2_x1_5` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV2_x1_5(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'MobileNetV2_x1_5')
return model
@ -551,9 +527,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV2_x2_0` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV2_x2_0(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'MobileNetV2_x2_0')
return model
@ -567,10 +542,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV3_large_x0_35` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV3_large_x0_35(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model,
'MobileNetV3_large_x0_35')
return model
@ -584,10 +557,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV3_large_x0_5` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV3_large_x0_5(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model,
'MobileNetV3_large_x0_5')
return model
@ -601,10 +572,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV3_large_x0_75` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV3_large_x0_75(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model,
'MobileNetV3_large_x0_75')
return model
@ -618,10 +587,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV3_large_x1_0` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV3_large_x1_0(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model,
'MobileNetV3_large_x1_0')
return model
@ -635,10 +602,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV3_large_x1_25` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV3_large_x1_25(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model,
'MobileNetV3_large_x1_25')
return model
@ -652,10 +617,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV3_small_x0_35` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV3_small_x0_35(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model,
'MobileNetV3_small_x0_35')
return model
@ -669,10 +632,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV3_small_x0_5` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV3_small_x0_5(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model,
'MobileNetV3_small_x0_5')
return model
@ -686,10 +647,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV3_small_x0_75` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV3_small_x0_75(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model,
'MobileNetV3_small_x0_75')
return model
@ -703,10 +662,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV3_small_x1_0` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV3_small_x1_0(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model,
'MobileNetV3_small_x1_0')
return model
@ -720,10 +677,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV3_small_x1_25` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV3_small_x1_25(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model,
'MobileNetV3_small_x1_25')
return model
@ -737,9 +692,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNeXt101_32x4d` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNeXt101_32x4d(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNeXt101_32x4d')
return model
@ -753,9 +707,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNeXt101_64x4d` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNeXt101_64x4d(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNeXt101_64x4d')
return model
@ -769,9 +722,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNeXt152_32x4d` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNeXt152_32x4d(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNeXt152_32x4d')
return model
@ -785,9 +737,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNeXt152_64x4d` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNeXt152_64x4d(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNeXt152_64x4d')
return model
@ -801,9 +752,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNeXt50_32x4d` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNeXt50_32x4d(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNeXt50_32x4d')
return model
@ -817,9 +767,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNeXt50_64x4d` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNeXt50_64x4d(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNeXt50_64x4d')
return model
@ -833,8 +782,7 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNeXt50_64x4d` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.DarkNet53(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'DarkNet53')
return model
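The pattern repeated across the hunks above — forwarding `pretrained` through `**kwargs` so the backbone constructor handles weight loading, instead of calling `_load_pretrained_parameters` afterwards — can be sketched generically. This is a simplified sketch with a stand-in backbone class, not the repository's code:

```python
def make_backbone_entry(backbone_cls):
    """Build a hub-style entry point that forwards `pretrained` to the backbone."""
    def entry(pretrained=False, **kwargs):
        # new pattern: the backbone constructor itself loads pretrained weights,
        # so the wrapper only forwards the flag instead of loading them afterwards
        kwargs.update({'pretrained': pretrained})
        return backbone_cls(**kwargs)
    return entry

class DummyNet:
    """Stand-in backbone used only for illustration."""
    def __init__(self, pretrained=False, class_num=1000):
        self.pretrained = pretrained
        self.class_num = class_num

# one wrapper per model name, as in the hub entries above
AlexNet = make_backbone_entry(DummyNet)
model = AlexNet(pretrained=True, class_num=10)
```

This keeps every hub entry a thin, uniform wrapper and leaves the download-and-load logic in one place inside the backbone.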


@ -58,6 +58,7 @@ from ppcls.arch.backbone.model_zoo.rednet import RedNet26, RedNet38, RedNet50, R
from ppcls.arch.backbone.model_zoo.tnt import TNT_small
from ppcls.arch.backbone.model_zoo.hardnet import HarDNet68, HarDNet85, HarDNet39_ds, HarDNet68_ds
from ppcls.arch.backbone.variant_models.resnet_variant import ResNet50_last_stage_stride1
from ppcls.arch.backbone.variant_models.vgg_variant import VGG19Sigmoid
def get_apis():


@ -33,9 +33,9 @@ MODEL_URLS = {
"SwinTransformer_base_patch4_window12_384":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/SwinTransformer_base_patch4_window12_384_pretrained.pdparams",
"SwinTransformer_large_patch4_window7_224":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/SwinTransformer_large_patch4_window7_224_pretrained.pdparams",
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/SwinTransformer_large_patch4_window7_224_22kto1k_pretrained.pdparams",
"SwinTransformer_large_patch4_window12_384":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/SwinTransformer_large_patch4_window12_384_pretrained.pdparams",
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/SwinTransformer_large_patch4_window12_384_22kto1k_pretrained.pdparams",
}
__all__ = list(MODEL_URLS.keys())


@ -1 +1,2 @@
from .resnet_variant import ResNet50_last_stage_stride1
from .vgg_variant import VGG19Sigmoid


@ -0,0 +1,28 @@
import paddle
from paddle.nn import Sigmoid
from ppcls.arch.backbone.legendary_models.vgg import VGG19
__all__ = ["VGG19Sigmoid"]
class SigmoidSuffix(paddle.nn.Layer):
def __init__(self, origin_layer):
super(SigmoidSuffix, self).__init__()
self.origin_layer = origin_layer
self.sigmoid = Sigmoid()
def forward(self, input, res_dict=None, **kwargs):
x = self.origin_layer(input)
x = self.sigmoid(x)
return x
def VGG19Sigmoid(pretrained=False, use_ssld=False, **kwargs):
def replace_function(origin_layer):
new_layer = SigmoidSuffix(origin_layer)
return new_layer
match_re = "linear_2"
model = VGG19(pretrained=pretrained, use_ssld=use_ssld, **kwargs)
model.replace_sub(match_re, replace_function, True)
return model
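The layer-wrapping idea in `SigmoidSuffix` — delegate to the original layer, then squash the result through a sigmoid — can be sketched without Paddle. In this framework-free sketch the `Sigmoid` layer and `replace_sub` machinery are replaced by plain Python; only the wrapping pattern is illustrated:

```python
import math

class SigmoidSuffix:
    """Wrap a callable layer and squash its scalar output through a sigmoid."""
    def __init__(self, origin_layer):
        self.origin_layer = origin_layer

    def __call__(self, x):
        y = self.origin_layer(x)            # delegate to the wrapped layer
        return 1.0 / (1.0 + math.exp(-y))   # then apply the sigmoid suffix

final_linear = lambda x: 2.0 * x            # stand-in for the model's last Linear layer
head = SigmoidSuffix(final_linear)
```

Wrapping rather than editing the original layer means the pretrained weights of `linear_2` are untouched; only the output activation changes.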


@ -28,7 +28,7 @@ class CircleMargin(nn.Layer):
weight_attr = paddle.ParamAttr(
initializer=paddle.nn.initializer.XavierNormal())
self.fc0 = paddle.nn.Linear(
self.fc = paddle.nn.Linear(
self.embedding_size, self.class_num, weight_attr=weight_attr)
def forward(self, input, label):
@ -36,19 +36,22 @@ class CircleMargin(nn.Layer):
paddle.sum(paddle.square(input), axis=1, keepdim=True))
input = paddle.divide(input, feat_norm)
weight = self.fc0.weight
weight = self.fc.weight
weight_norm = paddle.sqrt(
paddle.sum(paddle.square(weight), axis=0, keepdim=True))
weight = paddle.divide(weight, weight_norm)
logits = paddle.matmul(input, weight)
if not self.training or label is None:
return logits
alpha_p = paddle.clip(-logits.detach() + 1 + self.margin, min=0.)
alpha_n = paddle.clip(logits.detach() + self.margin, min=0.)
delta_p = 1 - self.margin
delta_n = self.margin
index = paddle.fluid.layers.where(label != -1).reshape([-1])
m_hot = F.one_hot(label.reshape([-1]), num_classes=logits.shape[1])
logits_p = alpha_p * (logits - delta_p)
logits_n = alpha_n * (logits - delta_n)
pre_logits = logits_p * m_hot + logits_n * (1 - m_hot)
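The margin arithmetic in the hunk above can be checked with plain scalars. This is a scalar sketch of the Circle-margin logit adjustment (`alpha_p`, `alpha_n`, `delta_p`, `delta_n` as in the code), independent of Paddle:

```python
def circle_adjust(logit, margin, is_positive):
    """Scale-and-shift a cosine logit as in the Circle margin (scalar sketch)."""
    if is_positive:
        alpha = max(-logit + 1 + margin, 0.0)   # alpha_p: small when logit is near 1
        return alpha * (logit - (1 - margin))   # logits_p with delta_p = 1 - margin
    alpha = max(logit + margin, 0.0)            # alpha_n: small when logit is near -1
    return alpha * (logit - margin)             # logits_n with delta_n = margin
```

A well-classified positive (logit close to 1) gets a small `alpha_p`, so it contributes little to the loss, which is the self-paced weighting that distinguishes Circle loss from a fixed-margin softmax.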


@ -46,6 +46,9 @@ class CosMargin(paddle.nn.Layer):
weight = paddle.divide(weight, weight_norm)
cos = paddle.matmul(input, weight)
if not self.training or label is None:
return cos
cos_m = cos - self.margin
one_hot = paddle.nn.functional.one_hot(label, self.class_num)
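The cosine-margin step shown here subtracts the margin only from the ground-truth class's logit (via the one-hot mask). As plain scalars, a sketch independent of Paddle:

```python
def cos_margin(cos_logits, label, margin):
    """Subtract the margin from the target class's cosine logit (list sketch)."""
    out = list(cos_logits)
    out[label] -= margin          # only the ground-truth class is penalized
    return out

# the target class must beat the others by at least `margin` to keep its lead
adjusted = cos_margin([0.9, 0.1], 0, 0.35)
```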


@ -0,0 +1,149 @@
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output_dlbhc/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 100
#eval_mode: "retrieval"
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
#feature postprocess
feature_normalize: False
feature_binarize: "round"
# model architecture
Arch:
name: "RecModel"
Backbone:
name: "MobileNetV3_large_x1_0"
pretrained: True
class_num: 512
Head:
name: "FC"
class_num: 50030
embedding_size: 512
infer_output_key: "features"
infer_add_softmax: "false"
# loss function config for train/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Piecewise
learning_rate: 0.1
decay_epochs: [50, 150]
values: [0.1, 0.01, 0.001]
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/Aliproduct/
cls_label_path: ./dataset/Aliproduct/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: 256
- RandCropImage:
size: 227
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.4914, 0.4822, 0.4465]
std: [0.2023, 0.1994, 0.2010]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/Aliproduct/
cls_label_path: ./dataset/Aliproduct/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: 227
- NormalizeImage:
scale: 1.0/255.0
mean: [0.4914, 0.4822, 0.4465]
std: [0.2023, 0.1994, 0.2010]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 256
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 227
- NormalizeImage:
scale: 1.0/255.0
mean: [0.4914, 0.4822, 0.4465]
std: [0.2023, 0.1994, 0.2010]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# switch to metric below when eval by retrieval
# - Recallk:
# topk: [1]
# - mAP:
# - Precisionk:
# topk: [1]
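The `Piecewise` schedule in the config above drops the learning rate at the listed `decay_epochs`. The lookup can be sketched in plain Python (a sketch, not PaddleClas's implementation; whether the boundary epoch itself uses the old or new rate is an assumption of this sketch):

```python
import bisect

def piecewise_lr(epoch, decay_epochs, values):
    """Return the learning rate for `epoch` under a piecewise-constant schedule.

    `values` must have exactly one more entry than `decay_epochs`.
    """
    return values[bisect.bisect_right(decay_epochs, epoch)]

# schedule from the config: decay_epochs [50, 150], values [0.1, 0.01, 0.001]
lr_at_start = piecewise_lr(0, [50, 150], [0.1, 0.01, 0.001])
```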


@ -0,0 +1,147 @@
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
eval_mode: "retrieval"
epochs: 128
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
#feature postprocess
feature_normalize: False
feature_binarize: "round"
# model architecture
Arch:
name: "RecModel"
Backbone:
name: "VGG19Sigmoid"
pretrained: True
class_num: 48
Head:
name: "FC"
class_num: 10
embedding_size: 48
infer_output_key: "features"
infer_add_softmax: "false"
# loss function config for train/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Piecewise
learning_rate: 0.01
decay_epochs: [200]
values: [0.01, 0.001]
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/cifar10/
cls_label_path: ./dataset/cifar10/cifar10-2/train.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: 256
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.4914, 0.4822, 0.4465]
std: [0.2023, 0.1994, 0.2010]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
Query:
dataset:
name: ImageNetDataset
image_root: ./dataset/cifar10/
cls_label_path: ./dataset/cifar10/cifar10-2/test.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.4914, 0.4822, 0.4465]
std: [0.2023, 0.1994, 0.2010]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 512
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Gallery:
dataset:
name: ImageNetDataset
image_root: ./dataset/cifar10/
cls_label_path: ./dataset/cifar10/cifar10-2/database.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.4914, 0.4822, 0.4465]
std: [0.2023, 0.1994, 0.2010]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 512
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- mAP:
- Precisionk:
topk: [1, 5]
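With `eval_mode: "retrieval"`, evaluation matches each `Query` item against the `Gallery` set. Recall@k over a dot-product similarity can be sketched as follows (a plain-Python sketch, not PaddleClas's metric code):

```python
def recall_at_k(query_feats, query_labels, gallery_feats, gallery_labels, k=1):
    """Fraction of queries whose top-k most similar gallery items share the label."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    hits = 0
    for qf, ql in zip(query_feats, query_labels):
        # rank gallery indices by similarity to this query, most similar first
        ranked = sorted(range(len(gallery_feats)),
                        key=lambda i: dot(qf, gallery_feats[i]), reverse=True)
        if any(gallery_labels[i] == ql for i in ranked[:k]):
            hits += 1
    return hits / len(query_feats)

gallery = [[1.0, 0.0], [0.0, 1.0]]
queries = [[0.9, 0.1], [0.2, 0.8]]
score = recall_at_k(queries, [0, 1], gallery, [0, 1], k=1)
```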


@ -42,9 +42,9 @@ class RandomErasing(object):
h = int(round(math.sqrt(target_area * aspect_ratio)))
w = int(round(math.sqrt(target_area / aspect_ratio)))
if w < img.shape[2] and h < img.shape[1]:
x1 = random.randint(0, img.shape[1] - h)
y1 = random.randint(0, img.shape[2] - w)
if w < img.shape[1] and h < img.shape[0]:
x1 = random.randint(0, img.shape[0] - h)
y1 = random.randint(0, img.shape[1] - w)
if img.shape[0] == 3:
img[x1:x1 + h, y1:y1 + w, 0] = self.mean[0]
img[x1:x1 + h, y1:y1 + w, 1] = self.mean[1]
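The corrected indexing treats the image as HWC, so the erase box must be bounded by `shape[0]` (height) and `shape[1]` (width). The bounds sampling can be sketched on its own (a simplified sketch of just the box placement, without the area/aspect-ratio search):

```python
import random

def sample_erase_box(img_h, img_w, h, w, rng=random):
    """Pick the top-left corner of an h-by-w erase box inside an img_h-by-img_w image."""
    assert h < img_h and w < img_w
    x1 = rng.randint(0, img_h - h)   # row offset, bounded by image height
    y1 = rng.randint(0, img_w - w)   # column offset, bounded by image width
    return x1, y1

x1, y1 = sample_erase_box(224, 300, 50, 30)
```

With the old CHW-style indexing, `img.shape[2]` would have been the channel count for an HWC array, making the box bounds wrong; bounding by height and width directly avoids that.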


@ -0,0 +1,391 @@
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import platform
import paddle
import paddle.distributed as dist
from visualdl import LogWriter
from paddle import nn
from ppcls.utils.check import check_gpu
from ppcls.utils.misc import AverageMeter
from ppcls.utils import logger
from ppcls.utils.logger import init_logger
from ppcls.utils.config import print_config
from ppcls.data import build_dataloader
from ppcls.arch import build_model, RecModel, DistillationModel
from ppcls.arch import apply_to_static
from ppcls.loss import build_loss
from ppcls.metric import build_metrics
from ppcls.optimizer import build_optimizer
from ppcls.utils.save_load import load_dygraph_pretrain, load_dygraph_pretrain_from_url
from ppcls.utils.save_load import init_model
from ppcls.utils import save_load
from ppcls.data.utils.get_image_list import get_image_list
from ppcls.data.postprocess import build_postprocess
from ppcls.data import create_operators
from ppcls.engine.train import train_epoch
from ppcls.engine import evaluation
from ppcls.arch.gears.identity_head import IdentityHead
class Engine(object):
def __init__(self, config, mode="train"):
assert mode in ["train", "eval", "infer", "export"]
self.mode = mode
self.config = config
self.eval_mode = self.config["Global"].get("eval_mode",
"classification")
# init logger
self.output_dir = self.config['Global']['output_dir']
log_file = os.path.join(self.output_dir, self.config["Arch"]["name"],
f"{mode}.log")
init_logger(name='root', log_file=log_file)
print_config(config)
# init train_func and eval_func
assert self.eval_mode in ["classification", "retrieval"], logger.error(
"Invalid eval mode: {}".format(self.eval_mode))
self.train_epoch_func = train_epoch
self.eval_func = getattr(evaluation, self.eval_mode + "_eval")
self.use_dali = self.config['Global'].get("use_dali", False)
# for visualdl
self.vdl_writer = None
if self.config['Global']['use_visualdl'] and mode == "train":
vdl_writer_path = os.path.join(self.output_dir, "vdl")
if not os.path.exists(vdl_writer_path):
os.makedirs(vdl_writer_path)
self.vdl_writer = LogWriter(logdir=vdl_writer_path)
# set device
assert self.config["Global"]["device"] in ["cpu", "gpu", "xpu"]
self.device = paddle.set_device(self.config["Global"]["device"])
logger.info('train with paddle {} and device {}'.format(
paddle.__version__, self.device))
# AMP training
self.amp = True if "AMP" in self.config else False
if self.amp and self.config["AMP"] is not None:
self.scale_loss = self.config["AMP"].get("scale_loss", 1.0)
self.use_dynamic_loss_scaling = self.config["AMP"].get(
"use_dynamic_loss_scaling", False)
else:
self.scale_loss = 1.0
self.use_dynamic_loss_scaling = False
if self.amp:
AMP_RELATED_FLAGS_SETTING = {
'FLAGS_cudnn_batchnorm_spatial_persistent': 1,
'FLAGS_max_inplace_grad_add': 8,
}
paddle.fluid.set_flags(AMP_RELATED_FLAGS_SETTING)
# build dataloader
if self.mode == 'train':
self.train_dataloader = build_dataloader(
self.config["DataLoader"], "Train", self.device, self.use_dali)
if self.mode in ["train", "eval"]:
if self.eval_mode == "classification":
self.eval_dataloader = build_dataloader(
self.config["DataLoader"], "Eval", self.device,
self.use_dali)
elif self.eval_mode == "retrieval":
self.gallery_dataloader = build_dataloader(
self.config["DataLoader"]["Eval"], "Gallery", self.device,
self.use_dali)
self.query_dataloader = build_dataloader(
self.config["DataLoader"]["Eval"], "Query", self.device,
self.use_dali)
# build loss
if self.mode == "train":
loss_info = self.config["Loss"]["Train"]
self.train_loss_func = build_loss(loss_info)
if self.mode in ["train", "eval"]:
loss_config = self.config.get("Loss", None)
if loss_config is not None:
loss_config = loss_config.get("Eval")
if loss_config is not None:
self.eval_loss_func = build_loss(loss_config)
else:
self.eval_loss_func = None
else:
self.eval_loss_func = None
# build metric
if self.mode == 'train':
metric_config = self.config.get("Metric")
if metric_config is not None:
metric_config = metric_config.get("Train")
if metric_config is not None:
self.train_metric_func = build_metrics(metric_config)
else:
self.train_metric_func = None
else:
self.train_metric_func = None
if self.mode in ["train", "eval"]:
metric_config = self.config.get("Metric")
if self.eval_mode == "classification":
if metric_config is not None:
metric_config = metric_config.get("Eval")
if metric_config is not None:
self.eval_metric_func = build_metrics(metric_config)
elif self.eval_mode == "retrieval":
if metric_config is None:
metric_config = [{"name": "Recallk", "topk": (1, 5)}]
else:
metric_config = metric_config["Eval"]
self.eval_metric_func = build_metrics(metric_config)
else:
self.eval_metric_func = None
# build model
self.model = build_model(self.config["Arch"])
# set @to_static for benchmark, skip this by default.
apply_to_static(self.config, self.model)
# load_pretrain
if self.config["Global"]["pretrained_model"] is not None:
if self.config["Global"]["pretrained_model"].startswith("http"):
load_dygraph_pretrain_from_url(
self.model, self.config["Global"]["pretrained_model"])
else:
load_dygraph_pretrain(
self.model, self.config["Global"]["pretrained_model"])
# for slim
# build optimizer
if self.mode == 'train':
self.optimizer, self.lr_sch = build_optimizer(
self.config["Optimizer"], self.config["Global"]["epochs"],
len(self.train_dataloader), self.model.parameters())
# for distributed
self.config["Global"][
"distributed"] = paddle.distributed.get_world_size() != 1
if self.config["Global"]["distributed"]:
dist.init_parallel_env()
if self.config["Global"]["distributed"]:
self.model = paddle.DataParallel(self.model)
# build postprocess for infer
if self.mode == 'infer':
self.preprocess_func = create_operators(self.config["Infer"][
"transforms"])
self.postprocess_func = build_postprocess(self.config["Infer"][
"PostProcess"])
def train(self):
assert self.mode == "train"
print_batch_step = self.config['Global']['print_batch_step']
save_interval = self.config["Global"]["save_interval"]
best_metric = {
"metric": 0.0,
"epoch": 0,
}
        # key: metric name; val: AverageMeter object that records the metric
self.output_info = dict()
self.time_info = {
"batch_cost": AverageMeter(
"batch_cost", '.5f', postfix=" s,"),
"reader_cost": AverageMeter(
"reader_cost", ".5f", postfix=" s,"),
}
# global iter counter
self.global_step = 0
if self.config["Global"]["checkpoints"] is not None:
metric_info = init_model(self.config["Global"], self.model,
self.optimizer)
if metric_info is not None:
best_metric.update(metric_info)
# for amp training
if self.amp:
self.scaler = paddle.amp.GradScaler(
init_loss_scaling=self.scale_loss,
use_dynamic_loss_scaling=self.use_dynamic_loss_scaling)
self.max_iter = len(self.train_dataloader) - 1 if platform.system(
) == "Windows" else len(self.train_dataloader)
for epoch_id in range(best_metric["epoch"] + 1,
self.config["Global"]["epochs"] + 1):
acc = 0.0
# for one epoch train
self.train_epoch_func(self, epoch_id, print_batch_step)
if self.use_dali:
self.train_dataloader.reset()
metric_msg = ", ".join([
"{}: {:.5f}".format(key, self.output_info[key].avg)
for key in self.output_info
])
logger.info("[Train][Epoch {}/{}][Avg]{}".format(
epoch_id, self.config["Global"]["epochs"], metric_msg))
self.output_info.clear()
# eval model and save model if possible
if self.config["Global"][
"eval_during_train"] and epoch_id % self.config["Global"][
"eval_interval"] == 0:
acc = self.eval(epoch_id)
if acc > best_metric["metric"]:
best_metric["metric"] = acc
best_metric["epoch"] = epoch_id
save_load.save_model(
self.model,
self.optimizer,
best_metric,
self.output_dir,
model_name=self.config["Arch"]["name"],
prefix="best_model")
logger.info("[Eval][Epoch {}][best metric: {}]".format(
epoch_id, best_metric["metric"]))
logger.scaler(
name="eval_acc",
value=acc,
step=epoch_id,
writer=self.vdl_writer)
self.model.train()
# save model
if epoch_id % save_interval == 0:
save_load.save_model(
self.model,
self.optimizer, {"metric": acc,
"epoch": epoch_id},
self.output_dir,
model_name=self.config["Arch"]["name"],
prefix="epoch_{}".format(epoch_id))
# save the latest model
save_load.save_model(
self.model,
self.optimizer, {"metric": acc,
"epoch": epoch_id},
self.output_dir,
model_name=self.config["Arch"]["name"],
prefix="latest")
if self.vdl_writer is not None:
self.vdl_writer.close()
@paddle.no_grad()
def eval(self, epoch_id=0):
assert self.mode in ["train", "eval"]
self.model.eval()
eval_result = self.eval_func(self, epoch_id)
self.model.train()
return eval_result
@paddle.no_grad()
def infer(self):
assert self.mode == "infer" and self.eval_mode == "classification"
total_trainer = paddle.distributed.get_world_size()
local_rank = paddle.distributed.get_rank()
image_list = get_image_list(self.config["Infer"]["infer_imgs"])
# data split
image_list = image_list[local_rank::total_trainer]
batch_size = self.config["Infer"]["batch_size"]
self.model.eval()
batch_data = []
image_file_list = []
for idx, image_file in enumerate(image_list):
with open(image_file, 'rb') as f:
x = f.read()
for process in self.preprocess_func:
x = process(x)
batch_data.append(x)
image_file_list.append(image_file)
if len(batch_data) >= batch_size or idx == len(image_list) - 1:
batch_tensor = paddle.to_tensor(batch_data)
out = self.model(batch_tensor)
if isinstance(out, list):
out = out[0]
result = self.postprocess_func(out, image_file_list)
print(result)
batch_data.clear()
image_file_list.clear()
def export(self):
assert self.mode == "export"
model = ExportModel(self.config["Arch"], self.model)
if self.config["Global"]["pretrained_model"] is not None:
load_dygraph_pretrain(model.base_model,
self.config["Global"]["pretrained_model"])
model.eval()
model = paddle.jit.to_static(
model,
input_spec=[
paddle.static.InputSpec(
shape=[None] + self.config["Global"]["image_shape"],
dtype='float32')
])
paddle.jit.save(
model,
os.path.join(self.config["Global"]["save_inference_dir"],
"inference"))
class ExportModel(nn.Layer):
"""
ExportModel: add softmax onto the model
"""
def __init__(self, config, model):
super().__init__()
self.base_model = model
# we should choose a final model to export
if isinstance(self.base_model, DistillationModel):
self.infer_model_name = config["infer_model_name"]
else:
self.infer_model_name = None
self.infer_output_key = config.get("infer_output_key", None)
if self.infer_output_key == "features" and isinstance(self.base_model,
RecModel):
self.base_model.head = IdentityHead()
if config.get("infer_add_softmax", True):
self.softmax = nn.Softmax(axis=-1)
else:
self.softmax = None
def eval(self):
self.training = False
for layer in self.sublayers():
layer.training = False
layer.eval()
def forward(self, x):
x = self.base_model(x)
if isinstance(x, list):
x = x[0]
if self.infer_model_name is not None:
x = x[self.infer_model_name]
if self.infer_output_key is not None:
x = x[self.infer_output_key]
if self.softmax is not None:
x = self.softmax(x)
return x
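`ExportModel.forward` unwraps the model output step by step: take the first element of a list output, index into a distillation sub-model by name, pick an output key for rec models, and finally apply softmax. A minimal pure-Python sketch of that unwrapping order follows (no Paddle required); `unwrap_output` and the sample dict are illustrative names, not part of PaddleClas:

```python
import math

def unwrap_output(x, infer_model_name=None, infer_output_key=None,
                  add_softmax=False):
    """Mimic the unwrapping order in ExportModel.forward (illustrative only)."""
    if isinstance(x, list):           # multi-output models return a list
        x = x[0]
    if infer_model_name is not None:  # distillation models return a dict of sub-models
        x = x[infer_model_name]
    if infer_output_key is not None:  # rec models return a dict such as {"features": ...}
        x = x[infer_output_key]
    if add_softmax:                   # turn logits into probabilities
        exps = [math.exp(v) for v in x]
        total = sum(exps)
        x = [v / total for v in exps]
    return x

# e.g. a distillation-style output: list -> dict of sub-models -> dict of keys
raw = [{"Student": {"logits": [2.0, 1.0, 0.0]}}]
probs = unwrap_output(raw, infer_model_name="Student",
                      infer_output_key="logits", add_softmax=True)
```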

@@ -0,0 +1,16 @@
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from ppcls.engine.evaluation.classification import classification_eval
from ppcls.engine.evaluation.retrieval import retrieval_eval

@@ -0,0 +1,114 @@
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import time
import platform
import paddle
from ppcls.utils.misc import AverageMeter
from ppcls.utils import logger
def classification_eval(evaler, epoch_id=0):
output_info = dict()
time_info = {
"batch_cost": AverageMeter(
"batch_cost", '.5f', postfix=" s,"),
"reader_cost": AverageMeter(
"reader_cost", ".5f", postfix=" s,"),
}
print_batch_step = evaler.config["Global"]["print_batch_step"]
metric_key = None
tic = time.time()
eval_dataloader = evaler.eval_dataloader if evaler.use_dali else evaler.eval_dataloader(
)
max_iter = len(evaler.eval_dataloader) - 1 if platform.system(
) == "Windows" else len(evaler.eval_dataloader)
for iter_id, batch in enumerate(eval_dataloader):
if iter_id >= max_iter:
break
if iter_id == 5:
for key in time_info:
time_info[key].reset()
if evaler.use_dali:
batch = [
paddle.to_tensor(batch[0]['data']),
paddle.to_tensor(batch[0]['label'])
]
time_info["reader_cost"].update(time.time() - tic)
batch_size = batch[0].shape[0]
batch[0] = paddle.to_tensor(batch[0]).astype("float32")
batch[1] = batch[1].reshape([-1, 1]).astype("int64")
# image input
out = evaler.model(batch[0])
# calc loss
if evaler.eval_loss_func is not None:
loss_dict = evaler.eval_loss_func(out, batch[1])
for key in loss_dict:
if key not in output_info:
output_info[key] = AverageMeter(key, '7.5f')
output_info[key].update(loss_dict[key].numpy()[0], batch_size)
# calc metric
if evaler.eval_metric_func is not None:
metric_dict = evaler.eval_metric_func(out, batch[1])
if paddle.distributed.get_world_size() > 1:
for key in metric_dict:
paddle.distributed.all_reduce(
metric_dict[key], op=paddle.distributed.ReduceOp.SUM)
metric_dict[key] = metric_dict[
key] / paddle.distributed.get_world_size()
for key in metric_dict:
if metric_key is None:
metric_key = key
if key not in output_info:
output_info[key] = AverageMeter(key, '7.5f')
output_info[key].update(metric_dict[key].numpy()[0],
batch_size)
time_info["batch_cost"].update(time.time() - tic)
if iter_id % print_batch_step == 0:
time_msg = "s, ".join([
"{}: {:.5f}".format(key, time_info[key].avg)
for key in time_info
])
ips_msg = "ips: {:.5f} images/sec".format(
batch_size / time_info["batch_cost"].avg)
metric_msg = ", ".join([
"{}: {:.5f}".format(key, output_info[key].val)
for key in output_info
])
logger.info("[Eval][Epoch {}][Iter: {}/{}]{}, {}, {}".format(
epoch_id, iter_id,
len(evaler.eval_dataloader), metric_msg, time_msg, ips_msg))
tic = time.time()
if evaler.use_dali:
evaler.eval_dataloader.reset()
metric_msg = ", ".join([
"{}: {:.5f}".format(key, output_info[key].avg) for key in output_info
])
logger.info("[Eval][Epoch {}][Avg]{}".format(epoch_id, metric_msg))
    # no eval metric configured, so skip best-model tracking and return -1
if evaler.eval_metric_func is None:
return -1
# return 1st metric in the dict
return output_info[metric_key].avg
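The loss and metric values above are accumulated in `AverageMeter` objects, where each `update(val, batch_size)` weights the value by the batch size so the final `avg` is a per-sample average across uneven batches. A minimal sketch of those semantics, using an illustrative stand-in class rather than the real `ppcls.utils.misc.AverageMeter`:

```python
class RunningMeter:
    """Sketch of the AverageMeter behavior assumed by classification_eval."""
    def __init__(self):
        self.val = 0.0   # most recent value
        self.sum = 0.0   # running sum weighted by sample count
        self.count = 0   # total samples seen

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n
        self.count += n

    @property
    def avg(self):
        # per-sample average, robust to a smaller final batch
        return self.sum / self.count if self.count else 0.0

m = RunningMeter()
m.update(0.9, 32)   # batch of 32 with metric 0.9
m.update(0.5, 16)   # smaller final batch with metric 0.5
```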

@@ -0,0 +1,163 @@
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import platform
import paddle
from ppcls.utils import logger
def retrieval_eval(evaler, epoch_id=0):
evaler.model.eval()
# step1. build gallery
gallery_feas, gallery_img_id, gallery_unique_id = cal_feature(
evaler, name='gallery')
query_feas, query_img_id, query_query_id = cal_feature(
evaler, name='query')
# step2. do evaluation
sim_block_size = evaler.config["Global"].get("sim_block_size", 64)
sections = [sim_block_size] * (len(query_feas) // sim_block_size)
if len(query_feas) % sim_block_size:
sections.append(len(query_feas) % sim_block_size)
fea_blocks = paddle.split(query_feas, num_or_sections=sections)
if query_query_id is not None:
query_id_blocks = paddle.split(
query_query_id, num_or_sections=sections)
image_id_blocks = paddle.split(query_img_id, num_or_sections=sections)
metric_key = None
if evaler.eval_loss_func is None:
metric_dict = {metric_key: 0.}
else:
metric_dict = dict()
for block_idx, block_fea in enumerate(fea_blocks):
similarity_matrix = paddle.matmul(
block_fea, gallery_feas, transpose_y=True)
if query_query_id is not None:
query_id_block = query_id_blocks[block_idx]
query_id_mask = (query_id_block != gallery_unique_id.t())
image_id_block = image_id_blocks[block_idx]
image_id_mask = (image_id_block != gallery_img_id.t())
keep_mask = paddle.logical_or(query_id_mask, image_id_mask)
similarity_matrix = similarity_matrix * keep_mask.astype(
"float32")
else:
keep_mask = None
metric_tmp = evaler.eval_metric_func(similarity_matrix,
image_id_blocks[block_idx],
gallery_img_id, keep_mask)
for key in metric_tmp:
if key not in metric_dict:
metric_dict[key] = metric_tmp[key] * block_fea.shape[
0] / len(query_feas)
else:
metric_dict[key] += metric_tmp[key] * block_fea.shape[
0] / len(query_feas)
metric_info_list = []
for key in metric_dict:
if metric_key is None:
metric_key = key
metric_info_list.append("{}: {:.5f}".format(key, metric_dict[key]))
metric_msg = ", ".join(metric_info_list)
logger.info("[Eval][Epoch {}][Avg]{}".format(epoch_id, metric_msg))
return metric_dict[metric_key]
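`retrieval_eval` splits the query features into blocks of `sim_block_size` so the similarity matrix never has to be materialized all at once: full blocks of the configured size plus one remainder block. The section computation can be sketched in pure Python (the helper name is illustrative):

```python
def split_sections(total, block_size):
    """Compute the chunk sizes retrieval_eval feeds to paddle.split:
    full blocks of block_size, plus one remainder block if needed."""
    sections = [block_size] * (total // block_size)
    if total % block_size:
        sections.append(total % block_size)
    return sections

sections = split_sections(130, 64)  # 130 query features, default block size 64
```

Each block's metric is then merged back with weight `block_size / total`, which is why the per-block contributions in the loop above sum to the overall average.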
def cal_feature(evaler, name='gallery'):
all_feas = None
all_image_id = None
all_unique_id = None
has_unique_id = False
if name == 'gallery':
dataloader = evaler.gallery_dataloader
elif name == 'query':
dataloader = evaler.query_dataloader
else:
        raise RuntimeError("Only the gallery or query dataset is supported")
max_iter = len(dataloader) - 1 if platform.system() == "Windows" else len(
dataloader)
dataloader_tmp = dataloader if evaler.use_dali else dataloader()
for idx, batch in enumerate(dataloader_tmp): # load is very time-consuming
if idx >= max_iter:
break
if idx % evaler.config["Global"]["print_batch_step"] == 0:
logger.info(
f"{name} feature calculation process: [{idx}/{len(dataloader)}]"
)
if evaler.use_dali:
batch = [
paddle.to_tensor(batch[0]['data']),
paddle.to_tensor(batch[0]['label'])
]
batch = [paddle.to_tensor(x) for x in batch]
batch[1] = batch[1].reshape([-1, 1]).astype("int64")
if len(batch) == 3:
has_unique_id = True
batch[2] = batch[2].reshape([-1, 1]).astype("int64")
out = evaler.model(batch[0], batch[1])
batch_feas = out["features"]
# do norm
if evaler.config["Global"].get("feature_normalize", True):
feas_norm = paddle.sqrt(
paddle.sum(paddle.square(batch_feas), axis=1, keepdim=True))
batch_feas = paddle.divide(batch_feas, feas_norm)
# do binarize
if evaler.config["Global"].get("feature_binarize") == "round":
batch_feas = paddle.round(batch_feas).astype("float32") * 2.0 - 1.0
if evaler.config["Global"].get("feature_binarize") == "sign":
batch_feas = paddle.sign(batch_feas).astype("float32")
if all_feas is None:
all_feas = batch_feas
if has_unique_id:
all_unique_id = batch[2]
all_image_id = batch[1]
else:
all_feas = paddle.concat([all_feas, batch_feas])
all_image_id = paddle.concat([all_image_id, batch[1]])
if has_unique_id:
all_unique_id = paddle.concat([all_unique_id, batch[2]])
if evaler.use_dali:
dataloader_tmp.reset()
if paddle.distributed.get_world_size() > 1:
feat_list = []
img_id_list = []
unique_id_list = []
paddle.distributed.all_gather(feat_list, all_feas)
paddle.distributed.all_gather(img_id_list, all_image_id)
all_feas = paddle.concat(feat_list, axis=0)
all_image_id = paddle.concat(img_id_list, axis=0)
if has_unique_id:
paddle.distributed.all_gather(unique_id_list, all_unique_id)
all_unique_id = paddle.concat(unique_id_list, axis=0)
logger.info("Build {} done, all feat shape: {}, begin to eval..".format(
name, all_feas.shape))
return all_feas, all_image_id, all_unique_id
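When `Global.feature_normalize` is enabled, `cal_feature` divides each feature row by its L2 norm so that the later `paddle.matmul` produces cosine similarities. A pure-Python sketch of that per-vector normalization (illustrative helper, not the PaddleClas API):

```python
import math

def l2_normalize(vec):
    """Divide a feature vector by its L2 norm, as cal_feature does
    when Global.feature_normalize is on."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]

unit = l2_normalize([3.0, 4.0])  # norm is 5.0
```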

@@ -0,0 +1,14 @@
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from ppcls.engine.train.train import train_epoch

@@ -0,0 +1,85 @@
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import, division, print_function
import time
import paddle
from ppcls.engine.train.utils import update_loss, update_metric, log_info
def train_epoch(trainer, epoch_id, print_batch_step):
tic = time.time()
train_dataloader = trainer.train_dataloader if trainer.use_dali else trainer.train_dataloader(
)
for iter_id, batch in enumerate(train_dataloader):
if iter_id >= trainer.max_iter:
break
if iter_id == 5:
for key in trainer.time_info:
trainer.time_info[key].reset()
trainer.time_info["reader_cost"].update(time.time() - tic)
if trainer.use_dali:
batch = [
paddle.to_tensor(batch[0]['data']),
paddle.to_tensor(batch[0]['label'])
]
batch_size = batch[0].shape[0]
batch[1] = batch[1].reshape([-1, 1]).astype("int64")
trainer.global_step += 1
# image input
if trainer.amp:
with paddle.amp.auto_cast(custom_black_list={
"flatten_contiguous_range", "greater_than"
}):
out = forward(trainer, batch)
loss_dict = trainer.train_loss_func(out, batch[1])
else:
out = forward(trainer, batch)
# calc loss
if trainer.config["DataLoader"]["Train"]["dataset"].get(
"batch_transform_ops", None):
loss_dict = trainer.train_loss_func(out, batch[1:])
else:
loss_dict = trainer.train_loss_func(out, batch[1])
# step opt and lr
if trainer.amp:
scaled = trainer.scaler.scale(loss_dict["loss"])
scaled.backward()
trainer.scaler.minimize(trainer.optimizer, scaled)
else:
loss_dict["loss"].backward()
trainer.optimizer.step()
trainer.optimizer.clear_grad()
trainer.lr_sch.step()
# below code just for logging
# update metric_for_logger
update_metric(trainer, out, batch, batch_size)
# update_loss_for_logger
update_loss(trainer, loss_dict, batch_size)
trainer.time_info["batch_cost"].update(time.time() - tic)
if iter_id % print_batch_step == 0:
log_info(trainer, batch_size, epoch_id, iter_id)
tic = time.time()
def forward(trainer, batch):
if trainer.eval_mode == "classification":
return trainer.model(batch[0])
else:
return trainer.model(batch[0], batch[1])

@@ -0,0 +1,72 @@
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import, division, print_function
import datetime
from ppcls.utils import logger
from ppcls.utils.misc import AverageMeter
def update_metric(trainer, out, batch, batch_size):
# calc metric
if trainer.train_metric_func is not None:
metric_dict = trainer.train_metric_func(out, batch[-1])
for key in metric_dict:
if key not in trainer.output_info:
trainer.output_info[key] = AverageMeter(key, '7.5f')
trainer.output_info[key].update(metric_dict[key].numpy()[0],
batch_size)
def update_loss(trainer, loss_dict, batch_size):
# update_output_info
for key in loss_dict:
if key not in trainer.output_info:
trainer.output_info[key] = AverageMeter(key, '7.5f')
trainer.output_info[key].update(loss_dict[key].numpy()[0], batch_size)
def log_info(trainer, batch_size, epoch_id, iter_id):
lr_msg = "lr: {:.5f}".format(trainer.lr_sch.get_lr())
metric_msg = ", ".join([
"{}: {:.5f}".format(key, trainer.output_info[key].avg)
for key in trainer.output_info
])
time_msg = "s, ".join([
"{}: {:.5f}".format(key, trainer.time_info[key].avg)
for key in trainer.time_info
])
ips_msg = "ips: {:.5f} images/sec".format(
batch_size / trainer.time_info["batch_cost"].avg)
eta_sec = ((trainer.config["Global"]["epochs"] - epoch_id + 1
) * len(trainer.train_dataloader) - iter_id
) * trainer.time_info["batch_cost"].avg
eta_msg = "eta: {:s}".format(str(datetime.timedelta(seconds=int(eta_sec))))
logger.info("[Train][Epoch {}/{}][Iter: {}/{}]{}, {}, {}, {}, {}".format(
epoch_id, trainer.config["Global"]["epochs"], iter_id,
len(trainer.train_dataloader), lr_msg, metric_msg, time_msg, ips_msg,
eta_msg))
logger.scaler(
name="lr",
value=trainer.lr_sch.get_lr(),
step=trainer.global_step,
writer=trainer.vdl_writer)
for key in trainer.output_info:
logger.scaler(
name="train_{}".format(key),
value=trainer.output_info[key].avg,
step=trainer.global_step,
writer=trainer.vdl_writer)
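The ETA printed by `log_info` counts the steps left in the current and all remaining epochs and multiplies by the average batch cost. The formula can be isolated as a small testable function (the name `eta_string` is illustrative):

```python
import datetime

def eta_string(epochs, epoch_id, steps_per_epoch, iter_id, avg_batch_cost):
    """Remaining-time estimate used by log_info: steps left across the
    current and future epochs, times the average cost of one step."""
    steps_left = (epochs - epoch_id + 1) * steps_per_epoch - iter_id
    eta_sec = steps_left * avg_batch_cost
    return str(datetime.timedelta(seconds=int(eta_sec)))

# first iteration of epoch 1 of 10, 100 steps/epoch, 0.5 s per step
eta = eta_string(epochs=10, epoch_id=1, steps_per_epoch=100, iter_id=0,
                 avg_batch_cost=0.5)
```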

@@ -1,662 +0,0 @@
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import sys
import numpy as np
__dir__ = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.abspath(os.path.join(__dir__, '../../')))
import time
import platform
import datetime
import argparse
import paddle
import paddle.nn as nn
import paddle.distributed as dist
from visualdl import LogWriter
from ppcls.utils.check import check_gpu
from ppcls.utils.misc import AverageMeter
from ppcls.utils import logger
from ppcls.utils.logger import init_logger
from ppcls.utils.config import print_config
from ppcls.data import build_dataloader
from ppcls.arch import build_model
from ppcls.arch import apply_to_static
from ppcls.loss import build_loss
from ppcls.metric import build_metrics
from ppcls.optimizer import build_optimizer
from ppcls.utils.save_load import load_dygraph_pretrain, load_dygraph_pretrain_from_url
from ppcls.utils.save_load import init_model
from ppcls.utils import save_load
from ppcls.data.utils.get_image_list import get_image_list
from ppcls.data.postprocess import build_postprocess
from ppcls.data import create_operators
class Trainer(object):
def __init__(self, config, mode="train"):
self.mode = mode
self.config = config
self.output_dir = self.config['Global']['output_dir']
log_file = os.path.join(self.output_dir, self.config["Arch"]["name"],
f"{mode}.log")
init_logger(name='root', log_file=log_file)
print_config(config)
# set device
assert self.config["Global"]["device"] in ["cpu", "gpu", "xpu"]
self.device = paddle.set_device(self.config["Global"]["device"])
# set dist
self.config["Global"][
"distributed"] = paddle.distributed.get_world_size() != 1
if self.config["Global"]["distributed"]:
dist.init_parallel_env()
if "Head" in self.config["Arch"]:
self.is_rec = True
else:
self.is_rec = False
self.model = build_model(self.config["Arch"])
# set @to_static for benchmark, skip this by default.
apply_to_static(self.config, self.model)
if self.config["Global"]["pretrained_model"] is not None:
if self.config["Global"]["pretrained_model"].startswith("http"):
load_dygraph_pretrain_from_url(
self.model, self.config["Global"]["pretrained_model"])
else:
load_dygraph_pretrain(
self.model, self.config["Global"]["pretrained_model"])
if self.config["Global"]["distributed"]:
self.model = paddle.DataParallel(self.model)
self.vdl_writer = None
if self.config['Global']['use_visualdl'] and mode == "train":
vdl_writer_path = os.path.join(self.output_dir, "vdl")
if not os.path.exists(vdl_writer_path):
os.makedirs(vdl_writer_path)
self.vdl_writer = LogWriter(logdir=vdl_writer_path)
logger.info('train with paddle {} and device {}'.format(
paddle.__version__, self.device))
# init members
self.train_dataloader = None
self.eval_dataloader = None
self.gallery_dataloader = None
self.query_dataloader = None
self.eval_mode = self.config["Global"].get("eval_mode",
"classification")
self.amp = True if "AMP" in self.config else False
if self.amp and self.config["AMP"] is not None:
self.scale_loss = self.config["AMP"].get("scale_loss", 1.0)
self.use_dynamic_loss_scaling = self.config["AMP"].get(
"use_dynamic_loss_scaling", False)
else:
self.scale_loss = 1.0
self.use_dynamic_loss_scaling = False
if self.amp:
AMP_RELATED_FLAGS_SETTING = {
'FLAGS_cudnn_batchnorm_spatial_persistent': 1,
'FLAGS_max_inplace_grad_add': 8,
}
paddle.fluid.set_flags(AMP_RELATED_FLAGS_SETTING)
self.train_loss_func = None
self.eval_loss_func = None
self.train_metric_func = None
self.eval_metric_func = None
self.use_dali = self.config['Global'].get("use_dali", False)
def train(self):
# build train loss and metric info
if self.train_loss_func is None:
loss_info = self.config["Loss"]["Train"]
self.train_loss_func = build_loss(loss_info)
if self.train_metric_func is None:
metric_config = self.config.get("Metric")
if metric_config is not None:
metric_config = metric_config.get("Train")
if metric_config is not None:
self.train_metric_func = build_metrics(metric_config)
if self.train_dataloader is None:
self.train_dataloader = build_dataloader(
self.config["DataLoader"], "Train", self.device, self.use_dali)
step_each_epoch = len(self.train_dataloader)
optimizer, lr_sch = build_optimizer(self.config["Optimizer"],
self.config["Global"]["epochs"],
step_each_epoch,
self.model.parameters())
print_batch_step = self.config['Global']['print_batch_step']
save_interval = self.config["Global"]["save_interval"]
best_metric = {
"metric": 0.0,
"epoch": 0,
}
# key:
# val: metrics list word
output_info = dict()
time_info = {
"batch_cost": AverageMeter(
"batch_cost", '.5f', postfix=" s,"),
"reader_cost": AverageMeter(
"reader_cost", ".5f", postfix=" s,"),
}
# global iter counter
global_step = 0
if self.config["Global"]["checkpoints"] is not None:
metric_info = init_model(self.config["Global"], self.model,
optimizer)
if metric_info is not None:
best_metric.update(metric_info)
# for amp training
if self.amp:
scaler = paddle.amp.GradScaler(
init_loss_scaling=self.scale_loss,
use_dynamic_loss_scaling=self.use_dynamic_loss_scaling)
tic = time.time()
max_iter = len(self.train_dataloader) - 1 if platform.system(
) == "Windows" else len(self.train_dataloader)
for epoch_id in range(best_metric["epoch"] + 1,
self.config["Global"]["epochs"] + 1):
acc = 0.0
train_dataloader = self.train_dataloader if self.use_dali else self.train_dataloader(
)
for iter_id, batch in enumerate(train_dataloader):
if iter_id >= max_iter:
break
if iter_id == 5:
for key in time_info:
time_info[key].reset()
time_info["reader_cost"].update(time.time() - tic)
if self.use_dali:
batch = [
paddle.to_tensor(batch[0]['data']),
paddle.to_tensor(batch[0]['label'])
]
batch_size = batch[0].shape[0]
batch[1] = batch[1].reshape([-1, 1]).astype("int64")
global_step += 1
# image input
if self.amp:
with paddle.amp.auto_cast(custom_black_list={
"flatten_contiguous_range", "greater_than"
}):
out = self.forward(batch)
loss_dict = self.train_loss_func(out, batch[1])
else:
out = self.forward(batch)
# calc loss
if self.config["DataLoader"]["Train"]["dataset"].get(
"batch_transform_ops", None):
loss_dict = self.train_loss_func(out, batch[1:])
else:
loss_dict = self.train_loss_func(out, batch[1])
for key in loss_dict:
if not key in output_info:
output_info[key] = AverageMeter(key, '7.5f')
output_info[key].update(loss_dict[key].numpy()[0],
batch_size)
# calc metric
if self.train_metric_func is not None:
metric_dict = self.train_metric_func(out, batch[-1])
for key in metric_dict:
if not key in output_info:
output_info[key] = AverageMeter(key, '7.5f')
output_info[key].update(metric_dict[key].numpy()[0],
batch_size)
# step opt and lr
if self.amp:
scaled = scaler.scale(loss_dict["loss"])
scaled.backward()
scaler.minimize(optimizer, scaled)
else:
loss_dict["loss"].backward()
optimizer.step()
optimizer.clear_grad()
lr_sch.step()
time_info["batch_cost"].update(time.time() - tic)
if iter_id % print_batch_step == 0:
lr_msg = "lr: {:.5f}".format(lr_sch.get_lr())
metric_msg = ", ".join([
"{}: {:.5f}".format(key, output_info[key].avg)
for key in output_info
])
time_msg = "s, ".join([
"{}: {:.5f}".format(key, time_info[key].avg)
for key in time_info
])
ips_msg = "ips: {:.5f} images/sec".format(
batch_size / time_info["batch_cost"].avg)
eta_sec = ((self.config["Global"]["epochs"] - epoch_id + 1
) * len(self.train_dataloader) - iter_id
) * time_info["batch_cost"].avg
eta_msg = "eta: {:s}".format(
str(datetime.timedelta(seconds=int(eta_sec))))
logger.info(
"[Train][Epoch {}/{}][Iter: {}/{}]{}, {}, {}, {}, {}".
format(epoch_id, self.config["Global"][
"epochs"], iter_id,
len(self.train_dataloader), lr_msg, metric_msg,
time_msg, ips_msg, eta_msg))
logger.scaler(
name="lr",
value=lr_sch.get_lr(),
step=global_step,
writer=self.vdl_writer)
for key in output_info:
logger.scaler(
name="train_{}".format(key),
value=output_info[key].avg,
step=global_step,
writer=self.vdl_writer)
tic = time.time()
if self.use_dali:
self.train_dataloader.reset()
metric_msg = ", ".join([
"{}: {:.5f}".format(key, output_info[key].avg)
for key in output_info
])
logger.info("[Train][Epoch {}/{}][Avg]{}".format(
epoch_id, self.config["Global"]["epochs"], metric_msg))
output_info.clear()
# eval model and save model if possible
if self.config["Global"][
"eval_during_train"] and epoch_id % self.config["Global"][
"eval_interval"] == 0:
acc = self.eval(epoch_id)
if acc > best_metric["metric"]:
best_metric["metric"] = acc
best_metric["epoch"] = epoch_id
save_load.save_model(
self.model,
optimizer,
best_metric,
self.output_dir,
model_name=self.config["Arch"]["name"],
prefix="best_model")
logger.info("[Eval][Epoch {}][best metric: {}]".format(
epoch_id, best_metric["metric"]))
logger.scaler(
name="eval_acc",
value=acc,
step=epoch_id,
writer=self.vdl_writer)
self.model.train()
# save model
if epoch_id % save_interval == 0:
save_load.save_model(
self.model,
optimizer, {"metric": acc,
"epoch": epoch_id},
self.output_dir,
model_name=self.config["Arch"]["name"],
prefix="epoch_{}".format(epoch_id))
# save the latest model
save_load.save_model(
self.model,
optimizer, {"metric": acc,
"epoch": epoch_id},
self.output_dir,
model_name=self.config["Arch"]["name"],
prefix="latest")
if self.vdl_writer is not None:
self.vdl_writer.close()
def build_avg_metrics(self, info_dict):
return {key: AverageMeter(key, '7.5f') for key in info_dict}
@paddle.no_grad()
def eval(self, epoch_id=0):
self.model.eval()
if self.eval_loss_func is None:
loss_config = self.config.get("Loss", None)
if loss_config is not None:
loss_config = loss_config.get("Eval")
if loss_config is not None:
self.eval_loss_func = build_loss(loss_config)
if self.eval_mode == "classification":
if self.eval_dataloader is None:
self.eval_dataloader = build_dataloader(
self.config["DataLoader"], "Eval", self.device,
self.use_dali)
if self.eval_metric_func is None:
metric_config = self.config.get("Metric")
if metric_config is not None:
metric_config = metric_config.get("Eval")
if metric_config is not None:
self.eval_metric_func = build_metrics(metric_config)
eval_result = self.eval_cls(epoch_id)
elif self.eval_mode == "retrieval":
if self.gallery_dataloader is None:
self.gallery_dataloader = build_dataloader(
self.config["DataLoader"]["Eval"], "Gallery", self.device,
self.use_dali)
if self.query_dataloader is None:
self.query_dataloader = build_dataloader(
self.config["DataLoader"]["Eval"], "Query", self.device,
self.use_dali)
# build metric info
if self.eval_metric_func is None:
metric_config = self.config.get("Metric", None)
if metric_config is None:
metric_config = [{"name": "Recallk", "topk": (1, 5)}]
else:
metric_config = metric_config["Eval"]
self.eval_metric_func = build_metrics(metric_config)
eval_result = self.eval_retrieval(epoch_id)
else:
logger.warning("Invalid eval mode: {}".format(self.eval_mode))
eval_result = None
self.model.train()
return eval_result
def forward(self, batch):
if not self.is_rec:
out = self.model(batch[0])
else:
out = self.model(batch[0], batch[1])
return out
@paddle.no_grad()
def eval_cls(self, epoch_id=0):
output_info = dict()
time_info = {
"batch_cost": AverageMeter(
"batch_cost", '.5f', postfix=" s,"),
"reader_cost": AverageMeter(
"reader_cost", ".5f", postfix=" s,"),
}
print_batch_step = self.config["Global"]["print_batch_step"]
metric_key = None
tic = time.time()
eval_dataloader = self.eval_dataloader if self.use_dali else self.eval_dataloader(
)
max_iter = len(self.eval_dataloader) - 1 if platform.system(
) == "Windows" else len(self.eval_dataloader)
for iter_id, batch in enumerate(eval_dataloader):
if iter_id >= max_iter:
break
if iter_id == 5:
for key in time_info:
time_info[key].reset()
if self.use_dali:
batch = [
paddle.to_tensor(batch[0]['data']),
paddle.to_tensor(batch[0]['label'])
]
time_info["reader_cost"].update(time.time() - tic)
batch_size = batch[0].shape[0]
batch[0] = paddle.to_tensor(batch[0]).astype("float32")
batch[1] = batch[1].reshape([-1, 1]).astype("int64")
# image input
out = self.forward(batch)
# calc loss
if self.eval_loss_func is not None:
loss_dict = self.eval_loss_func(out, batch[-1])
for key in loss_dict:
if key not in output_info:
output_info[key] = AverageMeter(key, '7.5f')
output_info[key].update(loss_dict[key].numpy()[0],
batch_size)
# calc metric
if self.eval_metric_func is not None:
metric_dict = self.eval_metric_func(out, batch[-1])
if paddle.distributed.get_world_size() > 1:
for key in metric_dict:
paddle.distributed.all_reduce(
metric_dict[key],
op=paddle.distributed.ReduceOp.SUM)
metric_dict[key] = metric_dict[
key] / paddle.distributed.get_world_size()
for key in metric_dict:
if metric_key is None:
metric_key = key
if key not in output_info:
output_info[key] = AverageMeter(key, '7.5f')
output_info[key].update(metric_dict[key].numpy()[0],
batch_size)
time_info["batch_cost"].update(time.time() - tic)
if iter_id % print_batch_step == 0:
time_msg = "s, ".join([
"{}: {:.5f}".format(key, time_info[key].avg)
for key in time_info
])
ips_msg = "ips: {:.5f} images/sec".format(
batch_size / time_info["batch_cost"].avg)
metric_msg = ", ".join([
"{}: {:.5f}".format(key, output_info[key].val)
for key in output_info
])
logger.info("[Eval][Epoch {}][Iter: {}/{}]{}, {}, {}".format(
epoch_id, iter_id,
len(self.eval_dataloader), metric_msg, time_msg, ips_msg))
tic = time.time()
if self.use_dali:
self.eval_dataloader.reset()
metric_msg = ", ".join([
"{}: {:.5f}".format(key, output_info[key].avg)
for key in output_info
])
logger.info("[Eval][Epoch {}][Avg]{}".format(epoch_id, metric_msg))
# do not try to save best model
if self.eval_metric_func is None:
return -1
# return 1st metric in the dict
return output_info[metric_key].avg
def eval_retrieval(self, epoch_id=0):
self.model.eval()
# step1. build gallery
gallery_feas, gallery_img_id, gallery_unique_id = self._cal_feature(
name='gallery')
query_feas, query_img_id, query_query_id = self._cal_feature(
name='query')
# step2. do evaluation
sim_block_size = self.config["Global"].get("sim_block_size", 64)
sections = [sim_block_size] * (len(query_feas) // sim_block_size)
if len(query_feas) % sim_block_size:
sections.append(len(query_feas) % sim_block_size)
fea_blocks = paddle.split(query_feas, num_or_sections=sections)
if query_query_id is not None:
query_id_blocks = paddle.split(
query_query_id, num_or_sections=sections)
image_id_blocks = paddle.split(query_img_id, num_or_sections=sections)
metric_key = None
if self.eval_metric_func is None:
metric_dict = {metric_key: 0.}
else:
metric_dict = dict()
for block_idx, block_fea in enumerate(fea_blocks):
similarity_matrix = paddle.matmul(
block_fea, gallery_feas, transpose_y=True)
if query_query_id is not None:
query_id_block = query_id_blocks[block_idx]
query_id_mask = (query_id_block != gallery_unique_id.t())
image_id_block = image_id_blocks[block_idx]
image_id_mask = (image_id_block != gallery_img_id.t())
keep_mask = paddle.logical_or(query_id_mask, image_id_mask)
similarity_matrix = similarity_matrix * keep_mask.astype(
"float32")
else:
keep_mask = None
metric_tmp = self.eval_metric_func(similarity_matrix,
image_id_blocks[block_idx],
gallery_img_id, keep_mask)
for key in metric_tmp:
if key not in metric_dict:
metric_dict[key] = metric_tmp[key] * block_fea.shape[
0] / len(query_feas)
else:
metric_dict[key] += metric_tmp[key] * block_fea.shape[
0] / len(query_feas)
metric_info_list = []
for key in metric_dict:
if metric_key is None:
metric_key = key
metric_info_list.append("{}: {:.5f}".format(key, metric_dict[key]))
metric_msg = ", ".join(metric_info_list)
logger.info("[Eval][Epoch {}][Avg]{}".format(epoch_id, metric_msg))
return metric_dict[metric_key]
def _cal_feature(self, name='gallery'):
all_feas = None
all_image_id = None
all_unique_id = None
if name == 'gallery':
dataloader = self.gallery_dataloader
elif name == 'query':
dataloader = self.query_dataloader
else:
raise RuntimeError("Only the gallery or query dataset is supported")
has_unique_id = False
max_iter = len(dataloader) - 1 if platform.system(
) == "Windows" else len(dataloader)
dataloader_tmp = dataloader if self.use_dali else dataloader()
for idx, batch in enumerate(
dataloader_tmp): # load is very time-consuming
if idx >= max_iter:
break
if idx % self.config["Global"]["print_batch_step"] == 0:
logger.info(
f"{name} feature calculation process: [{idx}/{len(dataloader)}]"
)
if self.use_dali:
batch = [
paddle.to_tensor(batch[0]['data']),
paddle.to_tensor(batch[0]['label'])
]
batch = [paddle.to_tensor(x) for x in batch]
batch[1] = batch[1].reshape([-1, 1]).astype("int64")
if len(batch) == 3:
has_unique_id = True
batch[2] = batch[2].reshape([-1, 1]).astype("int64")
out = self.forward(batch)
batch_feas = out["features"]
# do norm
if self.config["Global"].get("feature_normalize", True):
feas_norm = paddle.sqrt(
paddle.sum(paddle.square(batch_feas), axis=1,
keepdim=True))
batch_feas = paddle.divide(batch_feas, feas_norm)
if all_feas is None:
all_feas = batch_feas
if has_unique_id:
all_unique_id = batch[2]
all_image_id = batch[1]
else:
all_feas = paddle.concat([all_feas, batch_feas])
all_image_id = paddle.concat([all_image_id, batch[1]])
if has_unique_id:
all_unique_id = paddle.concat([all_unique_id, batch[2]])
if self.use_dali:
dataloader_tmp.reset()
if paddle.distributed.get_world_size() > 1:
feat_list = []
img_id_list = []
unique_id_list = []
paddle.distributed.all_gather(feat_list, all_feas)
paddle.distributed.all_gather(img_id_list, all_image_id)
all_feas = paddle.concat(feat_list, axis=0)
all_image_id = paddle.concat(img_id_list, axis=0)
if has_unique_id:
paddle.distributed.all_gather(unique_id_list, all_unique_id)
all_unique_id = paddle.concat(unique_id_list, axis=0)
logger.info("Build {} done, all feat shape: {}, begin to eval..".
format(name, all_feas.shape))
return all_feas, all_image_id, all_unique_id
@paddle.no_grad()
def infer(self, ):
total_trainer = paddle.distributed.get_world_size()
local_rank = paddle.distributed.get_rank()
image_list = get_image_list(self.config["Infer"]["infer_imgs"])
# data split
image_list = image_list[local_rank::total_trainer]
preprocess_func = create_operators(self.config["Infer"]["transforms"])
postprocess_func = build_postprocess(self.config["Infer"][
"PostProcess"])
batch_size = self.config["Infer"]["batch_size"]
self.model.eval()
batch_data = []
image_file_list = []
for idx, image_file in enumerate(image_list):
with open(image_file, 'rb') as f:
x = f.read()
for process in preprocess_func:
x = process(x)
batch_data.append(x)
image_file_list.append(image_file)
if len(batch_data) >= batch_size or idx == len(image_list) - 1:
batch_tensor = paddle.to_tensor(batch_data)
out = self.forward([batch_tensor])
if isinstance(out, list):
out = out[0]
result = postprocess_func(out, image_file_list)
print(result)
batch_data.clear()
image_file_list.clear()
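The tail of `infer` above accumulates preprocessed images into `batch_data` and flushes whenever the buffer reaches `batch_size`, with one extra flush for the final partial batch. A minimal, framework-free sketch of that accumulate-and-flush pattern (the `run_batch` callback and `run_in_batches` name are illustrative stand-ins for the forward pass plus postprocessing, not part of the source):

```python
def run_in_batches(items, batch_size, run_batch):
    """Accumulate items and flush a full batch, plus one final
    partial batch -- mirroring the loop at the end of infer()."""
    results, batch = [], []
    for idx, item in enumerate(items):
        batch.append(item)
        # Flush on a full buffer, or on the very last item.
        if len(batch) >= batch_size or idx == len(items) - 1:
            results.extend(run_batch(batch))
            batch = []  # equivalent to batch_data.clear()
    return results

# Five items with batch_size=2 are processed as batches of sizes 2, 2, 1.
print(run_in_batches(list(range(5)), 2, lambda b: [x * 10 for x in b]))
# → [0, 10, 20, 30, 40]
```

The `idx == len(image_list) - 1` check in `infer` is what guarantees the trailing partial batch is not silently dropped.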


@@ -0,0 +1,90 @@
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
import paddle
import paddle.nn as nn
class DSHSDLoss(nn.Layer):
"""
# DSHSD(IEEE ACCESS 2019)
# paper [Deep Supervised Hashing Based on Stable Distribution](https://ieeexplore.ieee.org/document/8648432/)
# [DSHSD] epoch:70, bit:48, dataset:cifar10-1, MAP:0.809, Best MAP: 0.809
# [DSHSD] epoch:250, bit:48, dataset:nuswide_21, MAP:0.809, Best MAP: 0.815
# [DSHSD] epoch:135, bit:48, dataset:imagenet, MAP:0.647, Best MAP: 0.647
"""
def __init__(self, n_class, bit, alpha, multi_label=False):
super(DSHSDLoss, self).__init__()
self.m = 2 * bit
self.alpha = alpha
self.multi_label = multi_label
self.n_class = n_class
self.fc = paddle.nn.Linear(bit, n_class, bias_attr=False)
def forward(self, input, label):
feature = input["features"]
feature = feature.tanh().astype("float32")
dist = paddle.sum(
paddle.square((paddle.unsqueeze(feature, 1) - paddle.unsqueeze(feature, 0))),
axis=2)
# label to one-hot
label = paddle.flatten(label)
label = paddle.nn.functional.one_hot(label, self.n_class).astype("float32")
s = (paddle.matmul(label, label, transpose_y=True) == 0).astype("float32")
Ld = (1 - s) / 2 * dist + s / 2 * (self.m - dist).clip(min=0)
Ld = Ld.mean()
logits = self.fc(feature)
if self.multi_label:
# multiple labels classification loss
Lc = (logits - label * logits + ((1 + (-logits).exp()).log())).sum(axis=1).mean()
else:
# single labels classification loss
Lc = (-paddle.nn.functional.softmax(logits).log() * label).sum(axis=1).mean()
return {"dshsdloss": Lc + Ld * self.alpha}
class LCDSHLoss(nn.Layer):
"""
# paper [Locality-Constrained Deep Supervised Hashing for Image Retrieval](https://www.ijcai.org/Proceedings/2017/0499.pdf)
# [LCDSH] epoch:145, bit:48, dataset:cifar10-1, MAP:0.798, Best MAP: 0.798
# [LCDSH] epoch:183, bit:48, dataset:nuswide_21, MAP:0.833, Best MAP: 0.834
"""
def __init__(self, n_class, _lambda):
super(LCDSHLoss, self).__init__()
self._lambda = _lambda
self.n_class = n_class
def forward(self, input, label):
feature = input["features"]
# label to one-hot
label = paddle.flatten(label)
label = paddle.nn.functional.one_hot(label, self.n_class).astype("float32")
s = 2 * (paddle.matmul(label, label, transpose_y=True) > 0).astype("float32") - 1
inner_product = paddle.matmul(feature, feature, transpose_y=True) * 0.5
inner_product = inner_product.clip(min=-50, max=50)
L1 = paddle.log(1 + paddle.exp(-s * inner_product)).mean()
b = feature.sign()
inner_product_ = paddle.matmul(b, b, transpose_y=True) * 0.5
sigmoid = paddle.nn.Sigmoid()
L2 = (sigmoid(inner_product) - sigmoid(inner_product_)).pow(2).mean()
return {"lcdshloss": L1 + self._lambda * L2}
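To make the distance and similarity terms in `DSHSDLoss` concrete, here is a hedged NumPy sketch of the same construction: pairwise squared Euclidean distances between features, an indicator `s` that is 1 for pairs sharing no label, and the margin-based distance loss `Ld`. The function name and toy values are illustrative, not from the source:

```python
import numpy as np

def dshsd_distance_loss(features, labels, n_class, m):
    # Pairwise squared Euclidean distance, matching the
    # unsqueeze / subtract / sum pattern in DSHSDLoss.forward.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.square(diff).sum(axis=2)
    # One-hot labels; s[i, j] = 1 iff samples i and j share no label.
    one_hot = np.eye(n_class)[labels]
    s = (one_hot @ one_hot.T == 0).astype(np.float64)
    # Pull similar pairs together; push dissimilar pairs out to margin m.
    Ld = ((1 - s) / 2 * dist + s / 2 * np.clip(m - dist, 0, None)).mean()
    return dist, s, Ld

feats = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 1.0]])
labels = np.array([0, 0, 1])
dist, s, Ld = dshsd_distance_loss(feats, labels, n_class=2, m=4.0)
```

Note the margin `m = 2 * bit` in the class above: dissimilar pairs only contribute loss while their squared distance is still inside the margin, which is what keeps the hash codes from collapsing.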


@@ -16,7 +16,7 @@ from paddle import nn
import copy
from collections import OrderedDict
-from .metrics import TopkAcc, mAP, mINP, Recallk
+from .metrics import TopkAcc, mAP, mINP, Recallk, Precisionk
from .metrics import DistillationTopkAcc
from .metrics import GoogLeNetTopkAcc


@@ -168,6 +168,47 @@ class Recallk(nn.Layer):
return metric_dict
class Precisionk(nn.Layer):
def __init__(self, topk=(1, 5)):
super().__init__()
assert isinstance(topk, (int, list, tuple))
if isinstance(topk, int):
topk = [topk]
self.topk = topk
def forward(self, similarities_matrix, query_img_id, gallery_img_id,
keep_mask):
metric_dict = dict()
# rank the gallery by similarity for each query
choosen_indices = paddle.argsort(
similarities_matrix, axis=1, descending=True)
gallery_labels_transpose = paddle.transpose(gallery_img_id, [1, 0])
gallery_labels_transpose = paddle.broadcast_to(
gallery_labels_transpose,
shape=[
choosen_indices.shape[0], gallery_labels_transpose.shape[1]
])
choosen_label = paddle.index_sample(gallery_labels_transpose,
choosen_indices)
equal_flag = paddle.equal(choosen_label, query_img_id)
if keep_mask is not None:
keep_mask = paddle.index_sample(
keep_mask.astype('float32'), choosen_indices)
equal_flag = paddle.logical_and(equal_flag,
keep_mask.astype('bool'))
equal_flag = paddle.cast(equal_flag, 'float32')
Ns = paddle.arange(gallery_img_id.shape[0]) + 1
equal_flag_cumsum = paddle.cumsum(equal_flag, axis=1)
Precision_at_k = (paddle.mean(equal_flag_cumsum, axis=0) / Ns).numpy()
for k in self.topk:
metric_dict["precision@{}".format(k)] = Precision_at_k[k - 1]
return metric_dict
class DistillationTopkAcc(TopkAcc):
def __init__(self, model_key, feature_key=None, topk=(1, 5)):
super().__init__(topk=topk)
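The `Precisionk` metric above ranks the gallery for each query and averages the cumulative hit rate at each retrieval depth. A self-contained NumPy sketch of the same computation, without the `keep_mask` filtering and with made-up toy labels:

```python
import numpy as np

def precision_at_k(similarities, query_ids, gallery_ids, topk=(1, 5)):
    # Rank gallery items by descending similarity for every query.
    order = np.argsort(-similarities, axis=1)
    retrieved = gallery_ids[order]                    # [n_query, n_gallery]
    hits = (retrieved == query_ids[:, None]).astype(np.float64)
    # precision@k = (hits within top-k) / k, averaged over queries.
    ranks = np.arange(1, gallery_ids.shape[0] + 1)
    prec = (np.cumsum(hits, axis=1) / ranks).mean(axis=0)
    return {"precision@{}".format(k): prec[k - 1] for k in topk}

sims = np.array([[0.9, 0.1, 0.8],
                 [0.2, 0.7, 0.4]])
result = precision_at_k(sims, np.array([0, 1]), np.array([0, 1, 0]),
                        topk=(1, 2))
# → {'precision@1': 1.0, 'precision@2': 0.75}
```

Averaging the per-query precision is equivalent to the `paddle.mean(equal_flag_cumsum, axis=0) / Ns` formulation in the class, since the rank divisor `Ns` is the same for every query.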


@@ -21,11 +21,11 @@ __dir__ = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.abspath(os.path.join(__dir__, '../')))
from ppcls.utils import config
-from ppcls.engine.trainer import Trainer
+from ppcls.engine.engine import Engine
if __name__ == "__main__":
args = config.parse_args()
config = config.get_config(
args.config, overrides=args.override, show=False)
-trainer = Trainer(config, mode="eval")
-trainer.eval()
+engine = Engine(config, mode="eval")
+engine.eval()


@@ -24,82 +24,11 @@ import paddle
import paddle.nn as nn
from ppcls.utils import config
from ppcls.utils.logger import init_logger
from ppcls.utils.config import print_config
from ppcls.arch import build_model, RecModel, DistillationModel
from ppcls.utils.save_load import load_dygraph_pretrain
from ppcls.arch.gears.identity_head import IdentityHead
class ExportModel(nn.Layer):
"""
ExportModel: add softmax onto the model
"""
def __init__(self, config):
super().__init__()
self.base_model = build_model(config)
# we should choose a final model to export
if isinstance(self.base_model, DistillationModel):
self.infer_model_name = config["infer_model_name"]
else:
self.infer_model_name = None
self.infer_output_key = config.get("infer_output_key", None)
if self.infer_output_key == "features" and isinstance(self.base_model,
RecModel):
self.base_model.head = IdentityHead()
if config.get("infer_add_softmax", True):
self.softmax = nn.Softmax(axis=-1)
else:
self.softmax = None
def eval(self):
self.training = False
for layer in self.sublayers():
layer.training = False
layer.eval()
def forward(self, x):
x = self.base_model(x)
if isinstance(x, list):
x = x[0]
if self.infer_model_name is not None:
x = x[self.infer_model_name]
if self.infer_output_key is not None:
x = x[self.infer_output_key]
if self.softmax is not None:
x = self.softmax(x)
return x
from ppcls.engine.engine import Engine
if __name__ == "__main__":
args = config.parse_args()
config = config.get_config(
args.config, overrides=args.override, show=False)
log_file = os.path.join(config['Global']['output_dir'],
config["Arch"]["name"], "export.log")
init_logger(name='root', log_file=log_file)
print_config(config)
# set device
assert config["Global"]["device"] in ["cpu", "gpu", "xpu"]
device = paddle.set_device(config["Global"]["device"])
model = ExportModel(config["Arch"])
if config["Global"]["pretrained_model"] is not None:
load_dygraph_pretrain(model.base_model,
config["Global"]["pretrained_model"])
model.eval()
model = paddle.jit.to_static(
model,
input_spec=[
paddle.static.InputSpec(
shape=[None] + config["Global"]["image_shape"],
dtype='float32')
])
paddle.jit.save(model,
os.path.join(config["Global"]["save_inference_dir"],
"inference"))
engine = Engine(config, mode="export")
engine.export()


@@ -21,12 +21,11 @@ __dir__ = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.abspath(os.path.join(__dir__, '../')))
from ppcls.utils import config
-from ppcls.engine.trainer import Trainer
+from ppcls.engine.engine import Engine
if __name__ == "__main__":
args = config.parse_args()
config = config.get_config(
args.config, overrides=args.override, show=False)
-trainer = Trainer(config, mode="infer")
-trainer.infer()
+engine = Engine(config, mode="infer")
+engine.infer()


@@ -21,11 +21,11 @@ __dir__ = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.abspath(os.path.join(__dir__, '../')))
from ppcls.utils import config
-from ppcls.engine.trainer import Trainer
+from ppcls.engine.engine import Engine
if __name__ == "__main__":
args = config.parse_args()
config = config.get_config(
args.config, overrides=args.override, show=False)
-trainer = Trainer(config, mode="train")
-trainer.train()
+engine = Engine(config, mode="train")
+engine.train()