Merge branch 'PaddlePaddle:develop' into develop

pull/1193/head
lilithzhou 2021-09-01 15:10:14 +08:00 committed by GitHub
commit 1210665a8b
30 changed files with 1414 additions and 879 deletions


@ -7,29 +7,30 @@ VisualDL, a visualization analysis tool of PaddlePaddle, provides a variety of c
PaddleClas now supports using VisualDL to visualize the changes of learning rate, loss and accuracy during training.
### Set config and start training
You only need to set the `vdl_dir` field in train config:
You only need to set the field `Global.use_visualdl` to `True` in train config:
```yaml
# config.yaml
vdl_dir: "./vdl.log"
Global:
...
use_visualdl: True
...
```
`vdl_dir`: Specify the directory where VisualDL stores logs.
Then normal start training:
PaddleClas saves the VisualDL logs to the `vdl/` subdirectory under the output directory specified by `Global.output_dir`. Then start training normally:
```shell
python3 tools/train.py -c config.yaml
```
### Start VisualDL
After starting the training program, you can start the VisualDL service in the new terminal session:
After starting the training program, you can start the VisualDL service in a new terminal session:
```shell
visualdl --logdir ./vdl.log
visualdl --logdir ./output/vdl/
```
In the above command, `--logdir` specify the logs directory. VisualDL will traverse and iterate to find the subdirectories of the specified directory to visualize all the experimental results. You can also use the following parameters to set the IP and port number of the VisualDL service:
In the above command, `--logdir` specifies the directory of the VisualDL logs produced during training. VisualDL will recursively traverse the subdirectories of the specified directory to visualize all experiment results. You can also use the following parameters to set the IP and port number of the VisualDL service:
* `--host`: specify the IP address, default is 127.0.0.1
* `--port`: specify the port, default is 8040


@ -23,7 +23,7 @@ Among them, `-c` is used to specify the path of the configuration file, `-o` is
`-o use_gpu=True` means to use GPU for training. If you want to use the CPU for training, you need to set `use_gpu` to `False`.
Of course, you can also directly modify the configuration file to update the configuration. For specific configuration parameters, please refer to [Configuration Document](config_en.md).
Of course, you can also directly modify the configuration file to update the configuration. For specific configuration parameters, please refer to [Configuration Document](config_description_en.md).
* The output log examples are as follows:
* If mixup or cutmix is used in training, top-1 and top-k (k defaults to 5) accuracy will not be printed in the log:


@ -7,15 +7,17 @@ VisualDL is PaddlePaddle's visualization and analysis tool, presenting changes of training parameters in rich charts
PaddleClas now supports using VisualDL during training to view the changes of learning rate, loss and accuracy.
### Set the config file and start training
To use VisualDL in PaddleClas, you only need to add the following field to the training config file:
To use VisualDL in PaddleClas, you only need to set the field `Global.use_visualdl` to `True` in the training config file:
```yaml
# config.yaml
vdl_dir: "./vdl.log"
Global:
...
use_visualdl: True
...
```
`vdl_dir` specifies the directory where VisualDL saves its logs.
Then start training normally:
PaddleClas will save the VisualDL logs to the `vdl/` subdirectory under the directory specified by `Global.output_dir`; then start training normally:
```shell
python3 tools/train.py -c config.yaml
@ -25,10 +27,10 @@ python3 tools/train.py -c config.yaml
After starting the training program, you can start the VisualDL service in a new terminal session:
```shell
visualdl --logdir ./vdl.log
visualdl --logdir ./output/vdl/
```
In the above command, `--logdir` specifies the log directory. VisualDL will recursively traverse the subdirectories of the specified directory to visualize all experiment results. You can also use the following parameters to set the IP and port of the VisualDL service:
In the above command, `--logdir` specifies the directory where the VisualDL logs are saved. VisualDL will recursively traverse the subdirectories of the specified directory to visualize all experiment results. You can also use the following parameters to set the IP and port of the VisualDL service:
* `--host`: specify the IP address, default is 127.0.0.1
* `--port`: specify the port, default is 8040


@ -30,7 +30,7 @@ python3 tools/train.py \
Here, `-c` specifies the path of the configuration file and `-o` specifies the parameters to be modified or added, where `-o Arch.pretrained=False` means not using a pretrained model and `-o Global.device=gpu` means training with GPU. To train with CPU, set `Global.device` to `cpu`.
For more detailed training configuration, you can also directly modify the configuration file of the model. For specific configuration parameters, refer to the [configuration document](config.md).
For more detailed training configuration, you can also directly modify the configuration file of the model. For specific configuration parameters, refer to the [configuration document](config_description.md).
Run the above command, and you can see the output log. An example is shown below:


@ -41,10 +41,15 @@ class _SysPathG(object):
self.path)
with _SysPathG(
os.path.join(
os.path.dirname(os.path.abspath(__file__)), 'ppcls', 'arch')):
import backbone
with _SysPathG(os.path.dirname(os.path.abspath(__file__)), ):
import ppcls
import ppcls.arch.backbone as backbone
def ppclas_init():
if ppcls.utils.logger._logger is None:
ppcls.utils.logger.init_logger()
ppclas_init()
def _load_pretrained_parameters(model, name):
url = 'https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/{}_pretrained.pdparams'.format(
@ -63,9 +68,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `AlexNet` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.AlexNet(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'AlexNet')
return model
@ -80,9 +84,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `VGG11` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.VGG11(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'VGG11')
return model
@ -97,9 +100,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `VGG13` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.VGG13(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'VGG13')
return model
@ -114,9 +116,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `VGG16` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.VGG16(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'VGG16')
return model
@ -131,9 +132,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `VGG19` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.VGG19(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'VGG19')
return model
@ -149,9 +149,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNet18` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNet18(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNet18')
return model
@ -167,9 +166,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNet34` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNet34(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNet34')
return model
@ -185,9 +183,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNet50` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNet50(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNet50')
return model
@ -203,9 +200,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNet101` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNet101(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNet101')
return model
@ -221,9 +217,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNet152` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNet152(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNet152')
return model
@ -237,9 +232,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `SqueezeNet1_0` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.SqueezeNet1_0(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'SqueezeNet1_0')
return model
@ -253,9 +247,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `SqueezeNet1_1` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.SqueezeNet1_1(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'SqueezeNet1_1')
return model
@ -271,9 +264,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `DenseNet121` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.DenseNet121(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'DenseNet121')
return model
@ -289,9 +281,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `DenseNet161` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.DenseNet161(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'DenseNet161')
return model
@ -307,9 +298,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `DenseNet169` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.DenseNet169(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'DenseNet169')
return model
@ -325,9 +315,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `DenseNet201` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.DenseNet201(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'DenseNet201')
return model
@ -343,9 +332,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `DenseNet264` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.DenseNet264(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'DenseNet264')
return model
@ -359,9 +347,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `InceptionV3` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.InceptionV3(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'InceptionV3')
return model
@ -375,9 +362,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `InceptionV4` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.InceptionV4(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'InceptionV4')
return model
@ -391,9 +377,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `GoogLeNet` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.GoogLeNet(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'GoogLeNet')
return model
@ -407,9 +392,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ShuffleNetV2_x0_25` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ShuffleNetV2_x0_25(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ShuffleNetV2_x0_25')
return model
@ -423,9 +407,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV1` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV1(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'MobileNetV1')
return model
@ -439,9 +422,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV1_x0_25` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV1_x0_25(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'MobileNetV1_x0_25')
return model
@ -455,9 +437,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV1_x0_5` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV1_x0_5(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'MobileNetV1_x0_5')
return model
@ -471,9 +452,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV1_x0_75` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV1_x0_75(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'MobileNetV1_x0_75')
return model
@ -487,9 +467,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV2_x0_25` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV2_x0_25(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'MobileNetV2_x0_25')
return model
@ -503,9 +482,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV2_x0_5` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV2_x0_5(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'MobileNetV2_x0_5')
return model
@ -519,9 +497,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV2_x0_75` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV2_x0_75(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'MobileNetV2_x0_75')
return model
@ -535,9 +512,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV2_x1_5` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV2_x1_5(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'MobileNetV2_x1_5')
return model
@ -551,9 +527,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV2_x2_0` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV2_x2_0(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'MobileNetV2_x2_0')
return model
@ -567,10 +542,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV3_large_x0_35` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV3_large_x0_35(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model,
'MobileNetV3_large_x0_35')
return model
@ -584,10 +557,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV3_large_x0_5` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV3_large_x0_5(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model,
'MobileNetV3_large_x0_5')
return model
@ -601,10 +572,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV3_large_x0_75` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV3_large_x0_75(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model,
'MobileNetV3_large_x0_75')
return model
@ -618,10 +587,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV3_large_x1_0` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV3_large_x1_0(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model,
'MobileNetV3_large_x1_0')
return model
@ -635,10 +602,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV3_large_x1_25` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV3_large_x1_25(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model,
'MobileNetV3_large_x1_25')
return model
@ -652,10 +617,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV3_small_x0_35` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV3_small_x0_35(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model,
'MobileNetV3_small_x0_35')
return model
@ -669,10 +632,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV3_small_x0_5` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV3_small_x0_5(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model,
'MobileNetV3_small_x0_5')
return model
@ -686,10 +647,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV3_small_x0_75` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV3_small_x0_75(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model,
'MobileNetV3_small_x0_75')
return model
@ -703,10 +662,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV3_small_x1_0` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV3_small_x1_0(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model,
'MobileNetV3_small_x1_0')
return model
@ -720,10 +677,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `MobileNetV3_small_x1_25` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.MobileNetV3_small_x1_25(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model,
'MobileNetV3_small_x1_25')
return model
@ -737,9 +692,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNeXt101_32x4d` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNeXt101_32x4d(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNeXt101_32x4d')
return model
@ -753,9 +707,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNeXt101_64x4d` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNeXt101_64x4d(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNeXt101_64x4d')
return model
@ -769,9 +722,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNeXt152_32x4d` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNeXt152_32x4d(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNeXt152_32x4d')
return model
@ -785,9 +737,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNeXt152_64x4d` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNeXt152_64x4d(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNeXt152_64x4d')
return model
@ -801,9 +752,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNeXt50_32x4d` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNeXt50_32x4d(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNeXt50_32x4d')
return model
@ -817,9 +767,8 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNeXt50_64x4d` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.ResNeXt50_64x4d(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'ResNeXt50_64x4d')
return model
@ -833,8 +782,7 @@ with _SysPathG(
Returns:
model: nn.Layer. Specific `ResNeXt50_64x4d` model depends on args.
"""
kwargs.update({'pretrained': pretrained})
model = backbone.DarkNet53(**kwargs)
if pretrained:
model = _load_pretrained_parameters(model, 'DarkNet53')
return model
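The pattern repeated across the hunks above — forwarding `pretrained` through `**kwargs` so the backbone constructor handles weight loading, instead of calling `_load_pretrained_parameters` afterwards — can be sketched generically. This is a simplified sketch with a stand-in backbone class, not the repository's code:

```python
def make_backbone_entry(backbone_cls):
    """Build a hub-style entry point that forwards `pretrained` to the backbone."""
    def entry(pretrained=False, **kwargs):
        # new pattern: the backbone constructor itself loads pretrained weights,
        # so the wrapper only forwards the flag instead of loading them afterwards
        kwargs.update({'pretrained': pretrained})
        return backbone_cls(**kwargs)
    return entry

class DummyNet:
    """Stand-in backbone used only for illustration."""
    def __init__(self, pretrained=False, class_num=1000):
        self.pretrained = pretrained
        self.class_num = class_num

# one wrapper per model name, as in the hub entries above
AlexNet = make_backbone_entry(DummyNet)
model = AlexNet(pretrained=True, class_num=10)
```

This keeps every hub entry a thin, uniform wrapper and leaves the download-and-load logic in one place inside the backbone.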


@ -58,6 +58,7 @@ from ppcls.arch.backbone.model_zoo.rednet import RedNet26, RedNet38, RedNet50, R
from ppcls.arch.backbone.model_zoo.tnt import TNT_small
from ppcls.arch.backbone.model_zoo.hardnet import HarDNet68, HarDNet85, HarDNet39_ds, HarDNet68_ds
from ppcls.arch.backbone.variant_models.resnet_variant import ResNet50_last_stage_stride1
from ppcls.arch.backbone.variant_models.vgg_variant import VGG19Sigmoid
def get_apis():


@ -33,9 +33,9 @@ MODEL_URLS = {
"SwinTransformer_base_patch4_window12_384":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/SwinTransformer_base_patch4_window12_384_pretrained.pdparams",
"SwinTransformer_large_patch4_window7_224":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/SwinTransformer_large_patch4_window7_224_pretrained.pdparams",
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/SwinTransformer_large_patch4_window7_224_22kto1k_pretrained.pdparams",
"SwinTransformer_large_patch4_window12_384":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/SwinTransformer_large_patch4_window12_384_pretrained.pdparams",
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/SwinTransformer_large_patch4_window12_384_22kto1k_pretrained.pdparams",
}
__all__ = list(MODEL_URLS.keys())


@ -1 +1,2 @@
from .resnet_variant import ResNet50_last_stage_stride1
from .vgg_variant import VGG19Sigmoid


@ -0,0 +1,28 @@
import paddle
from paddle.nn import Sigmoid
from ppcls.arch.backbone.legendary_models.vgg import VGG19
__all__ = ["VGG19Sigmoid"]
class SigmoidSuffix(paddle.nn.Layer):
def __init__(self, origin_layer):
super(SigmoidSuffix, self).__init__()
self.origin_layer = origin_layer
self.sigmoid = Sigmoid()
def forward(self, input, res_dict=None, **kwargs):
x = self.origin_layer(input)
x = self.sigmoid(x)
return x
def VGG19Sigmoid(pretrained=False, use_ssld=False, **kwargs):
def replace_function(origin_layer):
new_layer = SigmoidSuffix(origin_layer)
return new_layer
match_re = "linear_2"
model = VGG19(pretrained=pretrained, use_ssld=use_ssld, **kwargs)
model.replace_sub(match_re, replace_function, True)
return model
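The layer-wrapping idea in `SigmoidSuffix` — delegate to the original layer, then squash the result through a sigmoid — can be sketched without Paddle. In this framework-free sketch the `Sigmoid` layer and `replace_sub` machinery are replaced by plain Python; only the wrapping pattern is illustrated:

```python
import math

class SigmoidSuffix:
    """Wrap a callable layer and squash its scalar output through a sigmoid."""
    def __init__(self, origin_layer):
        self.origin_layer = origin_layer

    def __call__(self, x):
        y = self.origin_layer(x)            # delegate to the wrapped layer
        return 1.0 / (1.0 + math.exp(-y))   # then apply the sigmoid suffix

final_linear = lambda x: 2.0 * x            # stand-in for the model's last Linear layer
head = SigmoidSuffix(final_linear)
```

Wrapping rather than editing the original layer means the pretrained weights of `linear_2` are untouched; only the output activation changes.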


@ -28,7 +28,7 @@ class CircleMargin(nn.Layer):
weight_attr = paddle.ParamAttr(
initializer=paddle.nn.initializer.XavierNormal())
self.fc0 = paddle.nn.Linear(
self.fc = paddle.nn.Linear(
self.embedding_size, self.class_num, weight_attr=weight_attr)
def forward(self, input, label):
@ -36,19 +36,22 @@ class CircleMargin(nn.Layer):
paddle.sum(paddle.square(input), axis=1, keepdim=True))
input = paddle.divide(input, feat_norm)
weight = self.fc0.weight
weight = self.fc.weight
weight_norm = paddle.sqrt(
paddle.sum(paddle.square(weight), axis=0, keepdim=True))
weight = paddle.divide(weight, weight_norm)
logits = paddle.matmul(input, weight)
if not self.training or label is None:
return logits
alpha_p = paddle.clip(-logits.detach() + 1 + self.margin, min=0.)
alpha_n = paddle.clip(logits.detach() + self.margin, min=0.)
delta_p = 1 - self.margin
delta_n = self.margin
index = paddle.fluid.layers.where(label != -1).reshape([-1])
m_hot = F.one_hot(label.reshape([-1]), num_classes=logits.shape[1])
logits_p = alpha_p * (logits - delta_p)
logits_n = alpha_n * (logits - delta_n)
pre_logits = logits_p * m_hot + logits_n * (1 - m_hot)
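The margin arithmetic in the hunk above can be checked with plain scalars. This is a scalar sketch of the Circle-margin logit adjustment (`alpha_p`, `alpha_n`, `delta_p`, `delta_n` as in the code), independent of Paddle:

```python
def circle_adjust(logit, margin, is_positive):
    """Scale-and-shift a cosine logit as in the Circle margin (scalar sketch)."""
    if is_positive:
        alpha = max(-logit + 1 + margin, 0.0)   # alpha_p: small when logit is near 1
        return alpha * (logit - (1 - margin))   # logits_p with delta_p = 1 - margin
    alpha = max(logit + margin, 0.0)            # alpha_n: small when logit is near -1
    return alpha * (logit - margin)             # logits_n with delta_n = margin
```

A well-classified positive (logit close to 1) gets a small `alpha_p`, so it contributes little to the loss, which is the self-paced weighting that distinguishes Circle loss from a fixed-margin softmax.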


@ -46,6 +46,9 @@ class CosMargin(paddle.nn.Layer):
weight = paddle.divide(weight, weight_norm)
cos = paddle.matmul(input, weight)
if not self.training or label is None:
return cos
cos_m = cos - self.margin
one_hot = paddle.nn.functional.one_hot(label, self.class_num)
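The cosine-margin step shown here subtracts the margin only from the ground-truth class's logit (via the one-hot mask). As plain scalars, a sketch independent of Paddle:

```python
def cos_margin(cos_logits, label, margin):
    """Subtract the margin from the target class's cosine logit (list sketch)."""
    out = list(cos_logits)
    out[label] -= margin          # only the ground-truth class is penalized
    return out

# the target class must beat the others by at least `margin` to keep its lead
adjusted = cos_margin([0.9, 0.1], 0, 0.35)
```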


@ -0,0 +1,149 @@
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output_dlbhc/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 100
#eval_mode: "retrieval"
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
#feature postprocess
feature_normalize: False
feature_binarize: "round"
# model architecture
Arch:
name: "RecModel"
Backbone:
name: "MobileNetV3_large_x1_0"
pretrained: True
class_num: 512
Head:
name: "FC"
class_num: 50030
embedding_size: 512
infer_output_key: "features"
infer_add_softmax: "false"
# loss function config for train/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Piecewise
learning_rate: 0.1
decay_epochs: [50, 150]
values: [0.1, 0.01, 0.001]
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/Aliproduct/
cls_label_path: ./dataset/Aliproduct/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: 256
- RandCropImage:
size: 227
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.4914, 0.4822, 0.4465]
std: [0.2023, 0.1994, 0.2010]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/Aliproduct/
cls_label_path: ./dataset/Aliproduct/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: 227
- NormalizeImage:
scale: 1.0/255.0
mean: [0.4914, 0.4822, 0.4465]
std: [0.2023, 0.1994, 0.2010]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 256
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 227
- NormalizeImage:
scale: 1.0/255.0
mean: [0.4914, 0.4822, 0.4465]
std: [0.2023, 0.1994, 0.2010]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# switch to metric below when eval by retrieval
# - Recallk:
# topk: [1]
# - mAP:
# - Precisionk:
# topk: [1]
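The `Piecewise` schedule in the config above drops the learning rate at the listed `decay_epochs`. The lookup can be sketched in plain Python (a sketch, not PaddleClas's implementation; whether the boundary epoch itself uses the old or new rate is an assumption of this sketch):

```python
import bisect

def piecewise_lr(epoch, decay_epochs, values):
    """Return the learning rate for `epoch` under a piecewise-constant schedule.

    `values` must have exactly one more entry than `decay_epochs`.
    """
    return values[bisect.bisect_right(decay_epochs, epoch)]

# schedule from the config: decay_epochs [50, 150], values [0.1, 0.01, 0.001]
lr_at_start = piecewise_lr(0, [50, 150], [0.1, 0.01, 0.001])
```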


@ -0,0 +1,147 @@
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
eval_mode: "retrieval"
epochs: 128
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
#feature postprocess
feature_normalize: False
feature_binarize: "round"
# model architecture
Arch:
name: "RecModel"
Backbone:
name: "VGG19Sigmoid"
pretrained: True
class_num: 48
Head:
name: "FC"
class_num: 10
embedding_size: 48
infer_output_key: "features"
infer_add_softmax: "false"
# loss function config for train/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Piecewise
learning_rate: 0.01
decay_epochs: [200]
values: [0.01, 0.001]
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/cifar10/
cls_label_path: ./dataset/cifar10/cifar10-2/train.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: 256
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.4914, 0.4822, 0.4465]
std: [0.2023, 0.1994, 0.2010]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
Query:
dataset:
name: ImageNetDataset
image_root: ./dataset/cifar10/
cls_label_path: ./dataset/cifar10/cifar10-2/test.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.4914, 0.4822, 0.4465]
std: [0.2023, 0.1994, 0.2010]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 512
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Gallery:
dataset:
name: ImageNetDataset
image_root: ./dataset/cifar10/
cls_label_path: ./dataset/cifar10/cifar10-2/database.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.4914, 0.4822, 0.4465]
std: [0.2023, 0.1994, 0.2010]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 512
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- mAP:
- Precisionk:
topk: [1, 5]
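With `eval_mode: "retrieval"`, evaluation matches each `Query` item against the `Gallery` set. Recall@k over a dot-product similarity can be sketched as follows (a plain-Python sketch, not PaddleClas's metric code):

```python
def recall_at_k(query_feats, query_labels, gallery_feats, gallery_labels, k=1):
    """Fraction of queries whose top-k most similar gallery items share the label."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    hits = 0
    for qf, ql in zip(query_feats, query_labels):
        # rank gallery indices by similarity to this query, most similar first
        ranked = sorted(range(len(gallery_feats)),
                        key=lambda i: dot(qf, gallery_feats[i]), reverse=True)
        if any(gallery_labels[i] == ql for i in ranked[:k]):
            hits += 1
    return hits / len(query_feats)

gallery = [[1.0, 0.0], [0.0, 1.0]]
queries = [[0.9, 0.1], [0.2, 0.8]]
score = recall_at_k(queries, [0, 1], gallery, [0, 1], k=1)
```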


@ -42,9 +42,9 @@ class RandomErasing(object):
h = int(round(math.sqrt(target_area * aspect_ratio)))
w = int(round(math.sqrt(target_area / aspect_ratio)))
if w < img.shape[2] and h < img.shape[1]:
x1 = random.randint(0, img.shape[1] - h)
y1 = random.randint(0, img.shape[2] - w)
if w < img.shape[1] and h < img.shape[0]:
x1 = random.randint(0, img.shape[0] - h)
y1 = random.randint(0, img.shape[1] - w)
if img.shape[0] == 3:
img[x1:x1 + h, y1:y1 + w, 0] = self.mean[0]
img[x1:x1 + h, y1:y1 + w, 1] = self.mean[1]
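The corrected indexing treats the image as HWC, so the erase box must be bounded by `shape[0]` (height) and `shape[1]` (width). The bounds sampling can be sketched on its own (a simplified sketch of just the box placement, without the area/aspect-ratio search):

```python
import random

def sample_erase_box(img_h, img_w, h, w, rng=random):
    """Pick the top-left corner of an h-by-w erase box inside an img_h-by-img_w image."""
    assert h < img_h and w < img_w
    x1 = rng.randint(0, img_h - h)   # row offset, bounded by image height
    y1 = rng.randint(0, img_w - w)   # column offset, bounded by image width
    return x1, y1

x1, y1 = sample_erase_box(224, 300, 50, 30)
```

With the old CHW-style indexing, `img.shape[2]` would have been the channel count for an HWC array, making the box bounds wrong; bounding by height and width directly avoids that.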


@ -0,0 +1,391 @@
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import platform
import paddle
import paddle.distributed as dist
from visualdl import LogWriter
from paddle import nn
from ppcls.utils.check import check_gpu
from ppcls.utils.misc import AverageMeter
from ppcls.utils import logger
from ppcls.utils.logger import init_logger
from ppcls.utils.config import print_config
from ppcls.data import build_dataloader
from ppcls.arch import build_model, RecModel, DistillationModel
from ppcls.arch import apply_to_static
from ppcls.loss import build_loss
from ppcls.metric import build_metrics
from ppcls.optimizer import build_optimizer
from ppcls.utils.save_load import load_dygraph_pretrain, load_dygraph_pretrain_from_url
from ppcls.utils.save_load import init_model
from ppcls.utils import save_load
from ppcls.data.utils.get_image_list import get_image_list
from ppcls.data.postprocess import build_postprocess
from ppcls.data import create_operators
from ppcls.engine.train import train_epoch
from ppcls.engine import evaluation
from ppcls.arch.gears.identity_head import IdentityHead
class Engine(object):
def __init__(self, config, mode="train"):
assert mode in ["train", "eval", "infer", "export"]
self.mode = mode
self.config = config
self.eval_mode = self.config["Global"].get("eval_mode",
"classification")
# init logger
self.output_dir = self.config['Global']['output_dir']
log_file = os.path.join(self.output_dir, self.config["Arch"]["name"],
f"{mode}.log")
init_logger(name='root', log_file=log_file)
print_config(config)
# init train_func and eval_func
assert self.eval_mode in ["classification", "retrieval"], logger.error(
"Invalid eval mode: {}".format(self.eval_mode))
self.train_epoch_func = train_epoch
self.eval_func = getattr(evaluation, self.eval_mode + "_eval")
self.use_dali = self.config['Global'].get("use_dali", False)
# for visualdl
self.vdl_writer = None
if self.config['Global']['use_visualdl'] and mode == "train":
vdl_writer_path = os.path.join(self.output_dir, "vdl")
if not os.path.exists(vdl_writer_path):
os.makedirs(vdl_writer_path)
self.vdl_writer = LogWriter(logdir=vdl_writer_path)
# set device
assert self.config["Global"]["device"] in ["cpu", "gpu", "xpu"]
self.device = paddle.set_device(self.config["Global"]["device"])
logger.info('train with paddle {} and device {}'.format(
paddle.__version__, self.device))
# AMP training
self.amp = True if "AMP" in self.config else False
if self.amp and self.config["AMP"] is not None:
self.scale_loss = self.config["AMP"].get("scale_loss", 1.0)
self.use_dynamic_loss_scaling = self.config["AMP"].get(
"use_dynamic_loss_scaling", False)
else:
self.scale_loss = 1.0
self.use_dynamic_loss_scaling = False
if self.amp:
AMP_RELATED_FLAGS_SETTING = {
'FLAGS_cudnn_batchnorm_spatial_persistent': 1,
'FLAGS_max_inplace_grad_add': 8,
}
paddle.fluid.set_flags(AMP_RELATED_FLAGS_SETTING)
# build dataloader
if self.mode == 'train':
self.train_dataloader = build_dataloader(
self.config["DataLoader"], "Train", self.device, self.use_dali)
if self.mode in ["train", "eval"]:
if self.eval_mode == "classification":
self.eval_dataloader = build_dataloader(
self.config["DataLoader"], "Eval", self.device,
self.use_dali)
elif self.eval_mode == "retrieval":
self.gallery_dataloader = build_dataloader(
self.config["DataLoader"]["Eval"], "Gallery", self.device,
self.use_dali)
self.query_dataloader = build_dataloader(
self.config["DataLoader"]["Eval"], "Query", self.device,
self.use_dali)
# build loss
if self.mode == "train":
loss_info = self.config["Loss"]["Train"]
self.train_loss_func = build_loss(loss_info)
if self.mode in ["train", "eval"]:
loss_config = self.config.get("Loss", None)
if loss_config is not None:
loss_config = loss_config.get("Eval")
if loss_config is not None:
self.eval_loss_func = build_loss(loss_config)
else:
self.eval_loss_func = None
else:
self.eval_loss_func = None
# build metric
if self.mode == 'train':
metric_config = self.config.get("Metric")
if metric_config is not None:
metric_config = metric_config.get("Train")
if metric_config is not None:
self.train_metric_func = build_metrics(metric_config)
else:
self.train_metric_func = None
else:
self.train_metric_func = None
if self.mode in ["train", "eval"]:
metric_config = self.config.get("Metric")
if self.eval_mode == "classification":
if metric_config is not None:
metric_config = metric_config.get("Eval")
if metric_config is not None:
self.eval_metric_func = build_metrics(metric_config)
elif self.eval_mode == "retrieval":
if metric_config is None:
metric_config = [{"name": "Recallk", "topk": (1, 5)}]
else:
metric_config = metric_config["Eval"]
self.eval_metric_func = build_metrics(metric_config)
else:
self.eval_metric_func = None
# build model
self.model = build_model(self.config["Arch"])
# set @to_static for benchmark, skip this by default.
apply_to_static(self.config, self.model)
# load_pretrain
if self.config["Global"]["pretrained_model"] is not None:
if self.config["Global"]["pretrained_model"].startswith("http"):
load_dygraph_pretrain_from_url(
self.model, self.config["Global"]["pretrained_model"])
else:
load_dygraph_pretrain(
self.model, self.config["Global"]["pretrained_model"])
# for slim
# build optimizer
if self.mode == 'train':
self.optimizer, self.lr_sch = build_optimizer(
self.config["Optimizer"], self.config["Global"]["epochs"],
len(self.train_dataloader), self.model.parameters())
# for distributed
self.config["Global"][
"distributed"] = paddle.distributed.get_world_size() != 1
if self.config["Global"]["distributed"]:
dist.init_parallel_env()
if self.config["Global"]["distributed"]:
self.model = paddle.DataParallel(self.model)
# build postprocess for infer
if self.mode == 'infer':
self.preprocess_func = create_operators(self.config["Infer"][
"transforms"])
self.postprocess_func = build_postprocess(self.config["Infer"][
"PostProcess"])
def train(self):
assert self.mode == "train"
print_batch_step = self.config['Global']['print_batch_step']
save_interval = self.config["Global"]["save_interval"]
best_metric = {
"metric": 0.0,
"epoch": 0,
}
        # key: metric name; val: AverageMeter object that records the metric
self.output_info = dict()
self.time_info = {
"batch_cost": AverageMeter(
"batch_cost", '.5f', postfix=" s,"),
"reader_cost": AverageMeter(
"reader_cost", ".5f", postfix=" s,"),
}
# global iter counter
self.global_step = 0
if self.config["Global"]["checkpoints"] is not None:
metric_info = init_model(self.config["Global"], self.model,
self.optimizer)
if metric_info is not None:
best_metric.update(metric_info)
# for amp training
if self.amp:
self.scaler = paddle.amp.GradScaler(
init_loss_scaling=self.scale_loss,
use_dynamic_loss_scaling=self.use_dynamic_loss_scaling)
self.max_iter = len(self.train_dataloader) - 1 if platform.system(
) == "Windows" else len(self.train_dataloader)
for epoch_id in range(best_metric["epoch"] + 1,
self.config["Global"]["epochs"] + 1):
acc = 0.0
# for one epoch train
self.train_epoch_func(self, epoch_id, print_batch_step)
if self.use_dali:
self.train_dataloader.reset()
metric_msg = ", ".join([
"{}: {:.5f}".format(key, self.output_info[key].avg)
for key in self.output_info
])
logger.info("[Train][Epoch {}/{}][Avg]{}".format(
epoch_id, self.config["Global"]["epochs"], metric_msg))
self.output_info.clear()
# eval model and save model if possible
if self.config["Global"][
"eval_during_train"] and epoch_id % self.config["Global"][
"eval_interval"] == 0:
acc = self.eval(epoch_id)
if acc > best_metric["metric"]:
best_metric["metric"] = acc
best_metric["epoch"] = epoch_id
save_load.save_model(
self.model,
self.optimizer,
best_metric,
self.output_dir,
model_name=self.config["Arch"]["name"],
prefix="best_model")
logger.info("[Eval][Epoch {}][best metric: {}]".format(
epoch_id, best_metric["metric"]))
logger.scaler(
name="eval_acc",
value=acc,
step=epoch_id,
writer=self.vdl_writer)
self.model.train()
# save model
if epoch_id % save_interval == 0:
save_load.save_model(
self.model,
self.optimizer, {"metric": acc,
"epoch": epoch_id},
self.output_dir,
model_name=self.config["Arch"]["name"],
prefix="epoch_{}".format(epoch_id))
# save the latest model
save_load.save_model(
self.model,
self.optimizer, {"metric": acc,
"epoch": epoch_id},
self.output_dir,
model_name=self.config["Arch"]["name"],
prefix="latest")
if self.vdl_writer is not None:
self.vdl_writer.close()
@paddle.no_grad()
def eval(self, epoch_id=0):
assert self.mode in ["train", "eval"]
self.model.eval()
eval_result = self.eval_func(self, epoch_id)
self.model.train()
return eval_result
@paddle.no_grad()
def infer(self):
assert self.mode == "infer" and self.eval_mode == "classification"
total_trainer = paddle.distributed.get_world_size()
local_rank = paddle.distributed.get_rank()
image_list = get_image_list(self.config["Infer"]["infer_imgs"])
# data split
image_list = image_list[local_rank::total_trainer]
batch_size = self.config["Infer"]["batch_size"]
self.model.eval()
batch_data = []
image_file_list = []
for idx, image_file in enumerate(image_list):
with open(image_file, 'rb') as f:
x = f.read()
for process in self.preprocess_func:
x = process(x)
batch_data.append(x)
image_file_list.append(image_file)
if len(batch_data) >= batch_size or idx == len(image_list) - 1:
batch_tensor = paddle.to_tensor(batch_data)
out = self.model(batch_tensor)
if isinstance(out, list):
out = out[0]
result = self.postprocess_func(out, image_file_list)
print(result)
batch_data.clear()
image_file_list.clear()
def export(self):
assert self.mode == "export"
model = ExportModel(self.config["Arch"], self.model)
if self.config["Global"]["pretrained_model"] is not None:
load_dygraph_pretrain(model.base_model,
self.config["Global"]["pretrained_model"])
model.eval()
model = paddle.jit.to_static(
model,
input_spec=[
paddle.static.InputSpec(
shape=[None] + self.config["Global"]["image_shape"],
dtype='float32')
])
paddle.jit.save(
model,
os.path.join(self.config["Global"]["save_inference_dir"],
"inference"))
class ExportModel(nn.Layer):
"""
ExportModel: add softmax onto the model
"""
def __init__(self, config, model):
super().__init__()
self.base_model = model
# we should choose a final model to export
if isinstance(self.base_model, DistillationModel):
self.infer_model_name = config["infer_model_name"]
else:
self.infer_model_name = None
self.infer_output_key = config.get("infer_output_key", None)
if self.infer_output_key == "features" and isinstance(self.base_model,
RecModel):
self.base_model.head = IdentityHead()
if config.get("infer_add_softmax", True):
self.softmax = nn.Softmax(axis=-1)
else:
self.softmax = None
def eval(self):
self.training = False
for layer in self.sublayers():
layer.training = False
layer.eval()
def forward(self, x):
x = self.base_model(x)
if isinstance(x, list):
x = x[0]
if self.infer_model_name is not None:
x = x[self.infer_model_name]
if self.infer_output_key is not None:
x = x[self.infer_output_key]
if self.softmax is not None:
x = self.softmax(x)
return x
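`ExportModel.forward` unwraps the model output step by step: take the first element of a list output, index into a distillation sub-model by name, pick an output key for rec models, and finally apply softmax. A minimal pure-Python sketch of that unwrapping order follows (no Paddle required); `unwrap_output` and the sample dict are illustrative names, not part of PaddleClas:

```python
import math

def unwrap_output(x, infer_model_name=None, infer_output_key=None,
                  add_softmax=False):
    """Mimic the unwrapping order in ExportModel.forward (illustrative only)."""
    if isinstance(x, list):           # multi-output models return a list
        x = x[0]
    if infer_model_name is not None:  # distillation models return a dict of sub-models
        x = x[infer_model_name]
    if infer_output_key is not None:  # rec models return a dict such as {"features": ...}
        x = x[infer_output_key]
    if add_softmax:                   # turn logits into probabilities
        exps = [math.exp(v) for v in x]
        total = sum(exps)
        x = [v / total for v in exps]
    return x

# e.g. a distillation-style output: list -> dict of sub-models -> dict of keys
raw = [{"Student": {"logits": [2.0, 1.0, 0.0]}}]
probs = unwrap_output(raw, infer_model_name="Student",
                      infer_output_key="logits", add_softmax=True)
```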

@@ -0,0 +1,16 @@
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from ppcls.engine.evaluation.classification import classification_eval
from ppcls.engine.evaluation.retrieval import retrieval_eval

@@ -0,0 +1,114 @@
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import time
import platform
import paddle
from ppcls.utils.misc import AverageMeter
from ppcls.utils import logger
def classification_eval(evaler, epoch_id=0):
output_info = dict()
time_info = {
"batch_cost": AverageMeter(
"batch_cost", '.5f', postfix=" s,"),
"reader_cost": AverageMeter(
"reader_cost", ".5f", postfix=" s,"),
}
print_batch_step = evaler.config["Global"]["print_batch_step"]
metric_key = None
tic = time.time()
eval_dataloader = evaler.eval_dataloader if evaler.use_dali else evaler.eval_dataloader(
)
max_iter = len(evaler.eval_dataloader) - 1 if platform.system(
) == "Windows" else len(evaler.eval_dataloader)
for iter_id, batch in enumerate(eval_dataloader):
if iter_id >= max_iter:
break
if iter_id == 5:
for key in time_info:
time_info[key].reset()
if evaler.use_dali:
batch = [
paddle.to_tensor(batch[0]['data']),
paddle.to_tensor(batch[0]['label'])
]
time_info["reader_cost"].update(time.time() - tic)
batch_size = batch[0].shape[0]
batch[0] = paddle.to_tensor(batch[0]).astype("float32")
batch[1] = batch[1].reshape([-1, 1]).astype("int64")
# image input
out = evaler.model(batch[0])
# calc loss
if evaler.eval_loss_func is not None:
loss_dict = evaler.eval_loss_func(out, batch[1])
for key in loss_dict:
if key not in output_info:
output_info[key] = AverageMeter(key, '7.5f')
output_info[key].update(loss_dict[key].numpy()[0], batch_size)
# calc metric
if evaler.eval_metric_func is not None:
metric_dict = evaler.eval_metric_func(out, batch[1])
if paddle.distributed.get_world_size() > 1:
for key in metric_dict:
paddle.distributed.all_reduce(
metric_dict[key], op=paddle.distributed.ReduceOp.SUM)
metric_dict[key] = metric_dict[
key] / paddle.distributed.get_world_size()
for key in metric_dict:
if metric_key is None:
metric_key = key
if key not in output_info:
output_info[key] = AverageMeter(key, '7.5f')
output_info[key].update(metric_dict[key].numpy()[0],
batch_size)
time_info["batch_cost"].update(time.time() - tic)
if iter_id % print_batch_step == 0:
time_msg = "s, ".join([
"{}: {:.5f}".format(key, time_info[key].avg)
for key in time_info
])
ips_msg = "ips: {:.5f} images/sec".format(
batch_size / time_info["batch_cost"].avg)
metric_msg = ", ".join([
"{}: {:.5f}".format(key, output_info[key].val)
for key in output_info
])
logger.info("[Eval][Epoch {}][Iter: {}/{}]{}, {}, {}".format(
epoch_id, iter_id,
len(evaler.eval_dataloader), metric_msg, time_msg, ips_msg))
tic = time.time()
if evaler.use_dali:
evaler.eval_dataloader.reset()
metric_msg = ", ".join([
"{}: {:.5f}".format(key, output_info[key].avg) for key in output_info
])
logger.info("[Eval][Epoch {}][Avg]{}".format(epoch_id, metric_msg))
    # no eval metric configured, so skip best-model tracking and return -1
if evaler.eval_metric_func is None:
return -1
# return 1st metric in the dict
return output_info[metric_key].avg
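The loss and metric values above are accumulated in `AverageMeter` objects, where each `update(val, batch_size)` weights the value by the batch size so the final `avg` is a per-sample average across uneven batches. A minimal sketch of those semantics, using an illustrative stand-in class rather than the real `ppcls.utils.misc.AverageMeter`:

```python
class RunningMeter:
    """Sketch of the AverageMeter behavior assumed by classification_eval."""
    def __init__(self):
        self.val = 0.0   # most recent value
        self.sum = 0.0   # running sum weighted by sample count
        self.count = 0   # total samples seen

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n
        self.count += n

    @property
    def avg(self):
        # per-sample average, robust to a smaller final batch
        return self.sum / self.count if self.count else 0.0

m = RunningMeter()
m.update(0.9, 32)   # batch of 32 with metric 0.9
m.update(0.5, 16)   # smaller final batch with metric 0.5
```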

@@ -0,0 +1,163 @@
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import platform
import paddle
from ppcls.utils import logger
def retrieval_eval(evaler, epoch_id=0):
evaler.model.eval()
# step1. build gallery
gallery_feas, gallery_img_id, gallery_unique_id = cal_feature(
evaler, name='gallery')
query_feas, query_img_id, query_query_id = cal_feature(
evaler, name='query')
# step2. do evaluation
sim_block_size = evaler.config["Global"].get("sim_block_size", 64)
sections = [sim_block_size] * (len(query_feas) // sim_block_size)
if len(query_feas) % sim_block_size:
sections.append(len(query_feas) % sim_block_size)
fea_blocks = paddle.split(query_feas, num_or_sections=sections)
if query_query_id is not None:
query_id_blocks = paddle.split(
query_query_id, num_or_sections=sections)
image_id_blocks = paddle.split(query_img_id, num_or_sections=sections)
metric_key = None
if evaler.eval_loss_func is None:
metric_dict = {metric_key: 0.}
else:
metric_dict = dict()
for block_idx, block_fea in enumerate(fea_blocks):
similarity_matrix = paddle.matmul(
block_fea, gallery_feas, transpose_y=True)
if query_query_id is not None:
query_id_block = query_id_blocks[block_idx]
query_id_mask = (query_id_block != gallery_unique_id.t())
image_id_block = image_id_blocks[block_idx]
image_id_mask = (image_id_block != gallery_img_id.t())
keep_mask = paddle.logical_or(query_id_mask, image_id_mask)
similarity_matrix = similarity_matrix * keep_mask.astype(
"float32")
else:
keep_mask = None
metric_tmp = evaler.eval_metric_func(similarity_matrix,
image_id_blocks[block_idx],
gallery_img_id, keep_mask)
for key in metric_tmp:
if key not in metric_dict:
metric_dict[key] = metric_tmp[key] * block_fea.shape[
0] / len(query_feas)
else:
metric_dict[key] += metric_tmp[key] * block_fea.shape[
0] / len(query_feas)
metric_info_list = []
for key in metric_dict:
if metric_key is None:
metric_key = key
metric_info_list.append("{}: {:.5f}".format(key, metric_dict[key]))
metric_msg = ", ".join(metric_info_list)
logger.info("[Eval][Epoch {}][Avg]{}".format(epoch_id, metric_msg))
return metric_dict[metric_key]
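`retrieval_eval` splits the query features into blocks of `sim_block_size` so the similarity matrix never has to be materialized all at once: full blocks of the configured size plus one remainder block. The section computation can be sketched in pure Python (the helper name is illustrative):

```python
def split_sections(total, block_size):
    """Compute the chunk sizes retrieval_eval feeds to paddle.split:
    full blocks of block_size, plus one remainder block if needed."""
    sections = [block_size] * (total // block_size)
    if total % block_size:
        sections.append(total % block_size)
    return sections

sections = split_sections(130, 64)  # 130 query features, default block size 64
```

Each block's metric is then merged back with weight `block_size / total`, which is why the per-block contributions in the loop above sum to the overall average.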
def cal_feature(evaler, name='gallery'):
all_feas = None
all_image_id = None
all_unique_id = None
has_unique_id = False
if name == 'gallery':
dataloader = evaler.gallery_dataloader
elif name == 'query':
dataloader = evaler.query_dataloader
else:
        raise RuntimeError("Only the gallery or query dataset is supported")
max_iter = len(dataloader) - 1 if platform.system() == "Windows" else len(
dataloader)
dataloader_tmp = dataloader if evaler.use_dali else dataloader()
for idx, batch in enumerate(dataloader_tmp): # load is very time-consuming
if idx >= max_iter:
break
if idx % evaler.config["Global"]["print_batch_step"] == 0:
logger.info(
f"{name} feature calculation process: [{idx}/{len(dataloader)}]"
)
if evaler.use_dali:
batch = [
paddle.to_tensor(batch[0]['data']),
paddle.to_tensor(batch[0]['label'])
]
batch = [paddle.to_tensor(x) for x in batch]
batch[1] = batch[1].reshape([-1, 1]).astype("int64")
if len(batch) == 3:
has_unique_id = True
batch[2] = batch[2].reshape([-1, 1]).astype("int64")
out = evaler.model(batch[0], batch[1])
batch_feas = out["features"]
# do norm
if evaler.config["Global"].get("feature_normalize", True):
feas_norm = paddle.sqrt(
paddle.sum(paddle.square(batch_feas), axis=1, keepdim=True))
batch_feas = paddle.divide(batch_feas, feas_norm)
# do binarize
if evaler.config["Global"].get("feature_binarize") == "round":
batch_feas = paddle.round(batch_feas).astype("float32") * 2.0 - 1.0
if evaler.config["Global"].get("feature_binarize") == "sign":
batch_feas = paddle.sign(batch_feas).astype("float32")
if all_feas is None:
all_feas = batch_feas
if has_unique_id:
all_unique_id = batch[2]
all_image_id = batch[1]
else:
all_feas = paddle.concat([all_feas, batch_feas])
all_image_id = paddle.concat([all_image_id, batch[1]])
if has_unique_id:
all_unique_id = paddle.concat([all_unique_id, batch[2]])
if evaler.use_dali:
dataloader_tmp.reset()
if paddle.distributed.get_world_size() > 1:
feat_list = []
img_id_list = []
unique_id_list = []
paddle.distributed.all_gather(feat_list, all_feas)
paddle.distributed.all_gather(img_id_list, all_image_id)
all_feas = paddle.concat(feat_list, axis=0)
all_image_id = paddle.concat(img_id_list, axis=0)
if has_unique_id:
paddle.distributed.all_gather(unique_id_list, all_unique_id)
all_unique_id = paddle.concat(unique_id_list, axis=0)
logger.info("Build {} done, all feat shape: {}, begin to eval..".format(
name, all_feas.shape))
return all_feas, all_image_id, all_unique_id
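When `Global.feature_normalize` is enabled, `cal_feature` divides each feature row by its L2 norm so that the later `paddle.matmul` produces cosine similarities. A pure-Python sketch of that per-vector normalization (illustrative helper, not the PaddleClas API):

```python
import math

def l2_normalize(vec):
    """Divide a feature vector by its L2 norm, as cal_feature does
    when Global.feature_normalize is on."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]

unit = l2_normalize([3.0, 4.0])  # norm is 5.0
```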

@@ -0,0 +1,14 @@
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from ppcls.engine.train.train import train_epoch

@@ -0,0 +1,85 @@
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import, division, print_function
import time
import paddle
from ppcls.engine.train.utils import update_loss, update_metric, log_info
def train_epoch(trainer, epoch_id, print_batch_step):
tic = time.time()
train_dataloader = trainer.train_dataloader if trainer.use_dali else trainer.train_dataloader(
)
for iter_id, batch in enumerate(train_dataloader):
if iter_id >= trainer.max_iter:
break
if iter_id == 5:
for key in trainer.time_info:
trainer.time_info[key].reset()
trainer.time_info["reader_cost"].update(time.time() - tic)
if trainer.use_dali:
batch = [
paddle.to_tensor(batch[0]['data']),
paddle.to_tensor(batch[0]['label'])
]
batch_size = batch[0].shape[0]
batch[1] = batch[1].reshape([-1, 1]).astype("int64")
trainer.global_step += 1
# image input
if trainer.amp:
with paddle.amp.auto_cast(custom_black_list={
"flatten_contiguous_range", "greater_than"
}):
out = forward(trainer, batch)
loss_dict = trainer.train_loss_func(out, batch[1])
else:
out = forward(trainer, batch)
# calc loss
if trainer.config["DataLoader"]["Train"]["dataset"].get(
"batch_transform_ops", None):
loss_dict = trainer.train_loss_func(out, batch[1:])
else:
loss_dict = trainer.train_loss_func(out, batch[1])
# step opt and lr
if trainer.amp:
scaled = trainer.scaler.scale(loss_dict["loss"])
scaled.backward()
trainer.scaler.minimize(trainer.optimizer, scaled)
else:
loss_dict["loss"].backward()
trainer.optimizer.step()
trainer.optimizer.clear_grad()
trainer.lr_sch.step()
# below code just for logging
# update metric_for_logger
update_metric(trainer, out, batch, batch_size)
# update_loss_for_logger
update_loss(trainer, loss_dict, batch_size)
trainer.time_info["batch_cost"].update(time.time() - tic)
if iter_id % print_batch_step == 0:
log_info(trainer, batch_size, epoch_id, iter_id)
tic = time.time()
def forward(trainer, batch):
if trainer.eval_mode == "classification":
return trainer.model(batch[0])
else:
return trainer.model(batch[0], batch[1])

@@ -0,0 +1,72 @@
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import, division, print_function
import datetime
from ppcls.utils import logger
from ppcls.utils.misc import AverageMeter
def update_metric(trainer, out, batch, batch_size):
# calc metric
if trainer.train_metric_func is not None:
metric_dict = trainer.train_metric_func(out, batch[-1])
for key in metric_dict:
if key not in trainer.output_info:
trainer.output_info[key] = AverageMeter(key, '7.5f')
trainer.output_info[key].update(metric_dict[key].numpy()[0],
batch_size)
def update_loss(trainer, loss_dict, batch_size):
# update_output_info
for key in loss_dict:
if key not in trainer.output_info:
trainer.output_info[key] = AverageMeter(key, '7.5f')
trainer.output_info[key].update(loss_dict[key].numpy()[0], batch_size)
def log_info(trainer, batch_size, epoch_id, iter_id):
lr_msg = "lr: {:.5f}".format(trainer.lr_sch.get_lr())
metric_msg = ", ".join([
"{}: {:.5f}".format(key, trainer.output_info[key].avg)
for key in trainer.output_info
])
time_msg = "s, ".join([
"{}: {:.5f}".format(key, trainer.time_info[key].avg)
for key in trainer.time_info
])
ips_msg = "ips: {:.5f} images/sec".format(
batch_size / trainer.time_info["batch_cost"].avg)
eta_sec = ((trainer.config["Global"]["epochs"] - epoch_id + 1
) * len(trainer.train_dataloader) - iter_id
) * trainer.time_info["batch_cost"].avg
eta_msg = "eta: {:s}".format(str(datetime.timedelta(seconds=int(eta_sec))))
logger.info("[Train][Epoch {}/{}][Iter: {}/{}]{}, {}, {}, {}, {}".format(
epoch_id, trainer.config["Global"]["epochs"], iter_id,
len(trainer.train_dataloader), lr_msg, metric_msg, time_msg, ips_msg,
eta_msg))
logger.scaler(
name="lr",
value=trainer.lr_sch.get_lr(),
step=trainer.global_step,
writer=trainer.vdl_writer)
for key in trainer.output_info:
logger.scaler(
name="train_{}".format(key),
value=trainer.output_info[key].avg,
step=trainer.global_step,
writer=trainer.vdl_writer)
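The ETA printed by `log_info` counts the steps left in the current and all remaining epochs and multiplies by the average batch cost. The formula can be isolated as a small testable function (the name `eta_string` is illustrative):

```python
import datetime

def eta_string(epochs, epoch_id, steps_per_epoch, iter_id, avg_batch_cost):
    """Remaining-time estimate used by log_info: steps left across the
    current and future epochs, times the average cost of one step."""
    steps_left = (epochs - epoch_id + 1) * steps_per_epoch - iter_id
    eta_sec = steps_left * avg_batch_cost
    return str(datetime.timedelta(seconds=int(eta_sec)))

# first iteration of epoch 1 of 10, 100 steps/epoch, 0.5 s per step
eta = eta_string(epochs=10, epoch_id=1, steps_per_epoch=100, iter_id=0,
                 avg_batch_cost=0.5)
```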

@@ -1,662 +0,0 @@
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import sys
import numpy as np
__dir__ = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.abspath(os.path.join(__dir__, '../../')))
import time
import platform
import datetime
import argparse
import paddle
import paddle.nn as nn
import paddle.distributed as dist
from visualdl import LogWriter
from ppcls.utils.check import check_gpu
from ppcls.utils.misc import AverageMeter
from ppcls.utils import logger
from ppcls.utils.logger import init_logger
from ppcls.utils.config import print_config
from ppcls.data import build_dataloader
from ppcls.arch import build_model
from ppcls.arch import apply_to_static
from ppcls.loss import build_loss
from ppcls.metric import build_metrics
from ppcls.optimizer import build_optimizer
from ppcls.utils.save_load import load_dygraph_pretrain, load_dygraph_pretrain_from_url
from ppcls.utils.save_load import init_model
from ppcls.utils import save_load
from ppcls.data.utils.get_image_list import get_image_list
from ppcls.data.postprocess import build_postprocess
from ppcls.data import create_operators
class Trainer(object):
def __init__(self, config, mode="train"):
self.mode = mode
self.config = config
self.output_dir = self.config['Global']['output_dir']
log_file = os.path.join(self.output_dir, self.config["Arch"]["name"],
f"{mode}.log")
init_logger(name='root', log_file=log_file)
print_config(config)
# set device
assert self.config["Global"]["device"] in ["cpu", "gpu", "xpu"]
self.device = paddle.set_device(self.config["Global"]["device"])
# set dist
self.config["Global"][
"distributed"] = paddle.distributed.get_world_size() != 1
if self.config["Global"]["distributed"]:
dist.init_parallel_env()
if "Head" in self.config["Arch"]:
self.is_rec = True
else:
self.is_rec = False
self.model = build_model(self.config["Arch"])
# set @to_static for benchmark, skip this by default.
apply_to_static(self.config, self.model)
if self.config["Global"]["pretrained_model"] is not None:
if self.config["Global"]["pretrained_model"].startswith("http"):
load_dygraph_pretrain_from_url(
self.model, self.config["Global"]["pretrained_model"])
else:
load_dygraph_pretrain(
self.model, self.config["Global"]["pretrained_model"])
if self.config["Global"]["distributed"]:
self.model = paddle.DataParallel(self.model)
self.vdl_writer = None
if self.config['Global']['use_visualdl'] and mode == "train":
vdl_writer_path = os.path.join(self.output_dir, "vdl")
if not os.path.exists(vdl_writer_path):
os.makedirs(vdl_writer_path)
self.vdl_writer = LogWriter(logdir=vdl_writer_path)
logger.info('train with paddle {} and device {}'.format(
paddle.__version__, self.device))
# init members
self.train_dataloader = None
self.eval_dataloader = None
self.gallery_dataloader = None
self.query_dataloader = None
self.eval_mode = self.config["Global"].get("eval_mode",
"classification")
self.amp = True if "AMP" in self.config else False
if self.amp and self.config["AMP"] is not None:
self.scale_loss = self.config["AMP"].get("scale_loss", 1.0)
self.use_dynamic_loss_scaling = self.config["AMP"].get(
"use_dynamic_loss_scaling", False)
else:
self.scale_loss = 1.0
self.use_dynamic_loss_scaling = False
if self.amp:
AMP_RELATED_FLAGS_SETTING = {
'FLAGS_cudnn_batchnorm_spatial_persistent': 1,
'FLAGS_max_inplace_grad_add': 8,
}
paddle.fluid.set_flags(AMP_RELATED_FLAGS_SETTING)
self.train_loss_func = None
self.eval_loss_func = None
self.train_metric_func = None
self.eval_metric_func = None
self.use_dali = self.config['Global'].get("use_dali", False)
def train(self):
# build train loss and metric info
if self.train_loss_func is None:
loss_info = self.config["Loss"]["Train"]
self.train_loss_func = build_loss(loss_info)
if self.train_metric_func is None:
metric_config = self.config.get("Metric")
if metric_config is not None:
metric_config = metric_config.get("Train")
if metric_config is not None:
self.train_metric_func = build_metrics(metric_config)
if self.train_dataloader is None:
self.train_dataloader = build_dataloader(
self.config["DataLoader"], "Train", self.device, self.use_dali)
step_each_epoch = len(self.train_dataloader)
optimizer, lr_sch = build_optimizer(self.config["Optimizer"],
self.config["Global"]["epochs"],
step_each_epoch,
self.model.parameters())
print_batch_step = self.config['Global']['print_batch_step']
save_interval = self.config["Global"]["save_interval"]
best_metric = {
"metric": 0.0,
"epoch": 0,
}
# key:
# val: metrics list word
output_info = dict()
time_info = {
"batch_cost": AverageMeter(
"batch_cost", '.5f', postfix=" s,"),
"reader_cost": AverageMeter(
"reader_cost", ".5f", postfix=" s,"),
}
# global iter counter
global_step = 0
if self.config["Global"]["checkpoints"] is not None:
metric_info = init_model(self.config["Global"], self.model,
optimizer)
if metric_info is not None:
best_metric.update(metric_info)
# for amp training
if self.amp:
scaler = paddle.amp.GradScaler(
init_loss_scaling=self.scale_loss,
use_dynamic_loss_scaling=self.use_dynamic_loss_scaling)
tic = time.time()
max_iter = len(self.train_dataloader) - 1 if platform.system(
) == "Windows" else len(self.train_dataloader)
for epoch_id in range(best_metric["epoch"] + 1,
self.config["Global"]["epochs"] + 1):
acc = 0.0
train_dataloader = self.train_dataloader if self.use_dali else self.train_dataloader(
)
for iter_id, batch in enumerate(train_dataloader):
if iter_id >= max_iter:
break
if iter_id == 5:
for key in time_info:
time_info[key].reset()
time_info["reader_cost"].update(time.time() - tic)
if self.use_dali:
batch = [
paddle.to_tensor(batch[0]['data']),
paddle.to_tensor(batch[0]['label'])
]
batch_size = batch[0].shape[0]
batch[1] = batch[1].reshape([-1, 1]).astype("int64")
global_step += 1
# image input
if self.amp:
with paddle.amp.auto_cast(custom_black_list={
"flatten_contiguous_range", "greater_than"
}):
out = self.forward(batch)
loss_dict = self.train_loss_func(out, batch[1])
else:
out = self.forward(batch)
# calc loss
if self.config["DataLoader"]["Train"]["dataset"].get(
"batch_transform_ops", None):
loss_dict = self.train_loss_func(out, batch[1:])
else:
loss_dict = self.train_loss_func(out, batch[1])
for key in loss_dict:
if not key in output_info:
output_info[key] = AverageMeter(key, '7.5f')
output_info[key].update(loss_dict[key].numpy()[0],
batch_size)
# calc metric
if self.train_metric_func is not None:
metric_dict = self.train_metric_func(out, batch[-1])
for key in metric_dict:
if not key in output_info:
output_info[key] = AverageMeter(key, '7.5f')
output_info[key].update(metric_dict[key].numpy()[0],
batch_size)
# step opt and lr
if self.amp:
scaled = scaler.scale(loss_dict["loss"])
scaled.backward()
scaler.minimize(optimizer, scaled)
else:
loss_dict["loss"].backward()
optimizer.step()
optimizer.clear_grad()
lr_sch.step()
time_info["batch_cost"].update(time.time() - tic)
if iter_id % print_batch_step == 0:
lr_msg = "lr: {:.5f}".format(lr_sch.get_lr())
metric_msg = ", ".join([
"{}: {:.5f}".format(key, output_info[key].avg)
for key in output_info
])
time_msg = "s, ".join([
"{}: {:.5f}".format(key, time_info[key].avg)
for key in time_info
])
ips_msg = "ips: {:.5f} images/sec".format(
batch_size / time_info["batch_cost"].avg)
eta_sec = ((self.config["Global"]["epochs"] - epoch_id + 1
) * len(self.train_dataloader) - iter_id
) * time_info["batch_cost"].avg
eta_msg = "eta: {:s}".format(
str(datetime.timedelta(seconds=int(eta_sec))))
logger.info(
"[Train][Epoch {}/{}][Iter: {}/{}]{}, {}, {}, {}, {}".
format(epoch_id, self.config["Global"][
"epochs"], iter_id,
len(self.train_dataloader), lr_msg, metric_msg,
time_msg, ips_msg, eta_msg))
logger.scaler(
name="lr",
value=lr_sch.get_lr(),
step=global_step,
writer=self.vdl_writer)
for key in output_info:
logger.scaler(
name="train_{}".format(key),
value=output_info[key].avg,
step=global_step,
writer=self.vdl_writer)
tic = time.time()
if self.use_dali:
self.train_dataloader.reset()
metric_msg = ", ".join([
"{}: {:.5f}".format(key, output_info[key].avg)
for key in output_info
])
logger.info("[Train][Epoch {}/{}][Avg]{}".format(
epoch_id, self.config["Global"]["epochs"], metric_msg))
output_info.clear()
# eval model and save model if possible
if self.config["Global"][
"eval_during_train"] and epoch_id % self.config["Global"][
"eval_interval"] == 0:
acc = self.eval(epoch_id)
if acc > best_metric["metric"]:
best_metric["metric"] = acc
best_metric["epoch"] = epoch_id
save_load.save_model(
self.model,
optimizer,
best_metric,
self.output_dir,
model_name=self.config["Arch"]["name"],
prefix="best_model")
logger.info("[Eval][Epoch {}][best metric: {}]".format(
epoch_id, best_metric["metric"]))
logger.scaler(
name="eval_acc",
value=acc,
step=epoch_id,
writer=self.vdl_writer)
self.model.train()
# save model
if epoch_id % save_interval == 0:
save_load.save_model(
self.model,
optimizer, {"metric": acc,
"epoch": epoch_id},
self.output_dir,
model_name=self.config["Arch"]["name"],
prefix="epoch_{}".format(epoch_id))
# save the latest model
save_load.save_model(
self.model,
optimizer, {"metric": acc,
"epoch": epoch_id},
self.output_dir,
model_name=self.config["Arch"]["name"],
prefix="latest")
if self.vdl_writer is not None:
self.vdl_writer.close()
def build_avg_metrics(self, info_dict):
return {key: AverageMeter(key, '7.5f') for key in info_dict}
@paddle.no_grad()
def eval(self, epoch_id=0):
self.model.eval()
if self.eval_loss_func is None:
loss_config = self.config.get("Loss", None)
if loss_config is not None:
loss_config = loss_config.get("Eval")
if loss_config is not None:
self.eval_loss_func = build_loss(loss_config)
if self.eval_mode == "classification":
if self.eval_dataloader is None:
self.eval_dataloader = build_dataloader(
self.config["DataLoader"], "Eval", self.device,
self.use_dali)
if self.eval_metric_func is None:
metric_config = self.config.get("Metric")
if metric_config is not None:
metric_config = metric_config.get("Eval")
if metric_config is not None:
self.eval_metric_func = build_metrics(metric_config)
eval_result = self.eval_cls(epoch_id)
elif self.eval_mode == "retrieval":
if self.gallery_dataloader is None:
self.gallery_dataloader = build_dataloader(
self.config["DataLoader"]["Eval"], "Gallery", self.device,
self.use_dali)
if self.query_dataloader is None:
self.query_dataloader = build_dataloader(
self.config["DataLoader"]["Eval"], "Query", self.device,
self.use_dali)
# build metric info
if self.eval_metric_func is None:
metric_config = self.config.get("Metric", None)
if metric_config is None:
metric_config = [{"name": "Recallk", "topk": (1, 5)}]
else:
metric_config = metric_config["Eval"]
self.eval_metric_func = build_metrics(metric_config)
eval_result = self.eval_retrieval(epoch_id)
else:
logger.warning("Invalid eval mode: {}".format(self.eval_mode))
eval_result = None
self.model.train()
return eval_result
def forward(self, batch):
if not self.is_rec:
out = self.model(batch[0])
else:
out = self.model(batch[0], batch[1])
return out
@paddle.no_grad()
def eval_cls(self, epoch_id=0):
output_info = dict()
time_info = {
"batch_cost": AverageMeter(
"batch_cost", '.5f', postfix=" s,"),
"reader_cost": AverageMeter(
"reader_cost", ".5f", postfix=" s,"),
}
print_batch_step = self.config["Global"]["print_batch_step"]
metric_key = None
tic = time.time()
eval_dataloader = self.eval_dataloader if self.use_dali else self.eval_dataloader(
)
max_iter = len(self.eval_dataloader) - 1 if platform.system(
) == "Windows" else len(self.eval_dataloader)
for iter_id, batch in enumerate(eval_dataloader):
if iter_id >= max_iter:
break
if iter_id == 5:
for key in time_info:
time_info[key].reset()
if self.use_dali:
batch = [
paddle.to_tensor(batch[0]['data']),
paddle.to_tensor(batch[0]['label'])
]
time_info["reader_cost"].update(time.time() - tic)
batch_size = batch[0].shape[0]
batch[0] = paddle.to_tensor(batch[0]).astype("float32")
batch[1] = batch[1].reshape([-1, 1]).astype("int64")
# image input
out = self.forward(batch)
# calc loss
if self.eval_loss_func is not None:
loss_dict = self.eval_loss_func(out, batch[-1])
for key in loss_dict:
if key not in output_info:
output_info[key] = AverageMeter(key, '7.5f')
output_info[key].update(loss_dict[key].numpy()[0],
batch_size)
# calc metric
if self.eval_metric_func is not None:
metric_dict = self.eval_metric_func(out, batch[-1])
if paddle.distributed.get_world_size() > 1:
for key in metric_dict:
paddle.distributed.all_reduce(
metric_dict[key],
op=paddle.distributed.ReduceOp.SUM)
metric_dict[key] = metric_dict[
key] / paddle.distributed.get_world_size()
for key in metric_dict:
if metric_key is None:
metric_key = key
if key not in output_info:
output_info[key] = AverageMeter(key, '7.5f')
output_info[key].update(metric_dict[key].numpy()[0],
batch_size)
time_info["batch_cost"].update(time.time() - tic)
if iter_id % print_batch_step == 0:
time_msg = "s, ".join([
"{}: {:.5f}".format(key, time_info[key].avg)
for key in time_info
])
ips_msg = "ips: {:.5f} images/sec".format(
batch_size / time_info["batch_cost"].avg)
metric_msg = ", ".join([
"{}: {:.5f}".format(key, output_info[key].val)
for key in output_info
])
logger.info("[Eval][Epoch {}][Iter: {}/{}]{}, {}, {}".format(
epoch_id, iter_id,
len(self.eval_dataloader), metric_msg, time_msg, ips_msg))
tic = time.time()
if self.use_dali:
self.eval_dataloader.reset()
metric_msg = ", ".join([
"{}: {:.5f}".format(key, output_info[key].avg)
for key in output_info
])
logger.info("[Eval][Epoch {}][Avg]{}".format(epoch_id, metric_msg))
# do not try to save best model
if self.eval_metric_func is None:
return -1
# return 1st metric in the dict
return output_info[metric_key].avg
def eval_retrieval(self, epoch_id=0):
self.model.eval()
# step1. build gallery
gallery_feas, gallery_img_id, gallery_unique_id = self._cal_feature(
name='gallery')
query_feas, query_img_id, query_query_id = self._cal_feature(
name='query')
# step2. do evaluation
sim_block_size = self.config["Global"].get("sim_block_size", 64)
sections = [sim_block_size] * (len(query_feas) // sim_block_size)
if len(query_feas) % sim_block_size:
sections.append(len(query_feas) % sim_block_size)
fea_blocks = paddle.split(query_feas, num_or_sections=sections)
if query_query_id is not None:
query_id_blocks = paddle.split(
query_query_id, num_or_sections=sections)
image_id_blocks = paddle.split(query_img_id, num_or_sections=sections)
metric_key = None
if self.eval_metric_func is None:
metric_dict = {metric_key: 0.}
else:
metric_dict = dict()
for block_idx, block_fea in enumerate(fea_blocks):
similarity_matrix = paddle.matmul(
block_fea, gallery_feas, transpose_y=True)
if query_query_id is not None:
query_id_block = query_id_blocks[block_idx]
query_id_mask = (query_id_block != gallery_unique_id.t())
image_id_block = image_id_blocks[block_idx]
image_id_mask = (image_id_block != gallery_img_id.t())
keep_mask = paddle.logical_or(query_id_mask, image_id_mask)
similarity_matrix = similarity_matrix * keep_mask.astype(
"float32")
else:
keep_mask = None
metric_tmp = self.eval_metric_func(similarity_matrix,
image_id_blocks[block_idx],
gallery_img_id, keep_mask)
for key in metric_tmp:
if key not in metric_dict:
metric_dict[key] = metric_tmp[key] * block_fea.shape[
0] / len(query_feas)
else:
metric_dict[key] += metric_tmp[key] * block_fea.shape[
0] / len(query_feas)
metric_info_list = []
for key in metric_dict:
if metric_key is None:
metric_key = key
metric_info_list.append("{}: {:.5f}".format(key, metric_dict[key]))
metric_msg = ", ".join(metric_info_list)
logger.info("[Eval][Epoch {}][Avg]{}".format(epoch_id, metric_msg))
return metric_dict[metric_key]
def _cal_feature(self, name='gallery'):
all_feas = None
all_image_id = None
all_unique_id = None
if name == 'gallery':
dataloader = self.gallery_dataloader
elif name == 'query':
dataloader = self.query_dataloader
else:
raise RuntimeError("Only the gallery or query dataset is supported")
has_unique_id = False
max_iter = len(dataloader) - 1 if platform.system(
) == "Windows" else len(dataloader)
dataloader_tmp = dataloader if self.use_dali else dataloader()
for idx, batch in enumerate(
dataloader_tmp): # load is very time-consuming
if idx >= max_iter:
break
if idx % self.config["Global"]["print_batch_step"] == 0:
logger.info(
f"{name} feature calculation process: [{idx}/{len(dataloader)}]"
)
if self.use_dali:
batch = [
paddle.to_tensor(batch[0]['data']),
paddle.to_tensor(batch[0]['label'])
]
batch = [paddle.to_tensor(x) for x in batch]
batch[1] = batch[1].reshape([-1, 1]).astype("int64")
if len(batch) == 3:
has_unique_id = True
batch[2] = batch[2].reshape([-1, 1]).astype("int64")
out = self.forward(batch)
batch_feas = out["features"]
# do norm
if self.config["Global"].get("feature_normalize", True):
feas_norm = paddle.sqrt(
paddle.sum(paddle.square(batch_feas), axis=1,
keepdim=True))
batch_feas = paddle.divide(batch_feas, feas_norm)
if all_feas is None:
all_feas = batch_feas
if has_unique_id:
all_unique_id = batch[2]
all_image_id = batch[1]
else:
all_feas = paddle.concat([all_feas, batch_feas])
all_image_id = paddle.concat([all_image_id, batch[1]])
if has_unique_id:
all_unique_id = paddle.concat([all_unique_id, batch[2]])
if self.use_dali:
dataloader_tmp.reset()
if paddle.distributed.get_world_size() > 1:
feat_list = []
img_id_list = []
unique_id_list = []
paddle.distributed.all_gather(feat_list, all_feas)
paddle.distributed.all_gather(img_id_list, all_image_id)
all_feas = paddle.concat(feat_list, axis=0)
all_image_id = paddle.concat(img_id_list, axis=0)
if has_unique_id:
paddle.distributed.all_gather(unique_id_list, all_unique_id)
all_unique_id = paddle.concat(unique_id_list, axis=0)
logger.info("Build {} done, all feat shape: {}, begin to eval..".
format(name, all_feas.shape))
return all_feas, all_image_id, all_unique_id
@paddle.no_grad()
def infer(self, ):
total_trainer = paddle.distributed.get_world_size()
local_rank = paddle.distributed.get_rank()
image_list = get_image_list(self.config["Infer"]["infer_imgs"])
# data split
image_list = image_list[local_rank::total_trainer]
preprocess_func = create_operators(self.config["Infer"]["transforms"])
postprocess_func = build_postprocess(self.config["Infer"][
"PostProcess"])
batch_size = self.config["Infer"]["batch_size"]
self.model.eval()
batch_data = []
image_file_list = []
for idx, image_file in enumerate(image_list):
with open(image_file, 'rb') as f:
x = f.read()
for process in preprocess_func:
x = process(x)
batch_data.append(x)
image_file_list.append(image_file)
if len(batch_data) >= batch_size or idx == len(image_list) - 1:
batch_tensor = paddle.to_tensor(batch_data)
out = self.forward([batch_tensor])
if isinstance(out, list):
out = out[0]
result = postprocess_func(out, image_file_list)
print(result)
batch_data.clear()
image_file_list.clear()
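The tail of `infer` above accumulates preprocessed images into `batch_data` and flushes whenever the buffer reaches `batch_size`, with one extra flush for the final partial batch. A minimal, framework-free sketch of that accumulate-and-flush pattern (the `run_batch` callback and `run_in_batches` name are illustrative stand-ins for the forward pass plus postprocessing, not part of the source):

```python
def run_in_batches(items, batch_size, run_batch):
    """Accumulate items and flush a full batch, plus one final
    partial batch -- mirroring the loop at the end of infer()."""
    results, batch = [], []
    for idx, item in enumerate(items):
        batch.append(item)
        # Flush on a full buffer, or on the very last item.
        if len(batch) >= batch_size or idx == len(items) - 1:
            results.extend(run_batch(batch))
            batch = []  # equivalent to batch_data.clear()
    return results

# Five items with batch_size=2 are processed as batches of sizes 2, 2, 1.
print(run_in_batches(list(range(5)), 2, lambda b: [x * 10 for x in b]))
# → [0, 10, 20, 30, 40]
```

The `idx == len(image_list) - 1` check in `infer` is what guarantees the trailing partial batch is not silently dropped.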


@@ -0,0 +1,90 @@
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
import paddle
import paddle.nn as nn
class DSHSDLoss(nn.Layer):
"""
# DSHSD(IEEE ACCESS 2019)
# paper [Deep Supervised Hashing Based on Stable Distribution](https://ieeexplore.ieee.org/document/8648432/)
# [DSHSD] epoch:70, bit:48, dataset:cifar10-1, MAP:0.809, Best MAP: 0.809
# [DSHSD] epoch:250, bit:48, dataset:nuswide_21, MAP:0.809, Best MAP: 0.815
# [DSHSD] epoch:135, bit:48, dataset:imagenet, MAP:0.647, Best MAP: 0.647
"""
def __init__(self, n_class, bit, alpha, multi_label=False):
super(DSHSDLoss, self).__init__()
self.m = 2 * bit
self.alpha = alpha
self.multi_label = multi_label
self.n_class = n_class
self.fc = paddle.nn.Linear(bit, n_class, bias_attr=False)
def forward(self, input, label):
feature = input["features"]
feature = feature.tanh().astype("float32")
dist = paddle.sum(
paddle.square((paddle.unsqueeze(feature, 1) - paddle.unsqueeze(feature, 0))),
axis=2)
# label to one-hot
label = paddle.flatten(label)
label = paddle.nn.functional.one_hot(label, self.n_class).astype("float32")
s = (paddle.matmul(label, label, transpose_y=True) == 0).astype("float32")
Ld = (1 - s) / 2 * dist + s / 2 * (self.m - dist).clip(min=0)
Ld = Ld.mean()
logits = self.fc(feature)
if self.multi_label:
# multiple labels classification loss
Lc = (logits - label * logits + ((1 + (-logits).exp()).log())).sum(axis=1).mean()
else:
# single labels classification loss
Lc = (-paddle.nn.functional.softmax(logits).log() * label).sum(axis=1).mean()
return {"dshsdloss": Lc + Ld * self.alpha}
class LCDSHLoss(nn.Layer):
"""
# paper [Locality-Constrained Deep Supervised Hashing for Image Retrieval](https://www.ijcai.org/Proceedings/2017/0499.pdf)
# [LCDSH] epoch:145, bit:48, dataset:cifar10-1, MAP:0.798, Best MAP: 0.798
# [LCDSH] epoch:183, bit:48, dataset:nuswide_21, MAP:0.833, Best MAP: 0.834
"""
def __init__(self, n_class, _lambda):
super(LCDSHLoss, self).__init__()
self._lambda = _lambda
self.n_class = n_class
def forward(self, input, label):
feature = input["features"]
# label to one-hot
label = paddle.flatten(label)
label = paddle.nn.functional.one_hot(label, self.n_class).astype("float32")
s = 2 * (paddle.matmul(label, label, transpose_y=True) > 0).astype("float32") - 1
inner_product = paddle.matmul(feature, feature, transpose_y=True) * 0.5
inner_product = inner_product.clip(min=-50, max=50)
L1 = paddle.log(1 + paddle.exp(-s * inner_product)).mean()
b = feature.sign()
inner_product_ = paddle.matmul(b, b, transpose_y=True) * 0.5
sigmoid = paddle.nn.Sigmoid()
L2 = (sigmoid(inner_product) - sigmoid(inner_product_)).pow(2).mean()
return {"lcdshloss": L1 + self._lambda * L2}
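To make the distance and similarity terms in `DSHSDLoss` concrete, here is a hedged NumPy sketch of the same construction: pairwise squared Euclidean distances between features, an indicator `s` that is 1 for pairs sharing no label, and the margin-based distance loss `Ld`. The function name and toy values are illustrative, not from the source:

```python
import numpy as np

def dshsd_distance_loss(features, labels, n_class, m):
    # Pairwise squared Euclidean distance, matching the
    # unsqueeze / subtract / sum pattern in DSHSDLoss.forward.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.square(diff).sum(axis=2)
    # One-hot labels; s[i, j] = 1 iff samples i and j share no label.
    one_hot = np.eye(n_class)[labels]
    s = (one_hot @ one_hot.T == 0).astype(np.float64)
    # Pull similar pairs together; push dissimilar pairs out to margin m.
    Ld = ((1 - s) / 2 * dist + s / 2 * np.clip(m - dist, 0, None)).mean()
    return dist, s, Ld

feats = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 1.0]])
labels = np.array([0, 0, 1])
dist, s, Ld = dshsd_distance_loss(feats, labels, n_class=2, m=4.0)
```

Note the margin `m = 2 * bit` in the class above: dissimilar pairs only contribute loss while their squared distance is still inside the margin, which is what keeps the hash codes from collapsing.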


@@ -16,7 +16,7 @@ from paddle import nn
import copy
from collections import OrderedDict
-from .metrics import TopkAcc, mAP, mINP, Recallk
+from .metrics import TopkAcc, mAP, mINP, Recallk, Precisionk
from .metrics import DistillationTopkAcc
from .metrics import GoogLeNetTopkAcc


@@ -168,6 +168,47 @@ class Recallk(nn.Layer):
return metric_dict
class Precisionk(nn.Layer):
def __init__(self, topk=(1, 5)):
super().__init__()
assert isinstance(topk, (int, list, tuple))
if isinstance(topk, int):
topk = [topk]
self.topk = topk
def forward(self, similarities_matrix, query_img_id, gallery_img_id,
keep_mask):
metric_dict = dict()
# rank the gallery by similarity for each query
choosen_indices = paddle.argsort(
similarities_matrix, axis=1, descending=True)
gallery_labels_transpose = paddle.transpose(gallery_img_id, [1, 0])
gallery_labels_transpose = paddle.broadcast_to(
gallery_labels_transpose,
shape=[
choosen_indices.shape[0], gallery_labels_transpose.shape[1]
])
choosen_label = paddle.index_sample(gallery_labels_transpose,
choosen_indices)
equal_flag = paddle.equal(choosen_label, query_img_id)
if keep_mask is not None:
keep_mask = paddle.index_sample(
keep_mask.astype('float32'), choosen_indices)
equal_flag = paddle.logical_and(equal_flag,
keep_mask.astype('bool'))
equal_flag = paddle.cast(equal_flag, 'float32')
Ns = paddle.arange(gallery_img_id.shape[0]) + 1
equal_flag_cumsum = paddle.cumsum(equal_flag, axis=1)
Precision_at_k = (paddle.mean(equal_flag_cumsum, axis=0) / Ns).numpy()
for k in self.topk:
metric_dict["precision@{}".format(k)] = Precision_at_k[k - 1]
return metric_dict
class DistillationTopkAcc(TopkAcc):
def __init__(self, model_key, feature_key=None, topk=(1, 5)):
super().__init__(topk=topk)
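The `Precisionk` metric above ranks the gallery for each query and averages the cumulative hit rate at each retrieval depth. A self-contained NumPy sketch of the same computation, without the `keep_mask` filtering and with made-up toy labels:

```python
import numpy as np

def precision_at_k(similarities, query_ids, gallery_ids, topk=(1, 5)):
    # Rank gallery items by descending similarity for every query.
    order = np.argsort(-similarities, axis=1)
    retrieved = gallery_ids[order]                    # [n_query, n_gallery]
    hits = (retrieved == query_ids[:, None]).astype(np.float64)
    # precision@k = (hits within top-k) / k, averaged over queries.
    ranks = np.arange(1, gallery_ids.shape[0] + 1)
    prec = (np.cumsum(hits, axis=1) / ranks).mean(axis=0)
    return {"precision@{}".format(k): prec[k - 1] for k in topk}

sims = np.array([[0.9, 0.1, 0.8],
                 [0.2, 0.7, 0.4]])
result = precision_at_k(sims, np.array([0, 1]), np.array([0, 1, 0]),
                        topk=(1, 2))
# → {'precision@1': 1.0, 'precision@2': 0.75}
```

Averaging the per-query precision is equivalent to the `paddle.mean(equal_flag_cumsum, axis=0) / Ns` formulation in the class, since the rank divisor `Ns` is the same for every query.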


@@ -21,11 +21,11 @@ __dir__ = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.abspath(os.path.join(__dir__, '../')))
from ppcls.utils import config
-from ppcls.engine.trainer import Trainer
+from ppcls.engine.engine import Engine
if __name__ == "__main__":
args = config.parse_args()
config = config.get_config(
args.config, overrides=args.override, show=False)
-trainer = Trainer(config, mode="eval")
-trainer.eval()
+engine = Engine(config, mode="eval")
+engine.eval()


@@ -24,82 +24,11 @@ import paddle
import paddle.nn as nn
from ppcls.utils import config
from ppcls.utils.logger import init_logger
from ppcls.utils.config import print_config
from ppcls.arch import build_model, RecModel, DistillationModel
from ppcls.utils.save_load import load_dygraph_pretrain
from ppcls.arch.gears.identity_head import IdentityHead
class ExportModel(nn.Layer):
"""
ExportModel: add softmax onto the model
"""
def __init__(self, config):
super().__init__()
self.base_model = build_model(config)
# we should choose a final model to export
if isinstance(self.base_model, DistillationModel):
self.infer_model_name = config["infer_model_name"]
else:
self.infer_model_name = None
self.infer_output_key = config.get("infer_output_key", None)
if self.infer_output_key == "features" and isinstance(self.base_model,
RecModel):
self.base_model.head = IdentityHead()
if config.get("infer_add_softmax", True):
self.softmax = nn.Softmax(axis=-1)
else:
self.softmax = None
def eval(self):
self.training = False
for layer in self.sublayers():
layer.training = False
layer.eval()
def forward(self, x):
x = self.base_model(x)
if isinstance(x, list):
x = x[0]
if self.infer_model_name is not None:
x = x[self.infer_model_name]
if self.infer_output_key is not None:
x = x[self.infer_output_key]
if self.softmax is not None:
x = self.softmax(x)
return x
from ppcls.engine.engine import Engine
if __name__ == "__main__":
args = config.parse_args()
config = config.get_config(
args.config, overrides=args.override, show=False)
log_file = os.path.join(config['Global']['output_dir'],
config["Arch"]["name"], "export.log")
init_logger(name='root', log_file=log_file)
print_config(config)
# set device
assert config["Global"]["device"] in ["cpu", "gpu", "xpu"]
device = paddle.set_device(config["Global"]["device"])
model = ExportModel(config["Arch"])
if config["Global"]["pretrained_model"] is not None:
load_dygraph_pretrain(model.base_model,
config["Global"]["pretrained_model"])
model.eval()
model = paddle.jit.to_static(
model,
input_spec=[
paddle.static.InputSpec(
shape=[None] + config["Global"]["image_shape"],
dtype='float32')
])
paddle.jit.save(model,
os.path.join(config["Global"]["save_inference_dir"],
"inference"))
engine = Engine(config, mode="export")
engine.export()


@@ -21,12 +21,11 @@ __dir__ = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.abspath(os.path.join(__dir__, '../')))
from ppcls.utils import config
-from ppcls.engine.trainer import Trainer
+from ppcls.engine.engine import Engine
if __name__ == "__main__":
args = config.parse_args()
config = config.get_config(
args.config, overrides=args.override, show=False)
-trainer = Trainer(config, mode="infer")
-trainer.infer()
+engine = Engine(config, mode="infer")
+engine.infer()


@@ -21,11 +21,11 @@ __dir__ = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.abspath(os.path.join(__dir__, '../')))
from ppcls.utils import config
-from ppcls.engine.trainer import Trainer
+from ppcls.engine.engine import Engine
if __name__ == "__main__":
args = config.parse_args()
config = config.get_config(
args.config, overrides=args.override, show=False)
-trainer = Trainer(config, mode="train")
-trainer.train()
+engine = Engine(config, mode="train")
+engine.train()