From d1f3622c50764522795a738a914cd0ea5bae7e43 Mon Sep 17 00:00:00 2001 From: gaotingquan Date: Thu, 16 Jun 2022 11:46:36 +0000 Subject: [PATCH 1/2] docs: fix --- docs/zh_CN/PULC/PULC_safety_helmet.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/zh_CN/PULC/PULC_safety_helmet.md b/docs/zh_CN/PULC/PULC_safety_helmet.md index 3cdd247ef..0467b61b1 100644 --- a/docs/zh_CN/PULC/PULC_safety_helmet.md +++ b/docs/zh_CN/PULC/PULC_safety_helmet.md @@ -374,7 +374,7 @@ cd ../ ```shell # 使用下面的命令使用 GPU 进行预测 -c +python3.7 python/predict_cls.py -c configs/PULC/safety_helmet/inference_safety_helmet.yaml # 使用下面的命令使用 CPU 进行预测 python3.7 python/predict_cls.py -c configs/PULC/safety_helmet/inference_safety_helmet.yaml -o Global.use_gpu=False ``` From db5c20f63bc1f9ef67e1e71ab1ba81f4f2ec529d Mon Sep 17 00:00:00 2001 From: gaotingquan Date: Thu, 16 Jun 2022 12:03:12 +0000 Subject: [PATCH 2/2] docs: fix --- docs/en/PULC/PULC_car_exists_en.md | 14 +++---- .../PULC/PULC_language_classification_en.md | 14 +++---- docs/en/PULC/PULC_model_list_en.md | 14 +++---- docs/en/PULC/PULC_person_attribute_en.md | 16 ++++---- docs/en/PULC/PULC_person_exists_en.md | 16 ++++---- docs/en/PULC/PULC_safety_helmet_en.md | 16 ++++---- .../en/PULC/PULC_text_image_orientation_en.md | 12 +++--- docs/en/PULC/PULC_textline_orientation_en.md | 16 ++++---- docs/en/PULC/PULC_traffic_sign_en.md | 2 +- docs/en/PULC/PULC_vehicle_attribute_en.md | 37 +++++++++---------- 10 files changed, 78 insertions(+), 79 deletions(-) diff --git a/docs/en/PULC/PULC_car_exists_en.md b/docs/en/PULC/PULC_car_exists_en.md index 3ec2e9d14..33c0932e6 100644 --- a/docs/en/PULC/PULC_car_exists_en.md +++ b/docs/en/PULC/PULC_car_exists_en.md @@ -38,18 +38,18 @@ ## 1. Introduction -This case provides a way for users to quickly build a lightweight, high-precision and practical classification model of car exists using PaddleClas PULC (Practical Ultra Lightweight Classification). The model can be widely used in monitoring scenarios, massive data filtering scenarios, etc. +This case provides a way for users to quickly build a lightweight, high-precision and practical classification model of car exists using PaddleClas PULC (Practical Ultra Lightweight image Classification). The model can be widely used in monitoring scenarios, massive data filtering scenarios, etc. The following table lists the relevant indicators of the model. The first two lines means that using SwinTransformer_tiny and MobileNetV3_small_x0_35 as the backbone to training. The third to sixth lines means that the backbone is replaced by PPLCNet, additional use of EDA strategy and additional use of EDA strategy and SKL-UGI knowledge distillation strategy. 
| Backbone | Tpr(%) | Latency(ms) | Size(M)| Training Strategy |
|-------|----------------|----------|---------------|---------------|
-| SwinTranformer_tiny | 97.71 | 95.30 | 107 | using ImageNet pretrained model |
-| MobileNetV3_small_x0_35 | 81.23 | 2.85 | 1.6 | using ImageNet pretrained model |
-| PPLCNet_x1_0 | 94.72 | 2.12 | 6.5 | using ImageNet pretrained model |
-| PPLCNet_x1_0 | 95.48 | 2.12 | 6.5 | using SSLD pretrained model |
-| PPLCNet_x1_0 | 95.48 | 2.12 | 6.5 | using SSLD pretrained model + EDA strategy |
-| PPLCNet_x1_0 | 95.92 | 2.12 | 6.5 | using SSLD pretrained model + EDA strategy + SKL-UGI knowledge distillation strategy|
+| SwinTranformer_tiny | 97.71 | 95.30 | 111 | using ImageNet pretrained model |
+| MobileNetV3_small_x0_35 | 81.23 | 2.85 | 2.7 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 94.72 | 2.12 | 7.1 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 95.48 | 2.12 | 7.1 | using SSLD pretrained model |
+| PPLCNet_x1_0 | 95.48 | 2.12 | 7.1 | using SSLD pretrained model + EDA strategy |
+| PPLCNet_x1_0 | 95.92 | 2.12 | 7.1 | using SSLD pretrained model + EDA strategy + SKL-UGI knowledge distillation strategy|

It can be seen that a high Tpr can be obtained when the backbone is SwinTranformer_tiny, but the speed is slow. Replacing the backbone with the lightweight model MobileNetV3_small_x0_35 greatly improves the speed, but the Tpr drops sharply. Replacing the backbone with the faster PPLCNet_x1_0 gives a Tpr more than 13 percentage points higher than MobileNetV3_small_x0_35, while the speed is more than 20% faster. Additionally using the SSLD pretrained model improves the Tpr by about 0.7 percentage points without affecting the inference speed. Finally, adding SKL-UGI knowledge distillation further improves the Tpr by 0.44 percentage points. At this point, the Tpr is close to that of SwinTranformer_tiny, but the speed is more than 40 times faster. The training method and deployment instructions of PULC are introduced in detail below.

diff --git a/docs/en/PULC/PULC_language_classification_en.md b/docs/en/PULC/PULC_language_classification_en.md
index 48362b316..c7cd5f5db 100644
--- a/docs/en/PULC/PULC_language_classification_en.md
+++ b/docs/en/PULC/PULC_language_classification_en.md
@@ -38,18 +38,18 @@

## 1. Introduction

-This case provides a way for users to quickly build a lightweight, high-precision and practical classification model of language in the image using PaddleClas PULC (Practical Ultra Lightweight Classification). The model can be widely used in various scenarios involving multilingual OCR processing, such as finance and government affairs.
+This case provides a way for users to quickly build a lightweight, high-precision and practical classification model of language in the image using PaddleClas PULC (Practical Ultra Lightweight image Classification). The model can be widely used in various scenarios involving multilingual OCR processing, such as finance and government affairs.

The following table lists the relevant indicators of the model. The first two lines show the results of using SwinTransformer_tiny and MobileNetV3_small_x0_35 as the backbone for training. The third to sixth lines show the results of replacing the backbone with PPLCNet, additionally using the EDA strategy, and additionally using both the EDA strategy and the SKL-UGI knowledge distillation strategy.
When the backbone is replaced with PPLCNet_x1_0, the input shape of the model is changed to [192, 48], and the stride of the network is changed to [2, [2, 1], [2, 1], [2, 1]] (a rough sketch of the effect of this stride schedule is given after the analysis below).

| Backbone | Top1-Acc(%) | Latency(ms) | Size(M)| Training Strategy |
| ----------------------- | --------- | -------- | ------- | ---------------------------------------------- |
-| SwinTranformer_tiny | 98.12 | 89.09 | 107 | using ImageNet pretrained model |
-| MobileNetV3_small_x0_35 | 95.92 | 2.98 | 17 | using ImageNet pretrained model |
-| PPLCNet_x1_0 | 98.35 | 2.58 | 6.5 | using ImageNet pretrained model |
-| PPLCNet_x1_0 | 98.7 | 2.58 | 6.5 | using SSLD pretrained model |
-| PPLCNet_x1_0 | 99.12 | 2.58 | 6.5 | using SSLD pretrained model + EDA strategy |
-| **PPLCNet_x1_0** | **99.26** | **2.58** | **6.5** | using SSLD pretrained model + EDA strategy + SKL-UGI knowledge distillation strategy|
+| SwinTranformer_tiny | 98.12 | 89.09 | 111 | using ImageNet pretrained model |
+| MobileNetV3_small_x0_35 | 95.92 | 2.98 | 3.7 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 98.35 | 2.58 | 7.1 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 98.7 | 2.58 | 7.1 | using SSLD pretrained model |
+| PPLCNet_x1_0 | 99.12 | 2.58 | 7.1 | using SSLD pretrained model + EDA strategy |
+| **PPLCNet_x1_0** | **99.26** | **2.58** | **7.1** | using SSLD pretrained model + EDA strategy + SKL-UGI knowledge distillation strategy|

It can be seen that high accuracy can be obtained when the backbone is SwinTranformer_tiny, but the speed is slow. Replacing the backbone with the lightweight model MobileNetV3_small_x0_35 greatly improves the speed, but the accuracy drops sharply. Replacing the backbone with the faster PPLCNet_x1_0 and changing the input shape and stride of the network raises the accuracy 2.43 percentage points above MobileNetV3_small_x0_35, while the speed is more than 20% faster. Additionally using the SSLD pretrained model improves the accuracy by about 0.35 percentage points without affecting the inference speed. Further, additionally using the EDA strategy increases the accuracy by 0.42 percentage points. Finally, adding SKL-UGI knowledge distillation further improves the accuracy by 0.14 percentage points. At this point, the accuracy is higher than that of SwinTranformer_tiny, and the speed is much faster. The training method and deployment instructions of PULC are introduced in detail below.
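To make the stride schedule above concrete, the short sketch below applies it to the [192, 48] input. This is an illustration only: the assumption that the input is treated as a 48 x 192 (height x width) image and that each stride pair acts as (stride_h, stride_w) is inferred from the description above, not taken from the PaddleClas implementation.

```python
# Illustrative sketch only (not PaddleClas code): apply the stride schedule
# quoted above to a 48 x 192 (h x w) text image. Assuming each pair acts as
# (stride_h, stride_w), the height is halved at every stage while the width is
# halved only once, which keeps horizontal resolution along the text line.
def downsampled_shape(h, w, stride_list):
    for stride in stride_list:
        sh, sw = (stride, stride) if isinstance(stride, int) else stride
        h, w = h // sh, w // sw
    return h, w

print(downsampled_shape(48, 192, [2, [2, 1], [2, 1], [2, 1]]))  # -> (3, 96)
```

Under these assumptions the wide axis keeps 96 positions while the short axis collapses to 3, matching the intuition that a text/language classifier benefits from preserving detail along the reading direction.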
diff --git a/docs/en/PULC/PULC_model_list_en.md b/docs/en/PULC/PULC_model_list_en.md index 6b287b523..a7de0ce2c 100644 --- a/docs/en/PULC/PULC_model_list_en.md +++ b/docs/en/PULC/PULC_model_list_en.md @@ -7,15 +7,15 @@ The PULC model zoo is provided here, mainly providing indicators, model storage |Model name| Model Description | Metrics |Storage Size| Latency| Download Address| | --- | --- | --- | --- | --- | --- | -| person_exists |[Human Exists Classification](PULC_person_exists_en.md)| 95.60 |6.5M|2.58ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/person_exists_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/person_exists_pretrained.pdparams)| -| person_attribute |[Pedestrian Attribute Classification](PULC_person_attribute_en.md)| 78.59 |6.6M|2.01ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/person_attribute_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/person_attribute_pretrained.pdparams)| -| safety_helmet |[Classification of Wheather Wearing Safety Helmet](PULC_safety_helmet_en.md)| 99.38 |6.5M|2.03ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/safety_helmet_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/safety_helmet_pretrained.pdparams)| +| person_exists |[Human Exists Classification](PULC_person_exists_en.md)| 96.23 |7.0M|2.58ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/person_exists_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/person_exists_pretrained.pdparams)| +| person_attribute |[Pedestrian Attribute Classification](PULC_person_attribute_en.md)| 78.59 |7.2M|2.01ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/person_attribute_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/person_attribute_pretrained.pdparams)| +| safety_helmet |[Classification of Wheather Wearing Safety Helmet](PULC_safety_helmet_en.md)| 99.38 |7.1M|2.03ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/safety_helmet_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/safety_helmet_pretrained.pdparams)| | traffic_sign |[Traffic Sign Classification](PULC_traffic_sign_en.md)| 98.35 |8.2M|2.10ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/traffic_sign_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/traffic_sign_pretrained.pdparams)| | vehicle_attribute |[Vehicle Attribute Classification](PULC_vehicle_attribute_en.md)| 90.81 |7.2M|2.36ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/vehicle_attribute_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/vehicle_attribute_pretrained.pdparams)| -| car_exists |[Car Exists Classification](PULC_car_exists_en.md) | 95.92 | 6.6M | 2.38ms |[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/car_exists_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/car_exists_pretrained.pdparams)| -| text_image_orientation |[Text Image Orientation Classification](PULC_text_image_orientation_en.md)| 99.06 | 6.5M | 2.16ms |[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/text_image_orientation_infer.tar) / [pretrained 
model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/text_image_orientation_pretrained.pdparams)| -| textline_orientation |[Text-line Orientation Classification](PULC_textline_orientation_en.md)| 96.01 |6.5M|2.72ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/textline_orientation_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/textline_orientation_pretrained.pdparams)| -| language_classification |[Language Classification](PULC_language_classification_en.md)| 99.26 |6.5M|2.58ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/language_classification_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/language_classification_pretrained.pdparams)| +| car_exists |[Car Exists Classification](PULC_car_exists_en.md) | 95.92 | 7.1M | 2.38ms |[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/car_exists_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/car_exists_pretrained.pdparams)| +| text_image_orientation |[Text Image Orientation Classification](PULC_text_image_orientation_en.md)| 99.06 | 7.1M | 2.16ms |[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/text_image_orientation_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/text_image_orientation_pretrained.pdparams)| +| textline_orientation |[Text-line Orientation Classification](PULC_textline_orientation_en.md)| 96.01 |7.0M|2.72ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/textline_orientation_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/textline_orientation_pretrained.pdparams)| +| language_classification |[Language Classification](PULC_language_classification_en.md)| 99.26 |7.1M|2.58ms|[inference model](https://paddleclas.bj.bcebos.com/models/PULC/inference/language_classification_infer.tar) / [pretrained model](https://paddleclas.bj.bcebos.com/models/PULC/pretrained/language_classification_pretrained.pdparams)| **Note:** diff --git a/docs/en/PULC/PULC_person_attribute_en.md b/docs/en/PULC/PULC_person_attribute_en.md index d79da893b..173313aad 100644 --- a/docs/en/PULC/PULC_person_attribute_en.md +++ b/docs/en/PULC/PULC_person_attribute_en.md @@ -38,21 +38,21 @@ ## 1. Introduction -This case provides a way for users to quickly build a lightweight, high-precision and practical classification model of person attribute using PaddleClas PULC (Practical Ultra Lightweight image Classification). The model can be widely used in +This case provides a way for users to quickly build a lightweight, high-precision and practical classification model of person attribute using PaddleClas PULC (Practical Ultra Lightweight image Classification). The model can be widely used in Pedestrian analysis scenarios, pedestrian tracking scenarios, etc. The following table lists the relevant indicators of the model. The first three lines means that using Res2Net200_vd_26w_4s, SwinTransformer_tiny and MobileNetV3_small_x0_35 as the backbone to training. The fourth to seventh lines means that the backbone is replaced by PPLCNet, additional use of EDA strategy and additional use of EDA strategy and SKL-UGI knowledge distillation strategy. 
- - + + | Backbone | ma(%) | Latency(ms) | Size(M) | Training Strategy | |-------|-----------|----------|---------------|---------------| | Res2Net200_vd_26w_4s | 81.25 | 77.51 | 293 | using ImageNet pretrained | -| SwinTransformer_tiny | 80.17 | 89.51 | 107 | using ImageNet pretrained | +| SwinTransformer_tiny | 80.17 | 89.51 | 111 | using ImageNet pretrained | | MobileNetV3_small_x0_35 | 70.79 | 2.90 | 1.7 | using ImageNet pretrained | -| PPLCNet_x1_0 | 76.31 | 2.01 | 6.6 | using ImageNet pretrained | -| PPLCNet_x1_0 | 77.31 | 2.01 | 6.6 | using SSLD pretrained | -| PPLCNet_x1_0 | 77.71 | 2.01 | 6.6 | using SSLD pretrained + EDA strategy| -| PPLCNet_x1_0 | 78.59 | 2.01 | 6.6 | using SSLD pretrained + EDA strategy + SKL-UGI knowledge distillation strategy| +| PPLCNet_x1_0 | 76.31 | 2.01 | 7.1 | using ImageNet pretrained | +| PPLCNet_x1_0 | 77.31 | 2.01 | 7.1 | using SSLD pretrained | +| PPLCNet_x1_0 | 77.71 | 2.01 | 7.1 | using SSLD pretrained + EDA strategy| +| PPLCNet_x1_0 | 78.59 | 2.01 | 7.1 | using SSLD pretrained + EDA strategy + SKL-UGI knowledge distillation strategy| It can be seen that high ma metric can be getted when backbone are Res2Net200_vd_26w_4s and SwinTranformer_tiny, but the speed is slow. Replacing backbone with the lightweight model MobileNetV3_small_x0_35, the speed can be greatly improved, but the ma metric will be greatly reduced. Replacing backbone with faster backbone PPLCNet_x1_0, the ma metric is higher more 5.5 percentage points higher than MobileNetv3_small_x0_35. At the same time, the speed can be more than 20% faster. After additional using the SSLD pretrained model, the ma metric can be improved by about 1 percentage points without affecting the inference speed. Further, additional using the EDA strategy, the ma metric can be increased by 0.4 percentage points. Finally, after additional using the SKL-UGI knowledge distillation, the ma matric can be further improved by 0.88 percentage points. At this time, the ma metric of PPLCNet_x1_0 is only 1.58% different from SwinTransformer_tiny, but the speed is more than 44 times faster. The training method and deployment instructions of PULC will be introduced in detail below. diff --git a/docs/en/PULC/PULC_person_exists_en.md b/docs/en/PULC/PULC_person_exists_en.md index 21829e554..baf5ce3e4 100644 --- a/docs/en/PULC/PULC_person_exists_en.md +++ b/docs/en/PULC/PULC_person_exists_en.md @@ -38,20 +38,20 @@ ## 1. Introduction -This case provides a way for users to quickly build a lightweight, high-precision and practical classification model of human exists using PaddleClas PULC (Practical Ultra Lightweight Classification). The model can be widely used in monitoring scenarios, personnel access control scenarios, massive data filtering scenarios, etc. +This case provides a way for users to quickly build a lightweight, high-precision and practical classification model of human exists using PaddleClas PULC (Practical Ultra Lightweight image Classification). The model can be widely used in monitoring scenarios, personnel access control scenarios, massive data filtering scenarios, etc. The following table lists the relevant indicators of the model. The first two lines means that using SwinTransformer_tiny and MobileNetV3_small_x0_35 as the backbone to training. The third to sixth lines means that the backbone is replaced by PPLCNet, additional use of EDA strategy and additional use of EDA strategy and SKL-UGI knowledge distillation strategy. 
| Backbone | Tpr(%) | Latency(ms) | Size(M)| Training Strategy |
|-------|-----------|----------|---------------|---------------|
-| SwinTranformer_tiny | 95.69 | 95.30 | 107 | using ImageNet pretrained model |
-| MobileNetV3_small_x0_35 | 68.25 | 2.85 | 1.6 | using ImageNet pretrained model |
-| PPLCNet_x1_0 | 89.57 | 2.12 | 6.5 | using ImageNet pretrained model |
-| PPLCNet_x1_0 | 92.10 | 2.12 | 6.5 | using SSLD pretrained model |
-| PPLCNet_x1_0 | 93.43 | 2.12 | 6.5 | using SSLD pretrained model + EDA strategy |
-| PPLCNet_x1_0 | 95.60 | 2.12 | 6.5 | using SSLD pretrained model + EDA strategy + SKL-UGI knowledge distillation strategy|
+| SwinTranformer_tiny | 95.69 | 95.30 | 111 | using ImageNet pretrained model |
+| MobileNetV3_small_x0_35 | 68.25 | 2.85 | 2.6 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 89.57 | 2.12 | 7.0 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 92.10 | 2.12 | 7.0 | using SSLD pretrained model |
+| PPLCNet_x1_0 | 93.43 | 2.12 | 7.0 | using SSLD pretrained model + EDA strategy |
+| PPLCNet_x1_0 | 96.23 | 2.12 | 7.0 | using SSLD pretrained model + EDA strategy + SKL-UGI knowledge distillation strategy|

-It can be seen that high Tpr can be getted when backbone is SwinTranformer_tiny, but the speed is slow. Replacing backbone with the lightweight model MobileNetV3_small_x0_35, the speed can be greatly improved, but the Tpr will be greatly reduced. Replacing backbone with faster backbone PPLCNet_x1_0, the Tpr is higher more 20 percentage points than MobileNetv3_small_x0_35. At the same time, the speed can be more than 20% faster. After additional using the SSLD pretrained model, the Tpr can be improved by about 2.6 percentage points without affecting the inference speed. Further, additional using the EDA strategy, the Tpr can be increased by 1.3 percentage points. Finally, after additional using the SKL-UGI knowledge distillation, the Tpr can be further improved by 2.2 percentage points. At this point, the Tpr is close to that of SwinTranformer_tiny, but the speed is more than 40 times faster. The training method and deployment instructions of PULC will be introduced in detail below.
+It can be seen that a high Tpr can be obtained when the backbone is SwinTranformer_tiny, but the speed is slow. Replacing the backbone with the lightweight model MobileNetV3_small_x0_35 greatly improves the speed, but the Tpr drops sharply. Replacing the backbone with the faster PPLCNet_x1_0 gives a Tpr more than 20 percentage points higher than MobileNetV3_small_x0_35, while the speed is more than 20% faster. Additionally using the SSLD pretrained model improves the Tpr by about 2.6 percentage points without affecting the inference speed. Further, additionally using the EDA strategy increases the Tpr by 1.3 percentage points. Finally, adding SKL-UGI knowledge distillation further improves the Tpr by 2.8 percentage points. At this point, the Tpr is close to that of SwinTranformer_tiny, but the speed is more than 40 times faster. The training method and deployment instructions of PULC are introduced in detail below.

**Note**:

diff --git a/docs/en/PULC/PULC_safety_helmet_en.md b/docs/en/PULC/PULC_safety_helmet_en.md
index 91f8b76f6..d2e5cb329 100644
--- a/docs/en/PULC/PULC_safety_helmet_en.md
+++ b/docs/en/PULC/PULC_safety_helmet_en.md
@@ -38,19 +38,19 @@

## 1. Introduction

-This case provides a way for users to quickly build a lightweight, high-precision and practical classification model of wheather wearing safety helmet using PaddleClas PULC (Practical Ultra Lightweight Classification). The model can be widely used in construction scenes, factory workshop scenes, traffic scenes and so on.
+This case provides a way for users to quickly build a lightweight, high-precision and practical classification model of whether a safety helmet is worn, using PaddleClas PULC (Practical Ultra Lightweight image Classification). The model can be widely used in construction scenes, factory workshop scenes, traffic scenes and so on.

-The following table lists the relevant indicators of the model. The first two lines means that using SwinTransformer_tiny and MobileNetV3_small_x0_35 as the backbone to training. The third to seventh lines means that the backbone is replaced by PPLCNet, additional use of EDA strategy and additional use of EDA strategy and SKL-UGI knowledge distillation strategy.
+The following table lists the relevant indicators of the model. The first three lines show the results of using SwinTransformer_tiny, Res2Net200_vd_26w_4s and MobileNetV3_small_x0_35 as the backbone for training. The fourth to seventh lines show the results of replacing the backbone with PPLCNet, additionally using the EDA strategy, and additionally using both the EDA strategy and the SKL-UGI knowledge distillation strategy.

| Backbone | Tpr(%) | Latency(ms) | Size(M)| Training Strategy |
|-------|-----------|----------|---------------|---------------|
-| SwinTranformer_tiny | 93.57 | 91.32 | 107 | using ImageNet pretrained model |
+| SwinTranformer_tiny | 93.57 | 91.32 | 111 | using ImageNet pretrained model |
| Res2Net200_vd_26w_4s | 98.92 | 80.99 | 284 | using ImageNet pretrained model |
-| MobileNetV3_small_x0_35 | 84.83 | 2.85 | 1.6 | using ImageNet pretrained model |
-| PPLCNet_x1_0 | 93.27 | 2.03 | 6.5 | using ImageNet pretrained model |
-| PPLCNet_x1_0 | 98.16 | 2.03 | 6.5 | using SSLD pretrained model |
-| PPLCNet_x1_0 | 99.30 | 2.03 | 6.5 | using SSLD pretrained model + EDA strategy |
-| PPLCNet_x1_0 | 99.38 | 2.03 | 6.5 | using SSLD pretrained model + EDA strategy + SKL-UGI knowledge distillation strategy|
+| MobileNetV3_small_x0_35 | 84.83 | 2.85 | 2.6 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 93.27 | 2.03 | 7.1 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 98.16 | 2.03 | 7.1 | using SSLD pretrained model |
+| PPLCNet_x1_0 | 99.30 | 2.03 | 7.1 | using SSLD pretrained model + EDA strategy |
+| PPLCNet_x1_0 | 99.38 | 2.03 | 7.1 | using SSLD pretrained model + EDA strategy + SKL-UGI knowledge distillation strategy|

It can be seen that a high Tpr can be obtained when the backbone is Res2Net200_vd_26w_4s, but the speed is slow. Replacing the backbone with the lightweight model MobileNetV3_small_x0_35 greatly improves the speed, but the Tpr drops sharply. Replacing the backbone with the faster PPLCNet_x1_0 gives a Tpr more than 8.5 percentage points higher than MobileNetV3_small_x0_35, while the speed is more than 20% faster. Additionally using the SSLD pretrained model improves the Tpr by about 4.9 percentage points without affecting the inference speed. Further, additionally using the EDA strategy increases the Tpr by 1.1 percentage points. Finally, adding UDML knowledge distillation further improves the Tpr by 2.2 percentage points.
At this point, the Tpr is higher than that of Res2Net200_vd_26w_4s, but the speed is more than 70 times faster. The training method and deployment instructions of PULC will be introduced in detail below. diff --git a/docs/en/PULC/PULC_text_image_orientation_en.md b/docs/en/PULC/PULC_text_image_orientation_en.md index 2cddf79fa..1d3cc41f9 100644 --- a/docs/en/PULC/PULC_text_image_orientation_en.md +++ b/docs/en/PULC/PULC_text_image_orientation_en.md @@ -36,17 +36,17 @@ ## 1. Introduction -In the process of document scanning, license shooting and so on, sometimes in order to shoot more clearly, the camera device will be rotated, resulting in photo in different directions. At this time, the standard OCR process cannot cope with these issues well. Using the text image orientation classification technology, the direction of the text image can be predicted and adjusted, so as to improve the accuracy of OCR processing. This case provides a way for users to use PaddleClas PULC (Practical Ultra Lightweight Classification) to quickly build a lightweight, high-precision, practical classification model of text image orientation. This model can be widely used in OCR processing scenarios of rotating pictures in financial, government and other industries. +In the process of document scanning, license shooting and so on, sometimes in order to shoot more clearly, the camera device will be rotated, resulting in photo in different directions. At this time, the standard OCR process cannot cope with these issues well. Using the text image orientation classification technology, the direction of the text image can be predicted and adjusted, so as to improve the accuracy of OCR processing. This case provides a way for users to use PaddleClas PULC (Practical Ultra Lightweight image Classification) to quickly build a lightweight, high-precision, practical classification model of text image orientation. This model can be widely used in OCR processing scenarios of rotating pictures in financial, government and other industries. The following table lists the relevant indicators of the model. The first two lines means that using SwinTransformer_tiny and MobileNetV3_small_x0_35 as the backbone to training. The third to fifth lines means that the backbone is replaced by PPLCNet, additional use of SSLD pretrained model and additional use of hyperparameters searching strategy. | Backbone | Top1-Acc(%) | Latency(ms) | Size(M)| Training Strategy | | ----------------------- | --------- | ---------- | --------- | ------------------------------------- | -| SwinTranformer_tiny | 99.12 | 89.65 | 107 | using ImageNet pretrained model | -| MobileNetV3_small_x0_35 | 83.61 | 2.95 | 17 | using ImageNet pretrained model | -| PPLCNet_x1_0 | 97.85 | 2.16 | 6.5 | using ImageNet pretrained model | -| PPLCNet_x1_0 | 98.02 | 2.16 | 6.5 | using SSLD pretrained model | -| **PPLCNet_x1_0** | **99.06** | **2.16** | **6.5** | using SSLD pretrained model + hyperparameters searching strategy | +| SwinTranformer_tiny | 99.12 | 89.65 | 111 | using ImageNet pretrained model | +| MobileNetV3_small_x0_35 | 83.61 | 2.95 | 2.6 | using ImageNet pretrained model | +| PPLCNet_x1_0 | 97.85 | 2.16 | 7.1 | using ImageNet pretrained model | +| PPLCNet_x1_0 | 98.02 | 2.16 | 7.1 | using SSLD pretrained model | +| **PPLCNet_x1_0** | **99.06** | **2.16** | **7.1** | using SSLD pretrained model + hyperparameters searching strategy | It can be seen that high accuracy can be getted when backbone is SwinTranformer_tiny, but the speed is slow. 
Replacing the backbone with the lightweight model MobileNetV3_small_x0_35 greatly improves the speed, but the accuracy drops sharply. Replacing the backbone with the faster PPLCNet_x1_0 raises the accuracy more than 14 percentage points above MobileNetV3_small_x0_35, while the speed is also faster. Additionally using the SSLD pretrained model improves the accuracy by about 0.17 percentage points without affecting the inference speed. Finally, additionally using the hyperparameter searching strategy further improves the accuracy by 1.04 percentage points. At this point, the accuracy is close to that of SwinTranformer_tiny, but the speed is much faster. The training method and deployment instructions of PULC are introduced in detail below.

diff --git a/docs/en/PULC/PULC_textline_orientation_en.md b/docs/en/PULC/PULC_textline_orientation_en.md
index 71ea3407b..d11307d0b 100644
--- a/docs/en/PULC/PULC_textline_orientation_en.md
+++ b/docs/en/PULC/PULC_textline_orientation_en.md
@@ -38,19 +38,19 @@

## 1. Introduction

-This case provides a way for users to quickly build a lightweight, high-precision and practical classification model of textline orientation using PaddleClas PULC (Practical Ultra Lightweight Classification). The model can be widely used in character correction, character recognition, etc.
+This case provides a way for users to quickly build a lightweight, high-precision and practical classification model of textline orientation using PaddleClas PULC (Practical Ultra Lightweight image Classification). The model can be widely used in character correction, character recognition, etc.

The following table lists the relevant indicators of the model. The first two lines show the results of using SwinTransformer_tiny and MobileNetV3_small_x0_35 as the backbone for training. The third to seventh lines show the results of replacing the backbone with PPLCNet, additionally using the EDA strategy, and additionally using both the EDA strategy and the SKL-UGI knowledge distillation strategy.

| Backbone | Top-1 Acc(%) | Latency(ms) | Size(M)| Training Strategy |
|-------|-----------|----------|---------------|---------------|
-| SwinTranformer_tiny | 93.61 | 89.64 | 107 | using ImageNet pretrained model |
-| MobileNetV3_small_x0_35 | 81.40 | 2.96 | 17 | using ImageNet pretrained model |
-| PPLCNet_x1_0 | 89.99 | 2.11 | 6.5 | using ImageNet pretrained model |
-| PPLCNet_x1_0* | 94.06 | 2.68 | 6.5 | using ImageNet pretrained model |
-| PPLCNet_x1_0* | 94.11 | 2.68 | 6.5 | using SSLD pretrained model |
-| PPLCNet_x1_0** | 96.01 | 2.72 | 6.5 | using SSLD pretrained model + EDA strategy |
-| PPLCNet_x1_0** | 95.86 | 2.72 | 6.5 | using SSLD pretrained model + EDA strategy + SKL-UGI knowledge distillation strategy|
+| SwinTranformer_tiny | 93.61 | 89.64 | 111 | using ImageNet pretrained model |
+| MobileNetV3_small_x0_35 | 81.40 | 2.96 | 2.6 | using ImageNet pretrained model |
+| PPLCNet_x1_0 | 89.99 | 2.11 | 7.0 | using ImageNet pretrained model |
+| PPLCNet_x1_0* | 94.06 | 2.68 | 7.0 | using ImageNet pretrained model |
+| PPLCNet_x1_0* | 94.11 | 2.68 | 7.0 | using SSLD pretrained model |
+| PPLCNet_x1_0** | 96.01 | 2.72 | 7.0 | using SSLD pretrained model + EDA strategy |
+| PPLCNet_x1_0** | 95.86 | 2.72 | 7.0 | using SSLD pretrained model + EDA strategy + SKL-UGI knowledge distillation strategy|

It can be seen that high accuracy can be obtained when the backbone is SwinTranformer_tiny, but the speed is slow.
Replacing the backbone with the lightweight model MobileNetV3_small_x0_35 greatly improves the speed, but the accuracy drops sharply. Replacing the backbone with the faster PPLCNet_x1_0 raises the accuracy more than 8.6 percentage points above MobileNetV3_small_x0_35, while the speed is more than 10% faster. On this basis, changing the resolution and stride (refer to [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)) makes the speed 27% slower, but improves the accuracy by 4.5 percentage points. Additionally using the SSLD pretrained model improves the accuracy by about 0.05 percentage points without affecting the inference speed. Finally, additionally using the EDA strategy increases the accuracy by 1.9 percentage points. The training method and deployment instructions of PULC are introduced in detail below.

diff --git a/docs/en/PULC/PULC_traffic_sign_en.md b/docs/en/PULC/PULC_traffic_sign_en.md
index 21abd4543..baa0faf48 100644
--- a/docs/en/PULC/PULC_traffic_sign_en.md
+++ b/docs/en/PULC/PULC_traffic_sign_en.md
@@ -38,7 +38,7 @@

## 1. Introduction

-This case provides a way for users to quickly build a lightweight, high-precision and practical classification model of traffic sign using PaddleClas PULC (Practical Ultra Lightweight Classification). The model can be widely used in automatic driving, road monitoring, etc.
+This case provides a way for users to quickly build a lightweight, high-precision and practical classification model of traffic sign using PaddleClas PULC (Practical Ultra Lightweight image Classification). The model can be widely used in automatic driving, road monitoring, etc.

The following table lists the relevant indicators of the model. The first two lines show the results of using SwinTransformer_tiny and MobileNetV3_small_x0_35 as the backbone for training. The third to sixth lines show the results of replacing the backbone with PPLCNet, additionally using the EDA strategy, and additionally using both the EDA strategy and the SKL-UGI knowledge distillation strategy.

diff --git a/docs/en/PULC/PULC_vehicle_attribute_en.md b/docs/en/PULC/PULC_vehicle_attribute_en.md
index 822e966ee..47d7c963e 100644
--- a/docs/en/PULC/PULC_vehicle_attribute_en.md
+++ b/docs/en/PULC/PULC_vehicle_attribute_en.md
@@ -38,13 +38,12 @@

## 1. Introduction

-This case provides a way for users to quickly build a lightweight, high-precision and practical classification model of vehicle attribute using PaddleClas PULC (Practical Ultra Lightweight image Classification). The model can be widely used in
-Vehicle identification, road monitoring and other scenarios.
+This case provides a way for users to quickly build a lightweight, high-precision and practical classification model of vehicle attribute using PaddleClas PULC (Practical Ultra Lightweight image Classification). The model can be widely used in vehicle identification, road monitoring and other scenarios.

The following table lists the relevant indicators of the model. The first three lines show the results of using Res2Net200_vd_26w_4s, ResNet50 and MobileNetV3_small_x0_35 as the backbone for training. The fourth to seventh lines show the results of replacing the backbone with PPLCNet, additionally using the EDA strategy, and additionally using both the EDA strategy and the SKL-UGI knowledge distillation strategy.
- - -| Backbone | ma(%) | Latency(ms) | Size(M) | Training Strategy | + + +| Backbone | mA(%) | Latency(ms) | Size(M) | Training Strategy | |-------|-----------|----------|---------------|---------------| | Res2Net200_vd_26w_4s | 91.36 | 79.46 | 293 | using ImageNet pretrained | | ResNet50 | 89.98 | 12.83 | 92 | using ImageNet pretrained | @@ -52,11 +51,11 @@ The following table lists the relevant indicators of the model. The first three | PPLCNet_x1_0 | 89.57 | 2.36 | 7.2 | using ImageNet pretrained | | PPLCNet_x1_0 | 90.07 | 2.36 | 7.2 | using SSLD pretrained | | PPLCNet_x1_0 | 90.59 | 2.36 | 7.2 | using SSLD pretrained + EDA strategy| -| PPLCNet_x1_0 | 90.81 | 2.36 | 8.2 | using SSLD pretrained + EDA strategy + SKL-UGI knowledge distillation strategy| - +| PPLCNet_x1_0 | 90.81 | 2.36 | 7.2 | using SSLD pretrained + EDA strategy + SKL-UGI knowledge distillation strategy| + It can be seen from the table that the ma metric is higher when the backbone is Res2Net200_vd_26w_4s, but the inference speed is slower. After replacing the backbone with the lightweight model MobileNetV3_small_x0_35, the speed can be greatly improved, but the ma metric drops significantly. When the backbone is replaced by PPLCNet_x1_0, the ma metric is increased by 2 percentage points, and the speed is also increased by about 23%. On this basis, after using the SSLD pre-training model, the ma metric can be improved by about 0.5 percentage points without changing the inference speed. Further, when the EDA strategy is integrated, the ma metric can be improved by another 0.52 percentage points. Finally, using After SKL-UGI knowledge distillation, the ma metric can continue to improve by 0.23 percentage points. At this time, the ma metric of PPLCNet_x1_0 is only 0.55 percentage points away from Res2Net200_vd_26w_4s, but it is 32 times faster. The training method and deployment instructions of PULC will be introduced in detail below. - + **Note**: @@ -163,16 +162,16 @@ Part of the data visualization is shown below.
- + First, apply for and download data from [VeRi dataset official website](https://www.v7labs.com/open-datasets/veri-dataset), put it in the `dataset` directory of PaddleClas, the dataset directory name is `VeRi `, use the following command to enter the folder. - + ```shell cd PaddleClas/dataset/VeRi/ ``` - + Then use the following code to convert the label (you can execute the following command in the python terminal, or you can write it to a file and run the file using `python3 convert.py`). - + ```python import os from xml.dom.minidom import parse @@ -209,10 +208,10 @@ def convert_annotation(input_fp, output_fp): convert_annotation('train_label.xml', 'train_list.txt') #imagename vehiclenum colorid typeid convert_annotation('test_label.xml', 'test_list.txt') ``` - - + + After executing the above command, the `VeRi` directory has the following data: - + ``` VeRi ├── image_train @@ -231,7 +230,7 @@ VeRi ├── train_label.xml ├── test_label.xml ``` - + where `train/` and `test/` are the training set and validation set, respectively. `train_list.txt` and `test_list.txt` are the converted label files for training and validation sets, respectively. @@ -427,7 +426,7 @@ python3.7 python/predict_cls.py -c configs/PULC/vehicle_attribute/inference_vehi The prediction results: ``` -0002_c002_00030670_0.jpg: {'attributes': 'Color: (yellow, prob: 0.9893478155136108), Type: (hatchback, prob: 0.9734099507331848)', 'output': [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]} +0002_c002_00030670_0.jpg: {'attributes': 'Color: (yellow, prob: 0.9893478155136108), Type: (hatchback, prob: 0.9734099507331848)', 'output': [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]} ``` @@ -445,8 +444,8 @@ python3.7 python/predict_cls.py -c configs/PULC/vehicle_attribute/inference_vehi All prediction results will be printed, as shown below. ``` -0002_c002_00030670_0.jpg: {'attributes': 'Color: (yellow, prob: 0.9893476963043213), Type: (hatchback, prob: 0.9734097719192505)', 'output': [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]} -0014_c012_00040750_0.jpg: {'attributes': 'Color: (red, prob: 0.999872088432312), Type: (sedan, prob: 0.999976634979248)', 'output': [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]} +0002_c002_00030670_0.jpg: {'attributes': 'Color: (yellow, prob: 0.9893476963043213), Type: (hatchback, prob: 0.9734097719192505)', 'output': [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]} +0014_c012_00040750_0.jpg: {'attributes': 'Color: (red, prob: 0.999872088432312), Type: (sedan, prob: 0.999976634979248)', 'output': [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]} ``` Among the prediction results above, `someone` means that there is a human in the image, `nobody` means that there is no human in the image.
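In the vehicle attribute results above, the `attributes` string already gives the decoded color and type, while `output` is a 19-element indicator vector. A minimal decoding sketch follows; the assumption that the first 10 positions encode the color and the last 9 the vehicle type, and the exact label orders, are inferred from the two sample predictions above (yellow/hatchback and red/sedan) rather than taken from the PaddleClas source.

```python
# Minimal sketch (not PaddleClas code): decode the 19-element 'output' vector
# shown above into a (color, type) pair. The 10-color / 9-type split and the
# label orders are assumptions inferred from the sample predictions, not
# copied from the repository.
COLORS = ["yellow", "orange", "green", "gray", "red",
          "blue", "white", "golden", "brown", "black"]   # assumed order
TYPES = ["sedan", "suv", "van", "hatchback", "mpv",
         "pickup", "bus", "truck", "estate"]             # assumed order

def decode_vehicle_output(output):
    color_scores, type_scores = output[:10], output[10:]
    color = COLORS[color_scores.index(max(color_scores))]
    vehicle_type = TYPES[type_scores.index(max(type_scores))]
    return color, vehicle_type

# First sample above: positions 0 and 13 are set.
print(decode_vehicle_output(
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]))  # ('yellow', 'hatchback')
```

Under these assumptions, the second sample decodes to ('red', 'sedan'), which matches its `attributes` string.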