# Service deployment based on PaddleHub Serving
PaddleClas supports rapid service deployment through PaddleHub. Currently, only image classification deployment is supported; image recognition deployment will be added in a future release.
## Catalogue

- 1. Introduction
- 2. Prepare the environment
- 3. Download the inference model
- 4. Install the service module
- 5. Start the service
- 6. Send prediction requests
- 7. User-defined service module modification
## 1. Introduction

The hubserving service deployment package `clas` contains three required files and one optional configuration file, organized as follows:

```shell
deploy/hubserving/clas/
├── __init__.py    # Empty file, required
├── config.json    # Configuration file, optional; passed in as a parameter when starting the service with a configuration file
├── module.py      # The main module, required; contains the complete logic of the service
└── params.py      # Parameter file, required; contains parameters such as the model path and pre- and post-processing parameters
```
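For orientation, the sketch below illustrates the kind of contents `params.py` holds. It is not the repository's actual file, but the three fields shown (`inference_model_dir`, `topk`, `class_id_map_file`) are the ones referenced later in this document (see section 7):

```python
# Illustrative sketch only -- the actual params.py in the repository may differ.
# The three fields below are the ones this document refers to.
def read_params():
    params = {
        # Path to the inference model directory (see section 3)
        "inference_model_dir": "../inference/",
        # Number of top-k classification results returned in post-processing
        "topk": 5,
        # Mapping file between class ids and label names (path is an example)
        "class_id_map_file": "../ppcls/utils/imagenet1k_label_list.txt",
    }
    return params
```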
## 2. Prepare the environment

```shell
# Install paddlehub, version 2.1.0 is recommended
python3.7 -m pip install paddlehub==2.1.0 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple
```
## 3. Download the inference model

Before installing the service module, you need to prepare the inference model and put it in the correct path. The default model paths are:

- Classification inference model structure file: `PaddleClas/inference/inference.pdmodel`
- Classification inference model weight file: `PaddleClas/inference/inference.pdiparams`
**Notice**:

- The model file path can be viewed and modified in `PaddleClas/deploy/hubserving/clas/params.py`: `"inference_model_dir": "../inference/"`.
- The model files (including `.pdmodel` and `.pdiparams`) must be named `inference` (a quick check is sketched after this list).
- We provide a large number of pre-trained models based on the ImageNet-1k dataset. For the model list and download addresses, see the Model Library Overview, or you can use your own trained and converted models.
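Because the naming requirement is easy to get wrong, a quick standalone check such as the following sketch (not part of the repository) can confirm the model directory is laid out as expected:

```python
import os

def check_inference_dir(model_dir="../inference/"):
    """Verify the directory contains inference.pdmodel and inference.pdiparams."""
    for fname in ("inference.pdmodel", "inference.pdiparams"):
        path = os.path.join(model_dir, fname)
        if not os.path.isfile(path):
            raise FileNotFoundError(f"Missing model file: {path}")
    print(f"Model directory {model_dir} looks correct.")

check_inference_dir()
```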
## 4. Install the service module

- In a Linux environment, the installation command is as follows:

  ```shell
  cd PaddleClas/deploy
  # Install the service module:
  hub install hubserving/clas/
  ```

- In a Windows environment (the folder separator is `\`), the installation command is as follows:

  ```shell
  cd PaddleClas\deploy
  # Install the service module:
  hub install hubserving\clas\
  ```
## 5. Start the service

### 5.1 Start with command line parameters

This method only supports prediction using the CPU. Start command:

```shell
hub serving start \
--modules clas_system \
--port 8866
```

This completes the deployment of a service API, using the default port number 8866.
Parameter Description:
parameters | uses |
---|---|
--modules/-m | [required] PaddleHub Serving pre-installed model, listed in the form of multiple Module==Version key-value pairsWhen no Version is specified, the latest is selected by default version |
--port/-p | [OPTIONAL] Service port, default is 8866 |
--use_multiprocess | [Optional] Whether to enable the concurrent mode, the default is single-process mode, it is recommended to use this mode for multi-core CPU machinesWindows operating system only supports single-process mode |
--workers | [Optional] The number of concurrent tasks specified in concurrent mode, the default is 2*cpu_count-1 , where cpu_count is the number of CPU cores |
For more deployment details, see PaddleHub Serving Model One-Click Service Deployment |
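To see what that default would evaluate to on your machine, the formula can be computed directly (a trivial sketch):

```python
import os

# Default number of workers in concurrent mode: 2*cpu_count-1
cpu_count = os.cpu_count()
print(f"Default workers on this machine: {2 * cpu_count - 1}")
```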
### 5.2 Start with configuration file

This method supports prediction using either the CPU or the GPU. Start command:

```shell
hub serving start -c config.json
```

The format of `config.json` is as follows:
```json
{
    "modules_info": {
        "clas_system": {
            "init_args": {
                "version": "1.0.0",
                "use_gpu": true,
                "enable_mkldnn": false
            },
            "predict_args": {
            }
        }
    },
    "port": 8866,
    "use_multiprocess": false,
    "workers": 2
}
```
Parameter description:

- The configurable parameters in `init_args` are consistent with the `_initialize` function interface in `module.py`. Among them:
  - When `use_gpu` is `true`, the GPU is used to start the service.
  - When `enable_mkldnn` is `true`, MKL-DNN acceleration is used.
- The configurable parameters in `predict_args` are consistent with the `predict` function interface in `module.py`.
**Notice**:

- When starting the service with the configuration file, the parameter settings in the configuration file are used, and other command line parameters are ignored.
- If you use GPU prediction (i.e., `use_gpu` is set to `true`), you need to set the `CUDA_VISIBLE_DEVICES` environment variable to specify the GPU card number before starting the service, for example: `export CUDA_VISIBLE_DEVICES=0`.
- `use_gpu` cannot be `true` at the same time as `use_multiprocess`.
- When both `use_gpu` and `enable_mkldnn` are `true`, `enable_mkldnn` is ignored and the GPU is used (the sketch after this list checks these constraints).
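As an optional sanity check (not part of the repository), the constraints above can be verified programmatically before starting the service. This sketch loads a `config.json` of the format shown in this section and applies the rules just described:

```python
import json

def validate_config(path="hubserving/clas/config.json"):
    """Check a hubserving config against the constraints described above."""
    with open(path) as f:
        cfg = json.load(f)
    init_args = cfg["modules_info"]["clas_system"]["init_args"]
    use_gpu = init_args.get("use_gpu", False)
    enable_mkldnn = init_args.get("enable_mkldnn", False)
    # use_gpu and use_multiprocess cannot both be true
    if use_gpu and cfg.get("use_multiprocess", False):
        raise ValueError("use_gpu cannot be true at the same time as use_multiprocess")
    # when both use_gpu and enable_mkldnn are true, enable_mkldnn is ignored
    if use_gpu and enable_mkldnn:
        print("Warning: enable_mkldnn will be ignored because use_gpu is true")
    return cfg

validate_config()
```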
For example, to start the service using GPU card No. 3:

```shell
cd PaddleClas/deploy
export CUDA_VISIBLE_DEVICES=3
hub serving start -c hubserving/clas/config.json
```
## 6. Send prediction requests

After configuring the server, you can use the following command to send a prediction request and get the prediction result:

```shell
cd PaddleClas/deploy
python3.7 hubserving/test_hubserving.py \
--server_url http://127.0.0.1:8866/predict/clas_system \
--image_file ./hubserving/ILSVRC2012_val_00006666.JPEG \
--batch_size 8
```
The predicted output is:

```
The result(s): class_ids: [57, 67, 68, 58, 65], label_names: ['garter snake, grass snake', 'diamondback, diamondback rattlesnake, Crotalus adamanteus', 'sidewinder, horned rattlesnake, Crotalus cerastes', 'water snake', 'sea snake'], scores: [0.21915, 0.15631, 0.14794, 0.13177, 0.12285]
The average time of prediction cost: 2.970 s/image
The average time cost: 3.014 s/image
The average top-1 score: 0.110
```
Script parameter description:

- `server_url`: Service address, in the format `http://[ip_address]:[port]/predict/[module_name]`.
- `image_file`: The test image path, which can be a single image path or an image collection directory path.
- `batch_size`: [**optional**] Make predictions in batches of `batch_size`, default is `1`.
- `resize_short`: [**optional**] Resize by the short edge during preprocessing, default is `256`.
- `crop_size`: [**optional**] The size of the center crop during preprocessing, default is `224`.
- `normalize`: [**optional**] Whether to normalize during preprocessing, default is `True`.
- `to_chw`: [**optional**] Whether to transpose to `CHW` order during preprocessing, default is `True`.
**Note**: If you use Transformer-series models, such as `DeiT_***_384`, `ViT_***_384`, etc., please pay attention to the input data size of the model: you need to specify `--resize_short=384 --crop_size=384`.
Return result format description: The returned result is a list containing the top-k classification results, the corresponding scores, and the prediction time for each image, structured as follows (a minimal client sketch follows the tree):

```
list: return result
└── list: first image result
    ├── list: the top-k classification results, sorted in descending order of score
    ├── list: the scores corresponding to the top-k classification results, sorted in descending order of score
    └── float: the image classification time, in seconds
```
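The provided `test_hubserving.py` script handles preprocessing and payload encoding for you. For illustration only, the sketch below shows the shape of the interaction and how to unpack the returned list described above; the payload construction (JSON with a base64-encoded image under an `images` key) and the `results` wrapper field are assumptions, so check `test_hubserving.py` for the exact format:

```python
import base64
import json

import requests  # third-party; pip install requests

# Assumption: the service accepts JSON with base64-encoded image data under
# an "images" key -- check test_hubserving.py for the exact payload format.
server_url = "http://127.0.0.1:8866/predict/clas_system"
with open("hubserving/ILSVRC2012_val_00006666.JPEG", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode("utf8")

resp = requests.post(
    server_url,
    headers={"Content-Type": "application/json"},
    data=json.dumps({"images": [img_b64]}),
)
output = resp.json()
# Assumption: PaddleHub Serving wraps the module output in a "results" field.
# Each image result follows the tree above: [class results, scores, time].
top_k_results, scores, elapsed = output["results"][0]
print(f"top-k: {top_k_results}, scores: {scores}, took {elapsed:.3f}s")
```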
## 7. User-defined service module modification

If you need to modify the service logic, perform the following steps:

- Stop the service:

  ```shell
  hub serving stop --port/-p XXXX
  ```

- Modify the code in the corresponding files, such as `module.py` and `params.py`, according to your actual needs. After modifying `module.py`, the module needs to be reinstalled (`hub install hubserving/clas/`) and redeployed. Before deploying, you can use the `python3.7 hubserving/clas/module.py` command to quickly test the code to be deployed.

- Uninstall the old service package:

  ```shell
  hub uninstall clas_system
  ```

- Install the modified new service package:

  ```shell
  hub install hubserving/clas/
  ```

- Restart the service:

  ```shell
  hub serving start -m clas_system
  ```
**Notice**: Common parameters can be modified in `PaddleClas/deploy/hubserving/clas/params.py`:

- To replace the model, modify the model file path parameter: `"inference_model_dir"`.
- To change the number of `top-k` results returned during post-processing, modify: `'topk'`.
- To change the mapping file between class ids and labels used during post-processing, modify: `'class_id_map_file'`.
In order to avoid unnecessary delay and to support prediction with `batch_size`, data preprocessing logic (including operations such as `resize` and `crop`) is performed on the client side. Therefore, the related data preprocessing code in PaddleClas/deploy/hubserving/test_hubserving.py#L41-L47 and PaddleClas/deploy/hubserving/test_hubserving.py#L51-L76 needs to be modified accordingly.
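For a rough picture of what that client-side preprocessing involves, the sketch below (an illustration, not the repository's exact implementation) applies the steps named by the script parameters in section 6: resize by the short edge (default `256`), center crop (default `224`), normalization, and an HWC-to-CHW transpose. The ImageNet mean/std constants are assumptions; check the repository script for the exact values:

```python
import cv2          # pip install opencv-python
import numpy as np

def preprocess(img_path, resize_short=256, crop_size=224):
    """Resize by short edge, center-crop, normalize, and transpose to CHW."""
    img = cv2.imread(img_path)                      # HWC, BGR, uint8
    h, w = img.shape[:2]
    scale = resize_short / min(h, w)                # short edge -> resize_short
    img = cv2.resize(img, (int(round(w * scale)), int(round(h * scale))))
    h, w = img.shape[:2]
    top, left = (h - crop_size) // 2, (w - crop_size) // 2
    img = img[top:top + crop_size, left:left + crop_size]   # center crop
    img = img[:, :, ::-1].astype("float32") / 255.0          # BGR -> RGB, [0, 1]
    # ImageNet mean/std -- assumed; check test_hubserving.py for exact values
    mean = np.array([0.485, 0.456, 0.406], dtype="float32")
    std = np.array([0.229, 0.224, 0.225], dtype="float32")
    img = (img - mean) / std
    return img.transpose((2, 0, 1))                          # HWC -> CHW

chw = preprocess("hubserving/ILSVRC2012_val_00006666.JPEG")
print(chw.shape)  # (3, 224, 224)
```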