[Paddle Serving](https://github.com/PaddlePaddle/Serving) aims to help deep learning developers easily deploy online prediction services. It supports one-click deployment of industrial-grade services, efficient high-concurrency communication between client and server, and clients in multiple programming languages.
This section takes the deployment of an HTTP prediction service as an example to introduce how to use PaddleServing to deploy model services in PaddleClas. Currently, only deployment on the Linux platform is supported; the Windows platform is not yet supported.
The Serving official documentation recommends using Docker to install and deploy the Serving environment. First, pull the Docker image and create a Serving-based container.
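A minimal sketch of these commands, following the v0.7.0 install doc referenced below (the image tag shown is a CPU example; GPU environments use different tags):

```shell
# pull the Serving development image (CPU tag shown; see the install doc for GPU tags)
docker pull registry.baidubce.com/paddlepaddle/serving:0.7.0-devel
# create a container from the image and enter it
docker run -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:0.7.0-devel bash
docker exec -it test bash
```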
* If the installation speed is too slow, you can change the source through `-i https://pypi.tuna.tsinghua.edu.cn/simple` to speed up the installation process, as in the example after these notes.
* For other environment configuration installation, please refer to: [Install Paddle Serving with Docker](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Install_CN.md)
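For example, a CPU-only installation via the Tsinghua mirror might look like the following sketch (the pinned versions assume Serving v0.7.0, matching the install doc above):

```shell
# install the Serving components from the Tsinghua mirror (versions assume Serving v0.7.0)
pip3 install paddle-serving-client==0.7.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
pip3 install paddle-serving-server==0.7.0 -i https://pypi.tuna.tsinghua.edu.cn/simple  # CPU server; use paddle-serving-server-gpu for GPU
pip3 install paddle-serving-app==0.7.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
```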
| parameter | type | default value | description |
| --- | --- | --- | --- |
| `dirname` | str | - | The storage path of the model files to be converted; the program structure file and parameter files are saved in this directory. |
| `model_filename` | str | None | The name of the file that stores the Inference Program structure of the model to be converted. If set to None, `__model__` is used as the default filename. |
| `params_filename` | str | None | The name of the file that stores all parameters of the model to be converted. It needs to be specified if and only if all model parameters are stored in a single binary file. If the model parameters are stored in separate files, set it to None. |
| `serving_server` | str | `"serving_server"` | The storage path of the converted server-side model files and configuration files. Default is `"serving_server"`. |
| `serving_client` | str | `"serving_client"` | The storage path of the converted client-side configuration files. Default is `"serving_client"`. |
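A typical invocation might look like the following sketch, assuming the ResNet50_vd inference model has been downloaded and unpacked into `./ResNet50_vd_infer/`:

```shell
# convert the inference model into Serving server/client models
python3.7 -m paddle_serving_client.convert \
    --dirname ./ResNet50_vd_infer/ \
    --model_filename inference.pdmodel \
    --params_filename inference.pdiparams \
    --serving_server ./ResNet50_vd_serving/ \
    --serving_client ./ResNet50_vd_client/
```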
After the ResNet50_vd inference model is converted, there will be additional `ResNet50_vd_serving` and `ResNet50_vd_client` folders in the current folder, with the following structure:
```shell
├── ResNet50_vd_serving/
│   ├── inference.pdiparams
│   ├── inference.pdmodel
│   ├── serving_server_conf.prototxt
│   └── serving_server_conf.stream.prototxt
└── ResNet50_vd_client/
    ├── serving_client_conf.prototxt
    └── serving_client_conf.stream.prototxt
```
- To stay compatible with the deployment of different models, Serving provides a feature for renaming inputs and outputs. When a different model is deployed for inference, you only need to modify the `alias_name` in the configuration files; the inference deployment works without code changes. Therefore, after the conversion, modify the alias names in the `serving_server_conf.prototxt` files under `ResNet50_vd_serving` and `ResNet50_vd_client` respectively: change the `alias_name` in `fetch_var` to `prediction`.
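A sketch of the modified `serving_server_conf.prototxt` is shown below; the `name` and `shape` fields are illustrative values from a typical ResNet50_vd export and may differ for your model (`shape: 1000` corresponds to the 1000 ImageNet classes):

```shell
feed_var {
  name: "inputs"
  alias_name: "inputs"
  is_lod_tensor: false
  feed_type: 1
  shape: 3
  shape: 224
  shape: 224
}
fetch_var {
  name: "save_infer_model/scale_0.tmp_1"
  alias_name: "prediction"
  is_lod_tensor: false
  fetch_type: 1
  shape: 1000
}
```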
In addition to the single-model deployment method introduced in [Chapter 3 Service Deployment for Image Classification](#3), this section introduces how to combine multiple models to complete a multi-model **image recognition service deployment**.
When using PaddleServing for image recognition service deployment, **each of the saved inference models needs to be converted into a Serving model**. The following takes the ultra-lightweight image recognition models in PP-ShiTu as an example to introduce the deployment of the image recognition service.
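For example, converting the general recognition model might look like the following sketch, assuming the inference model has been unpacked into `./general_PPLCNet_x2_5_lite_v1.0_infer/`:

```shell
# convert the general recognition inference model to Serving server/client models
python3.7 -m paddle_serving_client.convert \
    --dirname ./general_PPLCNet_x2_5_lite_v1.0_infer/ \
    --model_filename inference.pdmodel \
    --params_filename inference.pdiparams \
    --serving_server ./general_PPLCNet_x2_5_lite_v1.0_serving/ \
    --serving_client ./general_PPLCNet_x2_5_lite_v1.0_client/
```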
The parameters of the above command have the same meaning as in [3.1 Model Transformation](#3.1).
After the transformation of the general recognition inference model is completed, there will be additional `general_PPLCNet_x2_5_lite_v1.0_serving/` and `general_PPLCNet_x2_5_lite_v1.0_client/` folders in the current folder, with the following structure:
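```shell
# layout assuming the default serving_server/serving_client output paths
├── general_PPLCNet_x2_5_lite_v1.0_serving/
│   ├── inference.pdiparams
│   ├── inference.pdmodel
│   ├── serving_server_conf.prototxt
│   └── serving_server_conf.stream.prototxt
└── general_PPLCNet_x2_5_lite_v1.0_client/
    ├── serving_client_conf.prototxt
    └── serving_client_conf.stream.prototxt
```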
- Modify the alias names in `serving_server_conf.prototxt` in `general_PPLCNet_x2_5_lite_v1.0_serving/` and `general_PPLCNet_x2_5_lite_v1.0_client/` directories respectively: change `alias_name` in `fetch_var` to `features`.
The modified `serving_server_conf.prototxt` content is as follows:
```shell
feed_var {
  name: "x"
  alias_name: "x"
  is_lod_tensor: false
  feed_type: 1
  shape: 3
  shape: 224
  shape: 224
}
fetch_var {
  name: "save_infer_model/scale_0.tmp_1"
  alias_name: "features"
  is_lod_tensor: false
  fetch_type: 1
  shape: 512
}
```
- Convert the general detection inference model to a Serving model:
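A sketch of the conversion command, assuming the detection inference model has been unpacked into `./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer/`:

```shell
# convert the general detection inference model to Serving server/client models
python3.7 -m paddle_serving_client.convert \
    --dirname ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer/ \
    --model_filename inference.pdmodel \
    --params_filename inference.pdiparams \
    --serving_server ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/ \
    --serving_client ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/
```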
After the general detection inference model transformation is completed, there will be additional `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/` and `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/` folders in the current folder, with the same structure as shown above.
**Note:** There is no need to modify the alias name in `serving_server_conf.prototxt` under the `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/` directory.
- Download and unzip the built retrieval library index
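For example, using the drink dataset index from the PP-ShiTu quick start (the URL below is an assumption based on that guide; substitute the index that matches your deployment):

```shell
# download and unpack a pre-built retrieval index (URL assumed from the PP-ShiTu quick start)
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v1.0.tar
tar -xf drink_dataset_v1.0.tar
```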
**Note:** The recognition service involves multiple models, so the Pipeline deployment method is used for performance reasons. The Pipeline deployment method currently does not support the Windows platform.
- Go to the working directory:
```shell
cd ./deploy/paddleserving/recognition
```
The `paddleserving` directory contains the code for starting the Python Pipeline service and the C++ Serving service and for sending prediction requests, including:
```shell
__init__.py
config.yml # The configuration file to start the python pipeline service
pipeline_http_client.py # Script for sending pipeline prediction requests in http mode
pipeline_rpc_client.py # Script for sending pipeline prediction requests in rpc mode
recognition_web_service.py # Script to start the pipeline server
run_cpp_serving.sh # Script to start C++ Pipeline Serving deployment
test_cpp_serving_client.py # Script for sending C++ Pipeline serving prediction requests by rpc
```
<a name="4.2.1"></a>
#### 4.2.1 Python Serving
- Start the service:
```shell
# Start the service; the run log is saved in log.txt
python3.7 recognition_web_service.py &>log.txt &
```
- Send a request:
```shell
python3.7 pipeline_http_client.py
```
After a successful run, the model's prediction results will be printed in the terminal window, as shown below:
<a name="4.2.2"></a>
#### 4.2.2 C++ Serving
- Start the service:
```shell
# Start the service: the mainbody detection and feature extraction services are started together in the background, on ports 9293 and 9294 respectively;
# the run logs are saved in log_mainbody_detection.txt and log_feature_extraction.txt respectively
sh run_cpp_serving.sh
```
- Send a request:
```shell
# send service request
python3.7 test_cpp_serving_client.py
```
After a successful run, the model's prediction results are printed in the terminal window, as shown below:
**A1**: Do not set a proxy when starting the service or sending requests; you can close the proxy beforehand. The commands to close the proxy are:
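```shell
unset http_proxy
unset https_proxy
```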
**Q2**: Nothing happens after the service is started.
**A2**: Check whether the path corresponding to `model_config` in `config.yml` exists and whether the folder name is correct.
For more service deployment types, such as the `RPC prediction service`, you can refer to the Serving [examples on GitHub](https://github.com/PaddlePaddle/Serving/tree/v0.9.0/examples).