- [3. Image recognition service deployment](#3-image-recognition-service-deployment)
- [3.1 Model conversion](#31-model-conversion)
- [3.2 Service deployment and request](#32-service-deployment-and-request)
- [3.2.1 Python Serving](#321-python-serving)
- [3.2.2 C++ Serving](#322-c-serving)
- [4. FAQ](#4-faq)
<aname="1"></a>
## 1 Introduction
[Paddle Serving](https://github.com/PaddlePaddle/Serving) aims to help deep learning developers easily deploy online prediction services, support one-click deployment of industrial-grade service capabilities, high concurrency between client and server Efficient communication and support for developing clients in multiple programming languages.
This section takes the HTTP prediction service deployment as an example to introduce how to use PaddleServing to deploy the model service in PaddleClas. Currently, only Linux platform deployment is supported, and Windows platform is not currently supported.
<aname="2"></a>
## 2. Serving installation
The Serving official website recommends using docker to install and deploy the Serving environment. First, you need to pull the docker environment and create a Serving-based docker.
* If the installation speed is too slow, you can change the source through `-i https://pypi.tuna.tsinghua.edu.cn/simple` to speed up the installation process.
* For other environment configuration installation, please refer to: [Install Paddle Serving with Docker](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Install_CN.md)
<aname="3"></a>
## 3. Image recognition service deployment
When using PaddleServing for image recognition service deployment, **need to convert multiple saved inference models to Serving models**. The following takes the ultra-lightweight image recognition model in PP-ShiTu as an example to introduce the deployment of image recognition services.
<aname="3.1"></a>
### 3.1 Model conversion
- Go to the working directory:
```shell
cd deploy/
```
- Download generic detection inference model and generic recognition inference model
```shell
# Create and enter the models folder
mkdir models
cd models
# Download and unzip the generic recognition model
The meaning of the parameters of the above command is the same as [#3.1 Model conversion](#3.1)
After the recognition inference model is converted, there will be additional folders `general_PPLCNet_x2_5_lite_v1.0_serving/` and `general_PPLCNet_x2_5_lite_v1.0_client/` in the current folder. Modify the name of `alias` in `serving_server_conf.prototxt` in `general_PPLCNet_x2_5_lite_v1.0_serving/` and `general_PPLCNet_x2_5_lite_v1.0_client/` directories respectively: Change `alias_name` in `fetch_var` to `features`. The content of the modified `serving_server_conf.prototxt` is as follows
```log
feed_var {
name: "x"
alias_name: "x"
is_lod_tensor: false
feed_type: 1
shape: 3
shape: 224
shape: 224
}
fetch_var {
name: "save_infer_model/scale_0.tmp_1"
alias_name: "features"
is_lod_tensor: false
fetch_type: 1
shape: 512
}
```
After the conversion of the general recognition inference model is completed, there will be additional `general_PPLCNet_x2_5_lite_v1.0_serving/` and `general_PPLCNet_x2_5_lite_v1.0_client/` folders in the current folder, with the following structure:
```shell
├── general_PPLCNet_x2_5_lite_v1.0_serving/
│ ├── inference.pdiparams
│ ├── inference.pdmodel
│ ├── serving_server_conf.prototxt
│ └── serving_server_conf.stream.prototxt
│
└── general_PPLCNet_x2_5_lite_v1.0_client/
├── serving_client_conf.prototxt
└── serving_client_conf.stream.prototxt
```
- Convert general detection inference model to Serving model:
The meaning of the parameters of the above command is the same as [#3.1 Model conversion](#3.1)
After the conversion of the general detection inference model is completed, there will be additional folders `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/` and `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/` in the current folder, with the following structure:
| `dirname` | str | - | The storage path of the model file to be converted. The program structure file and parameter file are saved in this directory.|
| `model_filename` | str | None | The name of the file storing the model Inference Program structure that needs to be converted. If set to None, use `__model__` as the default filename |
| `params_filename` | str | None | The name of the file that stores all parameters of the model that need to be transformed. It needs to be specified if and only if all model parameters are stored in a single binary file. If the model parameters are stored in separate files, set it to None |
| `serving_server` | str | `"serving_server"` | The storage path of the converted model files and configuration files. Default is serving_server |
| `serving_client` | str | `"serving_client"` | The converted client configuration file storage path. Default is |
- Download and unzip the index of the retrieval library that has been built
**Note:** The identification service involves multiple models, and the PipeLine deployment method is used for performance reasons. The Pipeline deployment method currently does not support the windows platform.
- go to the working directory
```shell
cd ./deploy/paddleserving/recognition
```
The paddleserving directory contains code to start the Python Pipeline service, the C++ Serving service, and send prediction requests, including:
```shell
__init__.py
config.yml # The configuration file to start the python pipeline service
pipeline_http_client.py # Script for sending pipeline prediction requests in http mode
pipeline_rpc_client.py # Script for sending pipeline prediction requests in rpc mode
recognition_web_service.py # Script to start the pipeline server
Different from Python Serving, the C++ Serving client calls C++ OP to predict, so before starting the service, you need to compile and install the serving server package, and set `SERVING_BIN`.
- Compile and install the Serving server package
```shell
# Enter the working directory
cd PaddleClas/deploy/paddleserving
# One-click compile and install Serving server, set SERVING_BIN
**Note:** The path set by [build_server.sh](../../../deploy/paddleserving/build_server.sh#L55-L62) may need to be modified according to the actual machine environment such as CUDA, python version, etc., and then compiled; If you encounter a non-network error during the execution of `build_server.sh`, you can manually copy the commands in the script to the terminal for execution.
- The input and output format used by C++ Serving is different from that of Python, so you need to execute the following command to overwrite the files below [3.1] (#31-model conversion) by copying the 4 files to get the corresponding 4 prototxt files in the folder.
If the service program is running in the foreground, you can press `Ctrl+C` to terminate the server program; if it is running in the background, you can use the kill command to close related processes, or you can execute the following command in the path where the service program is started to terminate the server program:
```bash
python3.7 -m paddle_serving_server.serve stop
```
After the execution is completed, the `Process stopped` message appears, indicating that the service was successfully shut down.
<aname="4"></a>
## 4. FAQ
**Q1**: No result is returned after the request is sent or an output decoding error is prompted
**A1**: Do not set the proxy when starting the service and sending the request. You can close the proxy before starting the service and sending the request. The command to close the proxy is:
```shell
unset https_proxy
unset http_proxy
```
**Q2**: nothing happens after starting the service
**A2**: You can check whether the path corresponding to `model_config` in `config.yml` exists, and whether the folder name is correct
For more service deployment types, such as `RPC prediction service`, you can refer to Serving's [github official website](https://github.com/PaddlePaddle/Serving/tree/v0.9.0/examples)