# Model Service Deployment

## Overview

[Paddle Serving](https://github.com/PaddlePaddle/Serving) aims to help deep learning researchers easily deploy online inference services. It supports one-click deployment of industrial-grade services, high-concurrency and efficient communication between client and server, and client development in multiple programming languages.

This section takes the HTTP inference service as an example to introduce how to use Paddle Serving to deploy model services in PaddleClas.
## Installing Serving

The Serving official website recommends using Docker to install and deploy the Serving environment. First, pull the Docker image and create a Serving-based container:

```shell
# Pull the Serving GPU image
nvidia-docker pull hub.baidubce.com/paddlepaddle/serving:0.2.0-gpu
# Start a container named "test" and map the service port 9292
nvidia-docker run -p 9292:9292 --name test -dit hub.baidubce.com/paddlepaddle/serving:0.2.0-gpu
# Enter the container
nvidia-docker exec -it test bash
```
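Optionally, you can verify that the GPU is visible inside the container before proceeding; this assumes the NVIDIA runtime is configured so that `nvidia-smi` works in the container:

```shell
# Optional sanity check: the GPU should be visible inside the container
nvidia-docker exec -it test nvidia-smi
```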
Inside the container, install the packages required by Serving:

```shell
# PaddlePaddle framework (GPU build)
pip install paddlepaddle-gpu
# Serving client and GPU server packages
pip install paddle-serving-client
pip install paddle-serving-server-gpu
```
* If the installation is too slow, you can append `-i https://pypi.tuna.tsinghua.edu.cn/simple` to the pip commands to install from a mirror and speed up the process; see the example after this list.
* If you want to deploy a CPU service, install the CPU version of the Serving server instead:

```shell
pip install paddle-serving-server
```
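For example, installing the GPU server package through the Tsinghua mirror:

```shell
# Append the mirror index to any of the pip commands above to speed up downloads
pip install paddle-serving-server-gpu -i https://pypi.tuna.tsinghua.edu.cn/simple
```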
### Export Model

Export the Serving model with `tools/export_serving_model.py`. Taking ResNet50_vd as an example, the command is as follows:

```shell
# -m: model name, -p: path to the pretrained weights, -o: output directory
python tools/export_serving_model.py -m ResNet50_vd -p ./pretrained/ResNet50_vd_pretrained/ -o serving
```

Finally, the client configuration and the model parameter and structure files will be saved in `ppcls_client_conf` and `ppcls_model` respectively.

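You can inspect the two output directories directly; as a rough sketch, the contents look like the following (the exact file names, such as `serving_client_conf.prototxt`, depend on the Serving version):

```shell
# Inspect the exported artifacts; exact file names depend on the Serving version
ls serving/ppcls_client_conf   # client-side configuration (e.g. serving_client_conf.prototxt)
ls serving/ppcls_model         # model structure, parameters, and server-side configuration
```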
### Service Deployment and Request

* Use the following command to start the Serving service:

```shell
# <model path> <work directory> <port>
python tools/serving/image_service_gpu.py serving/ppcls_model workdir 9292
```

`serving/ppcls_model` is the path of the Serving model just saved, `workdir` is the working directory, and `9292` is the service port.
* Use the following script to send a recognition request to the Serving service and get the result:

```shell
# <serving port> <test image path>
python tools/serving/image_http_client.py 9292 ./docs/images/logo.png
```

`9292` is the port to which the request is sent, which must match the port used to start the Serving service, and `./docs/images/logo.png` is the test image. The request returns the top-1 label and its probability.

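If you prefer to call the HTTP service directly rather than going through the helper script, a raw request sketched with `curl` looks roughly like the following. The endpoint path and JSON payload keys here are assumptions based on typical Paddle Serving web services, so check `tools/serving/image_http_client.py` for the exact format your version expects:

```shell
# Hypothetical raw HTTP request; the endpoint path and payload keys are assumptions,
# verify them against tools/serving/image_http_client.py for your Serving version.
curl -H "Content-Type: application/json" \
     -X POST \
     -d '{"image": "<base64-encoded image bytes>", "fetch": ["score"]}' \
     http://127.0.0.1:9292/image/prediction
```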
* For more Serving deployment examples, such as the RPC inference service, refer to the Serving official website: [https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imagenet](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imagenet)