# Model Deployment

This directory contains:

1. The scripts that convert a fastreid model to Caffe/ONNX/TRT format.
2. The examples that load an R50 baseline model in Caffe/ONNX/TRT and run inference.

## Tutorial

### Caffe Convert

<details>
<summary>step-by-step pipeline for caffe convert</summary>

This is a tiny example for converting the fastreid baseline in `meta_arch` to a Caffe model. If you want to convert a more complex architecture, you will need to customize more things.

1. Run `caffe_export.py` to get the converted Caffe model,

```bash
python caffe_export.py --config-file root-path/market1501/bagtricks_R50/config.yml \
--name baseline_R50 --output outputs/caffe_model \
--opts MODEL.WEIGHTS root-path/logs/market1501/bagtricks_R50/model_final.pth
```

then you can check the Caffe model and prototxt in `outputs/caffe_model`.
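
If you have pycaffe available, a quick sanity check before editing anything is to load the exported model and run a dummy forward pass. This is a minimal sketch, assuming the default fastreid input size of 1x3x256x128; check the blob names and shapes against your generated `prototxt`.

```python
import caffe
import numpy as np

# Load the exported network in test (inference) mode.
net = caffe.Net('outputs/caffe_model/baseline_R50.prototxt',
                'outputs/caffe_model/baseline_R50.caffemodel',
                caffe.TEST)

# Feed a dummy batch; 1x3x256x128 is an assumption based on the
# default fastreid input size -- check your prototxt's input shape.
input_name = list(net.blobs.keys())[0]
net.blobs[input_name].reshape(1, 3, 256, 128)
net.blobs[input_name].data[...] = np.random.rand(1, 3, 256, 128)

# Forward pass; print every output blob and its shape.
for name, blob in net.forward().items():
    print(name, blob.shape)
```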

2. Change the `prototxt` in the following two steps:

1) Modify `avg_pooling` in `baseline_R50.prototxt`

```prototxt
layer {
  name: "avgpool1"
  type: "Pooling"
  bottom: "relu_blob49"
  top: "avgpool_blob1"
  pooling_param {
    pool: AVE
    global_pooling: true
  }
}
```

2) Change the last layer `top` name to `output`

```prototxt
layer {
  name: "bn_scale54"
  type: "Scale"
  bottom: "batch_norm_blob54"
  top: "output" # bn_norm_blob54
  scale_param {
    bias_term: true
  }
}
```
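
If you convert models often, the two edits above can be scripted instead of done by hand. Below is a rough sketch using plain text substitution; it assumes the layer and blob names are exactly those shown above (`avgpool1`, `bn_norm_blob54`), so treat it as a starting point rather than a general tool.

```python
import re

path = 'outputs/caffe_model/baseline_R50.prototxt'
with open(path) as f:
    proto = f.read()

# Edit 1: replace avgpool1's fixed-size pooling_param with global pooling.
proto = re.sub(
    r'(name:\s*"avgpool1".*?pooling_param\s*\{)[^}]*\}',
    '\\1\n    pool: AVE\n    global_pooling: true\n  }',
    proto,
    flags=re.S,
)

# Edit 2: rename the last layer's top blob to "output".
proto = proto.replace('top: "bn_norm_blob54"', 'top: "output"')

with open(path, 'w') as f:
    f.write(proto)
```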

3. (optional) You can open [Netscope](https://ethereon.github.io/netscope/quickstart.html), then paste your network `prototxt` to visualize the network.

4. Run `caffe_inference.py` to save Caffe model features with input images

```bash
python caffe_inference.py --model-def outputs/caffe_model/baseline_R50.prototxt \
--model-weights outputs/caffe_model/baseline_R50.caffemodel \
--input test_data/*.jpg --output caffe_output
```

5. Run `demo/demo.py` to get fastreid model features with the same input images, then verify that Caffe and PyTorch are computing the same value for the network.

```python
np.testing.assert_allclose(torch_out, caffe_out, rtol=1e-3, atol=1e-6)
```
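
For reference, a fuller version of that check might look like the sketch below. It assumes you dumped both feature sets to `.npy` files; the paths are placeholders for wherever your two runs saved their outputs.

```python
import numpy as np

# Placeholder paths: point these at the features saved from demo/demo.py
# (PyTorch) and caffe_inference.py for the same input images.
torch_out = np.load('pytorch_output/feature.npy')
caffe_out = np.load('caffe_output/feature.npy')

# Raises with a detailed mismatch report if any element differs
# beyond the given relative/absolute tolerances.
np.testing.assert_allclose(torch_out, caffe_out, rtol=1e-3, atol=1e-6)
print('Caffe and PyTorch outputs match within tolerance.')
```
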
</details>

### ONNX Convert

<details>
<summary>step-by-step pipeline for onnx convert</summary>

This is a tiny example for converting the fastreid baseline in `meta_arch` to an ONNX model. ONNX supports most PyTorch operators; if some operators in your model are not supported by ONNX, you will need to customize them.
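
Once you have exported the model in step 1 below, you can let the official ONNX checker validate the graph and list the operator types it contains; anything non-standard will stand out there. A minimal sketch:

```python
import onnx

model = onnx.load('outputs/onnx_model/baseline_R50.onnx')

# Raises if the graph, operator signatures or shapes are invalid.
onnx.checker.check_model(model)

# Print the distinct op types used by the graph.
print(sorted({node.op_type for node in model.graph.node}))
```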

1. Run `onnx_export.py` to get the converted ONNX model,

```bash
python onnx_export.py --config-file root-path/bagtricks_R50/config.yml \
--name baseline_R50 --output outputs/onnx_model \
--opts MODEL.WEIGHTS root-path/logs/market1501/bagtricks_R50/model_final.pth
```

then you can check the ONNX model in `outputs/onnx_model`.
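
Before running the full inference script, you can also open the model with ONNX Runtime and confirm the input/output bindings with a dummy batch. A minimal sketch, again assuming the default 1x3x256x128 input size:

```python
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession('outputs/onnx_model/baseline_R50.onnx')

inp, out = sess.get_inputs()[0], sess.get_outputs()[0]
print(inp.name, inp.shape, '->', out.name, out.shape)

# Dummy forward pass; 1x3x256x128 is the assumed fastreid input size.
dummy = np.random.rand(1, 3, 256, 128).astype(np.float32)
feat = sess.run([out.name], {inp.name: dummy})[0]
print(feat.shape)
```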

2. (optional) You can use [Netron](https://github.com/lutzroeder/netron) to visualize the network.

3. Run `onnx_inference.py` to save ONNX model features with input images

```bash
python onnx_inference.py --model-path outputs/onnx_model/baseline_R50.onnx \
--input test_data/*.jpg --output onnx_output
```

4. Run `demo/demo.py` to get fastreid model features with the same input images, then verify that ONNX Runtime and PyTorch are computing the same value for the network.

```python
np.testing.assert_allclose(torch_out, ort_out, rtol=1e-3, atol=1e-6)
```

</details>

### TensorRT Convert

<details>
<summary>step-by-step pipeline for trt convert</summary>

This is a tiny example for converting the fastreid baseline in `meta_arch` to a TRT model.

First you need to convert the PyTorch model to ONNX format following [ONNX Convert](https://github.com/JDAI-CV/fast-reid#fastreid) above, and remember your `output` name. Then you can convert the ONNX model to TensorRT following the instructions below.
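
If you no longer remember the `output` name, you can read it back from the ONNX file instead of re-exporting. A small sketch (adjust the path to the `--name` you exported with):

```python
import onnx

# Path depends on the --name used during ONNX export.
model = onnx.load('outputs/onnx_model/baseline_R50.onnx')
print([o.name for o in model.graph.output])
```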

1. Run the command below to get the converted TRT model from the ONNX model,

```bash
python trt_export.py --name baseline_R50 --output outputs/trt_model \
--mode fp32 --batch-size 8 --height 256 --width 128 \
--onnx-model outputs/onnx_model/baseline.onnx
```

then you can check the TRT model in `outputs/trt_model`.
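
To confirm the engine deserializes correctly, you can load it back with the TensorRT Python API and print its bindings. This sketch uses the pre-8.5 binding interface; newer TensorRT versions replace it with `num_io_tensors`/`get_tensor_name`, so adapt it to your version:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open('outputs/trt_model/baseline.engine', 'rb') as f, \
        trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

# Pre-TensorRT-8.5 binding API; deprecated in newer releases.
for i in range(engine.num_bindings):
    kind = 'input' if engine.binding_is_input(i) else 'output'
    print(kind, engine.get_binding_name(i), engine.get_binding_shape(i))
```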

2. Run `trt_inference.py` to save TRT model features with input images

```bash
python3 trt_inference.py --model-path outputs/trt_model/baseline.engine \
--input test_data/*.jpg --batch-size 8 --height 256 --width 128 --output trt_output
```

3. Run `demo/demo.py` to get fastreid model features with the same input images, then verify that TensorRT and PyTorch are computing the same value for the network.

```python
np.testing.assert_allclose(torch_out, trt_out, rtol=1e-3, atol=1e-6)
```

Notice: int8 mode is not supported in the TensorRT runtime yet, and there are some bugs in the calibrator. Help wanted!

</details>

## Acknowledgements

Thanks to [CPFLAME](https://github.com/CPFLAME), [gcong18](https://github.com/gcong18), [YuxiangJohn](https://github.com/YuxiangJohn) and [wiggin66](https://github.com/wiggin66) at the JDAI Model Acceleration Group for their help with the PyTorch model conversion.