# Deployment

## Caffe2 Deployment
We currently support converting a detectron2 model to Caffe2 format through ONNX.
The converted Caffe2 model can run without a detectron2 dependency, in either Python or C++.
It has a runtime optimized for CPU & mobile inference, but not for GPU inference.

Caffe2 conversion requires PyTorch ≥ 1.4 and ONNX ≥ 1.6.
### Coverage

It supports the 3 most common meta architectures: `GeneralizedRCNN`, `RetinaNet`, and `PanopticFPN`,
and most official models under these 3 meta architectures.

Users' custom extensions under these architectures (added through registration) are supported
as long as they do not contain control flow or operators not available in Caffe2 (e.g. deformable convolution).
For example, custom backbones and heads are often supported out of the box; see the sketch below for what such a registered extension looks like.
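As an illustration, here is a minimal sketch of a custom backbone added through registration that should be convertible, because it uses only plain operators with no control flow. The class name `ToyBackbone` is hypothetical; the registry API is detectron2's:

```
import torch.nn as nn
from detectron2.modeling import BACKBONE_REGISTRY, Backbone, ShapeSpec

@BACKBONE_REGISTRY.register()
class ToyBackbone(Backbone):
    def __init__(self, cfg, input_shape):
        super().__init__()
        # plain convolution only: the traced graph maps cleanly to Caffe2 operators
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=16, padding=3)

    def forward(self, image):
        return {"conv1": self.conv1(image)}

    def output_shape(self):
        return {"conv1": ShapeSpec(channels=64, stride=16)}
```

Selecting it with `cfg.MODEL.BACKBONE.NAME = "ToyBackbone"` is enough for the conversion to pick it up, since tracing only sees the operators the model actually executes.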
### Usage

The conversion APIs are documented at [the API documentation](../modules/export).
We provide `caffe2_converter.py` as an example tool that uses
these APIs to convert a standard model.

To convert an official Mask R-CNN trained on COCO, first
[prepare the COCO dataset](builtin_datasets.md), then pick the model from the [Model Zoo](../../MODEL_ZOO.md), and run:
```
cd tools/deploy/ && ./caffe2_converter.py --config-file ../../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml \
	--output ./caffe2_model --run-eval \
	MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl \
	MODEL.DEVICE cpu
```
Note that:

1. The conversion needs valid weights & sample inputs to trace the model, which is why the script requires the dataset.
   You can modify the script to obtain sample inputs in other ways; the sketch below shows one way to drive the same APIs directly.
2. With the `--run-eval` flag, it will evaluate the converted model to verify its accuracy.
   The accuracy is typically slightly different (within 0.1 AP) from PyTorch, due to
   numerical precision differences between the two implementations.
   It's recommended to always verify the accuracy, in case the conversion was not successful.
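For reference, a minimal sketch of driving the conversion from Python instead of through the script, assuming the `Caffe2Tracer` and `add_export_config` APIs from `detectron2.export`; the all-zero sample input is a simplification (a real image from the dataset traces more representative code paths):

```
import torch
from detectron2 import model_zoo
from detectron2.checkpoint import DetectionCheckpointer
from detectron2.config import get_cfg
from detectron2.export import Caffe2Tracer, add_export_config
from detectron2.modeling import build_model

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.DEVICE = "cpu"
cfg = add_export_config(cfg)  # add export-specific options to the config

model = build_model(cfg)
DetectionCheckpointer(model).load(model_zoo.get_checkpoint_url(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
model.eval()

# one sample input in the standard detectron2 format, used to trace the model
sample_inputs = [{"image": torch.zeros(3, 800, 800)}]
caffe2_model = Caffe2Tracer(cfg, model, sample_inputs).export_caffe2()
caffe2_model.save_protobuf("./caffe2_model")  # writes model.pb and model_init.pb
```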
The converted model is available in the specified `caffe2_model/` directory. Two files, `model.pb`
and `model_init.pb`, which contain the network structure and the network parameters respectively, are necessary for deployment.
These files can then be loaded in C++ or Python using Caffe2's APIs.

The script also generates a `model.svg` file that contains a visualization of the network.
You can also load `model.pb` into tools such as [netron](https://github.com/lutzroeder/netron) to visualize it.
### Use the model in C++/Python

The model can be loaded in C++. An example, [caffe2_mask_rcnn.cpp](../../tools/deploy/), is given,
which performs CPU/GPU inference using `COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x`.

The C++ example needs to be built with:

* PyTorch with caffe2 inside
* gflags, glog, opencv
* a protobuf library that matches the version used by PyTorch (3.6 for PyTorch 1.5, 3.11 for PyTorch 1.6)
* MKL headers, if caffe2 is built with MKL
The following compiles the example inside the [official detectron2 docker](../../docker/):
```
# install dependencies
sudo apt update && sudo apt install libgflags-dev libgoogle-glog-dev libopencv-dev
pip install mkl-include

# install the correct version of protobuf:
wget https://github.com/protocolbuffers/protobuf/releases/download/v3.11.4/protobuf-cpp-3.11.4.tar.gz && tar xf protobuf-cpp-3.11.4.tar.gz
cd protobuf-3.11.4
export CXXFLAGS=-D_GLIBCXX_USE_CXX11_ABI=0
./configure --prefix=$HOME/.local && make && make install
export CPATH=$HOME/.local/include
export LIBRARY_PATH=$HOME/.local/lib
export LD_LIBRARY_PATH=$HOME/.local/lib

# build the program:
export CMAKE_PREFIX_PATH=$HOME/.local/lib/python3.6/site-packages/torch/
mkdir build && cd build
cmake -DTORCH_CUDA_ARCH_LIST=$TORCH_CUDA_ARCH_LIST .. && make

# To run:
./caffe2_mask_rcnn --predict_net=./model.pb --init_net=./model_init.pb --input=input.jpg
```
Note that:
* All converted models (the .pb files) take two input tensors:
  "data" is an NCHW image, and "im_info" is an Nx3 tensor consisting of (height, width, 1.0) for
  each image (the shape of "data" might be larger than that in "im_info" due to padding).
  A sketch of feeding these tensors in Python follows this list.

* The converted models do not contain the post-processing operations that
  transform raw layer outputs into formatted predictions.
  For example, the command in this tutorial only produces raw outputs (28x28 masks) from the final
  layers, without post-processing, because in actual deployment an application often needs
  its own lightweight post-processing, so this step is left to users.
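To make the input format concrete, here is a minimal sketch of feeding the two tensors with Caffe2's Python `workspace` API. The file paths and `input.jpg` follow the commands above and are otherwise assumptions, and the image resizing that the model's normal preprocessing would apply is skipped for brevity:

```
import numpy as np
import cv2
from caffe2.proto import caffe2_pb2
from caffe2.python import workspace

def load_net(path):
    net = caffe2_pb2.NetDef()
    with open(path, "rb") as f:
        net.ParseFromString(f.read())
    return net

init_net = load_net("./caffe2_model/model_init.pb")
predict_net = load_net("./caffe2_model/model.pb")
workspace.RunNetOnce(init_net)   # creates and fills the parameter blobs
workspace.CreateNet(predict_net)

img = cv2.imread("input.jpg")    # BGR, HWC, uint8
h, w = img.shape[:2]
data = img.transpose(2, 0, 1)[np.newaxis].astype(np.float32)  # 1xCxHxW "data"
im_info = np.array([[h, w, 1.0]], dtype=np.float32)           # 1x3 "im_info"

workspace.FeedBlob("data", data)
workspace.FeedBlob("im_info", im_info)
workspace.RunNet(predict_net.name)
# raw outputs (boxes, scores, 28x28 masks, ...) can now be fetched by blob name
```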
To use the converted model in Python,
we provide a Python wrapper around the converted model, in the
[Caffe2Model.\_\_call\_\_](../modules/export.html#detectron2.export.Caffe2Model.__call__) method.
This method has an interface that's identical to the [PyTorch versions of models](./models.md),
and it internally applies pre/post-processing code to match the formats.
The wrapper can serve as a reference for how to use Caffe2's Python API,
or for how to implement pre/post-processing in actual deployment; a usage sketch follows.
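A minimal sketch of using this wrapper, assuming the model was saved with `save_protobuf` as above (`input.jpg` is again an assumption):

```
import cv2
import torch
from detectron2.export import Caffe2Model

model = Caffe2Model.load_protobuf("./caffe2_model")  # loads model.pb & model_init.pb
img = cv2.imread("input.jpg")                        # BGR, HWC, uint8
inputs = [{"image": torch.as_tensor(img.transpose(2, 0, 1).astype("float32"))}]
outputs = model(inputs)  # same input/output formats as the pytorch models
print(outputs[0]["instances"])
```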