[Doc]: Add document of TensorRT and onnxruntime custom ops (#920)

* update tensorrt plugin document

* add onnxruntime custom ops document

* format document

* add release note to onnxruntime_op and tensorrt_plugin

* update document

* add deployment.rst

* add grid_sampler onnxruntime document

* fix lint

* add allow_different_nesting tag

* add allow_different_nesting tag

Co-authored-by: maningsheng <maningsheng@sensetime.com>
pull/942/head
q.yao 2021-04-12 14:06:28 +08:00 committed by GitHub
parent 2fadb1a5ec
commit e478e9ff7f
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
7 changed files with 429 additions and 10 deletions

View File

@ -33,7 +33,8 @@ repos:
rev: 2.1.4
hooks:
- id: markdownlint
args: ["-r", "~MD002,~MD013,~MD029,~MD033,~MD034"]
args: ["-r", "~MD002,~MD013,~MD029,~MD033,~MD034",
"-t", "allow_different_nesting"]
- repo: https://github.com/myint/docformatter
rev: v1.3.1
hooks:

View File

@ -0,0 +1,11 @@
Deployment
========
.. toctree::
:maxdepth: 2
onnx.md
onnxruntime_op.md
onnxruntime_custom_ops.md
tensorrt_plugin.md
tensorrt_custom_ops.md

View File

@ -17,6 +17,7 @@ Contents
cnn.md
ops.md
build.md
deployment.rst
trouble_shooting.md
api.rst

View File

@ -0,0 +1,173 @@
# Onnxruntime Custom Ops
<!-- TOC -->
- [Onnxruntime Custom Ops](#onnxruntime-custom-ops)
- [SoftNMS](#softnms)
- [Description](#description)
- [Parameters](#parameters)
- [Inputs](#inputs)
- [Outputs](#outputs)
- [Type Constraints](#type-constraints)
- [RoIAlign](#roialign)
- [Description](#description-1)
- [Parameters](#parameters-1)
- [Inputs](#inputs-1)
- [Outputs](#outputs-1)
- [Type Constraints](#type-constraints-1)
- [NMS](#nms)
- [Description](#description-2)
- [Parameters](#parameters-2)
- [Inputs](#inputs-2)
- [Outputs](#outputs-2)
- [Type Constraints](#type-constraints-2)
- [grid_sampler](#grid_sampler)
- [Description](#description-3)
- [Parameters](#parameters-3)
- [Inputs](#inputs-3)
- [Outputs](#outputs-3)
- [Type Constraints](#type-constraints-3)
<!-- TOC -->
## SoftNMS
### Description
Perform soft NMS on `boxes` with `scores`. Read [Soft-NMS -- Improving Object Detection With One Line of Code](https://arxiv.org/abs/1704.04503) for detail.
### Parameters
| Type | Parameter | Description |
| ------- | --------------- | -------------------------------------------------------------- |
| `float` | `iou_threshold` | IoU threshold for NMS |
| `float` | `sigma` | hyperparameter for gaussian method |
| `float` | `min_score` | score filter threshold |
| `int` | `method` | method to do the nms, (0: `naive`, 1: `linear`, 2: `gaussian`) |
| `int` | `offset` | `boxes` width or height is (x2 - x1 + offset). (0 or 1) |
### Inputs
<dl>
<dt><tt>boxes</tt>: T</dt>
<dd>Input boxes. 2-D tensor of shape (N, 4). N is the number of boxes.</dd>
<dt><tt>scores</tt>: T</dt>
<dd>Input scores. 1-D tensor of shape (N, ).</dd>
</dl>
### Outputs
<dl>
<dt><tt>dets</tt>: tensor(int64)</dt>
<dd>Output boxes and scores. 2-D tensor of shape (num_valid_boxes, 5), [[x1, y1, x2, y2, score], ...]. num_valid_boxes is the number of valid boxes.</dd>
<dt><tt>indices</tt>: T</dt>
<dd>Output indices. 1-D tensor of shape (num_valid_boxes, ).</dd>
</dl>
### Type Constraints
- T:tensor(float32)
## RoIAlign
### Description
Perform RoIAlign on output feature, used in bbox_head of most two-stage detectors.
### Parameters
| Type | Parameter | Description |
| ------- | ---------------- | ------------------------------------------------------------------------------------------------------------- |
| `int` | `output_height` | height of output roi |
| `int` | `output_width` | width of output roi |
| `float` | `spatial_scale` | used to scale the input boxes |
| `int` | `sampling_ratio` | number of input samples to take for each output sample. `0` means to take samples densely for current models. |
| `str` | `mode` | pooling mode in each bin. `avg` or `max` |
| `int` | `aligned` | If `aligned=0`, use the legacy implementation in MMDetection. Else, align the results more perfectly. |
### Inputs
<dl>
<dt><tt>input</tt>: T</dt>
<dd>Input feature map; 4D tensor of shape (N, C, H, W), where N is the batch size, C is the numbers of channels, H and W are the height and width of the data.</dd>
<dt><tt>rois</tt>: T</dt>
<dd>RoIs (Regions of Interest) to pool over; 2-D tensor of shape (num_rois, 5) given as [[batch_index, x1, y1, x2, y2], ...]. The RoIs' coordinates are the coordinate system of input.</dd>
</dl>
### Outputs
<dl>
<dt><tt>feat</tt>: T</dt>
<dd>RoI pooled output, 4-D tensor of shape (num_rois, C, output_height, output_width). The r-th batch element feat[r-1] is a pooled feature map corresponding to the r-th RoI RoIs[r-1].<dd>
</dl>
### Type Constraints
- T:tensor(float32)
## NMS
### Description
Filter out boxes has high IoU overlap with previously selected boxes.
### Parameters
| Type | Parameter | Description |
| ------- | --------------- | ---------------------------------------------------------------------------------------------------------------- |
| `float` | `iou_threshold` | The threshold for deciding whether boxes overlap too much with respect to IoU. Value range [0, 1]. Default to 0. |
| `int` | `offset` | 0 or 1, boxes' width or height is (x2 - x1 + offset). |
### Inputs
<dl>
<dt><tt>bboxes</tt>: T</dt>
<dd>Input boxes. 2-D tensor of shape (num_boxes, 4). num_boxes is the number of input boxes.</dd>
<dt><tt>scores</tt>: T</dt>
<dd>Input scores. 1-D tensor of shape (num_boxes, ).</dd>
</dl>
### Outputs
<dl>
<dt><tt>indices</tt>: tensor(int32, Linear)</dt>
<dd>Selected indices. 1-D tensor of shape (num_valid_boxes, ). num_valid_boxes is the number of valid boxes.</dd>
</dl>
### Type Constraints
- T:tensor(float32)
## grid_sampler
### Description
Perform sample from `input` with pixel locations from `grid`.
### Parameters
| Type | Parameter | Description |
| ----- | -------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `int` | `interpolation_mode` | Interpolation mode to calculate output values. (0: `bilinear` , 1: `nearest`) |
| `int` | `padding_mode` | Padding mode for outside grid values. (0: `zeros`, 1: `border`, 2: `reflection`) |
| `int` | `align_corners` | If `align_corners=1`, the extrema (`-1` and `1`) are considered as referring to the center points of the input's corner pixels. If `align_corners=0`, they are instead considered as referring to the corner points of the input's corner pixels, making the sampling more resolution agnostic. |
### Inputs
<dl>
<dt><tt>input</tt>: T</dt>
<dd>Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the numbers of channels, inH and inW are the height and width of the data.</dd>
<dt><tt>grid</tt>: T</dt>
<dd>Input offset; 4-D tensor of shape (N, outH, outW, 2), where outH and outW is the height and width of offset and output. </dd>
</dl>
### Outputs
<dl>
<dt><tt>output</tt>: T</dt>
<dd>Output feature; 4-D tensor of shape (N, C, outH, outW).</dd>
</dl>
### Type Constraints
- T:tensor(float32, Linear)

View File

@ -15,10 +15,12 @@
## List of operators for ONNX Runtime supported in MMCV
| Operator | CPU | GPU | Note |
| :------: | :---: | :---: | :-------------------------------------------------------------------------------------------------: |
| SoftNMS | Y | N | commit [94810f](https://github.com/open-mmlab/mmcv/commit/94810f2297871d0ea3ca650dcb2e842f5374d998) |
| RoiAlign | Y | N | None |
| Operator | CPU | GPU | MMCV Releases |
| :----------------------------------------------------: | :---: | :---: | :-----------: |
| [SoftNMS](onnxruntime_custom_ops.md#softnms) | Y | N | 1.2.3 |
| [RoIAlign](onnxruntime_custom_ops.md#roialign) | Y | N | 1.2.5 |
| [NMS](onnxruntime_custom_ops.md#nms) | Y | N | 1.2.7 |
| [grid_sampler](onnxruntime_custom_ops.md#grid_sampler) | Y | N | master |
## How to build custom operators for ONNX Runtime

View File

@ -0,0 +1,229 @@
# TensorRT Custom Ops
<!-- TOC -->
- [TensorRT Custom Ops](#tensorrt-custom-ops)
- [MMCVRoIAlign](#mmcvroialign)
- [Description](#description)
- [Parameters](#parameters)
- [Inputs](#inputs)
- [Outputs](#outputs)
- [Type Constraints](#type-constraints)
- [ScatterND](#scatternd)
- [Description](#description-1)
- [Parameters](#parameters-1)
- [Inputs](#inputs-1)
- [Outputs](#outputs-1)
- [Type Constraints](#type-constraints-1)
- [NonMaxSuppression](#nonmaxsuppression)
- [Description](#description-2)
- [Parameters](#parameters-2)
- [Inputs](#inputs-2)
- [Outputs](#outputs-2)
- [Type Constraints](#type-constraints-2)
- [MMCVDeformConv2d](#mmcvdeformconv2d)
- [Description](#description-3)
- [Parameters](#parameters-3)
- [Inputs](#inputs-3)
- [Outputs](#outputs-3)
- [Type Constraints](#type-constraints-3)
- [grid_sampler](#grid_sampler)
- [Description](#description-4)
- [Parameters](#parameters-4)
- [Inputs](#inputs-4)
- [Outputs](#outputs-4)
- [Type Constraints](#type-constraints-4)
<!-- TOC -->
## MMCVRoIAlign
### Description
Perform RoIAlign on output feature, used in bbox_head of most two stage
detectors.
### Parameters
| Type | Parameter | Description |
| ------- | ---------------- | ------------------------------------------------------------------------------------------------------------- |
| `int` | `output_height` | height of output roi |
| `int` | `output_width` | width of output roi |
| `float` | `spatial_scale` | used to scale the input boxes |
| `int` | `sampling_ratio` | number of input samples to take for each output sample. `0` means to take samples densely for current models. |
| `str` | `mode` | pooling mode in each bin. `avg` or `max` |
| `int` | `aligned` | If `aligned=0`, use the legacy implementation in MMDetection. Else, align the results more perfectly. |
### Inputs
<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>Input feature map; 4D tensor of shape (N, C, H, W), where N is the batch size, C is the numbers of channels, H and W are the height and width of the data.</dd>
<dt><tt>inputs[1]</tt>: T</dt>
<dd>RoIs (Regions of Interest) to pool over; 2-D tensor of shape (num_rois, 5) given as [[batch_index, x1, y1, x2, y2], ...]. The RoIs' coordinates are the coordinate system of inputs[0].</dd>
</dl>
### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>RoI pooled output, 4-D tensor of shape (num_rois, C, output_height, output_width). The r-th batch element output[0][r-1] is a pooled feature map corresponding to the r-th RoI inputs[1][r-1].<dd>
</dl>
### Type Constraints
- T:tensor(float32, Linear)
## ScatterND
### Description
ScatterND takes three inputs `data` tensor of rank r >= 1, `indices` tensor of rank q >= 1, and `updates` tensor of rank q + r - indices.shape[-1] - 1. The output of the operation is produced by creating a copy of the input `data`, and then updating its value to values specified by updates at specific index positions specified by `indices`. Its output shape is the same as the shape of `data`. Note that `indices` should not have duplicate entries. That is, two or more updates for the same index-location is not supported.
The `output` is calculated via the following equation:
```python
output = np.copy(data)
update_indices = indices.shape[:-1]
for idx in np.ndindex(update_indices):
output[indices[idx]] = updates[idx]
```
### Parameters
None
### Inputs
<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>Tensor of rank r>=1.</dd>
<dt><tt>inputs[1]</tt>: tensor(int32, Linear)</dt>
<dd>Tensor of rank q>=1.</dd>
<dt><tt>inputs[2]</tt>: T</dt>
<dd>Tensor of rank q + r - indices_shape[-1] - 1.</dd>
</dl>
### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>Tensor of rank r >= 1.</dd>
</dl>
### Type Constraints
- T:tensor(float32, Linear), tensor(int32, Linear)
## NonMaxSuppression
### Description
Filter out boxes has high IoU overlap with previously selected boxes or low score. Output the indices of valid boxes. Indices of invalid boxes will be filled with -1.
### Parameters
| Type | Parameter | Description |
| ------- | ---------------------------- | ------------------------------------------------------------------------------------------------------------------------------------ |
| `int` | `center_point_box` | 0 - the box data is supplied as [y1, x1, y2, x2], 1-the box data is supplied as [x_center, y_center, width, height]. |
| `int` | `max_output_boxes_per_class` | The maximum number of boxes to be selected per batch per class. Default to 0, number of output boxes equal to number of input boxes. |
| `float` | `iou_threshold` | The threshold for deciding whether boxes overlap too much with respect to IoU. Value range [0, 1]. Default to 0. |
| `float` | `score_threshold` | The threshold for deciding when to remove boxes based on score. |
| `int` | `offset` | 0 or 1, boxes' width or height is (x2 - x1 + offset). |
### Inputs
<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>Input boxes. 3-D tensor of shape (num_batches, spatial_dimension, 4).</dd>
<dt><tt>inputs[1]</tt>: T</dt>
<dd>Input scores. 3-D tensor of shape (num_batches, num_classes, spatial_dimension).</dd>
</dl>
### Outputs
<dl>
<dt><tt>outputs[0]</tt>: tensor(int32, Linear)</dt>
<dd>Selected indices. 2-D tensor of shape (num_selected_indices, 3) as [[batch_index, class_index, box_index], ...].</dd>
<dd>num_selected_indices=num_batches* num_classes* min(max_output_boxes_per_class, spatial_dimension).</dd>
<dd>All invalid indices will be filled with -1.</dd>
</dl>
### Type Constraints
- T:tensor(float32, Linear)
## MMCVDeformConv2d
### Description
Perform Deformable Convolution on input feature, read [Deformable Convolutional Network](https://arxiv.org/abs/1703.06211) for detail.
### Parameters
| Type | Parameter | Description |
| -------------- | ------------------ | --------------------------------------------------------------------------------------------------------------------------------- |
| `list of ints` | `stride` | The stride of the convolving kernel. (sH, sW) |
| `list of ints` | `padding` | Paddings on both sides of the input. (padH, padW) |
| `list of ints` | `dilation` | The spacing between kernel elements. (dH, dW) |
| `int` | `deformable_group` | Groups of deformable offset. |
| `int` | `group` | Split input into groups. `input_channel` should be divisible by the number of groups. |
| `int` | `im2col_step` | DeformableConv2d use im2col to compute convolution. im2col_step is used to split input and offset, reduce memory usage of column. |
### Inputs
<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the numbers of channels, inH and inW are the height and width of the data.</dd>
<dt><tt>inputs[1]</tt>: T</dt>
<dd>Input offset; 4-D tensor of shape (N, deformable_group* 2* kH* kW, outH, outW), where kH and kW is the height and width of weight, outH and outW is the height and width of offset and output.</dd>
<dt><tt>inputs[2]</tt>: T</dt>
<dd>Input weight; 4-D tensor of shape (output_channel, input_channel, kH, kW).</dd>
</dl>
### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>Output feature; 4-D tensor of shape (N, output_channel, outH, outW).</dd>
</dl>
### Type Constraints
- T:tensor(float32, Linear)
## grid_sampler
### Description
Perform sample from `input` with pixel locations from `grid`.
### Parameters
| Type | Parameter | Description |
| ----- | -------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `int` | `interpolation_mode` | Interpolation mode to calculate output values. (0: `bilinear` , 1: `nearest`) |
| `int` | `padding_mode` | Padding mode for outside grid values. (0: `zeros`, 1: `border`, 2: `reflection`) |
| `int` | `align_corners` | If `align_corners=1`, the extrema (`-1` and `1`) are considered as referring to the center points of the input's corner pixels. If `align_corners=0`, they are instead considered as referring to the corner points of the input's corner pixels, making the sampling more resolution agnostic. |
### Inputs
<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the numbers of channels, inH and inW are the height and width of the data.</dd>
<dt><tt>inputs[1]</tt>: T</dt>
<dd>Input offset; 4-D tensor of shape (N, outH, outW, 2), where outH and outW is the height and width of offset and output. </dd>
</dl>
### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>Output feature; 4-D tensor of shape (N, C, outH, outW).</dd>
</dl>
### Type Constraints
- T:tensor(float32, Linear)

View File

@ -24,11 +24,13 @@ To ease the deployment of trained models with custom operators from `mmcv.ops` u
## List of TensorRT plugins supported in MMCV
| ONNX Operator | TensorRT Plugin | Note |
| :---------------: | :-------------------: | :---: |
| RoiAlign | MMCVRoiAlign | Y |
| ScatterND | ScatterND | Y |
| NonMaxSuppression | MMCVNonMaxSuppression | WIP |
| ONNX Operator | TensorRT Plugin | MMCV Releases |
| :---------------: | :-------------------------------------------------------------: | :-----------: |
| MMCVRoiAlign | [MMCVRoiAlign](./tensorrt_custom_ops.md#mmcvroialign) | 1.2.6 |
| ScatterND | [ScatterND](./tensorrt_custom_ops.md#scatternd) | 1.2.6 |
| NonMaxSuppression | [NonMaxSuppression](./tensorrt_custom_ops.md#nonmaxsuppression) | 1.3.0 |
| MMCVDeformConv2d | [MMCVDeformConv2d](./tensorrt_custom_ops.md#mmcvdeformconv2d) | 1.3.0 |
| grid_sampler | [grid_sampler](./tensorrt_custom_ops.md#grid-sampler) | master |
Notes