# Basic Deployment Guide
## Introduction of MMDeploy
MMDeploy is an open-source deep learning model deployment toolset. It is part of the [OpenMMLab](https://openmmlab.com/) project and provides **a unified experience of exporting different models** from the OpenMMLab series libraries to various platforms and devices. Using MMDeploy, developers can easily export the specific compiled SDK they need from their training results, which saves a lot of effort.
More detailed introductions and guides can be found [here](https://github.com/open-mmlab/mmdeploy/blob/dev-1.x/docs/en/get_started.md).
## Supported Algorithms
Currently, our deployment kit supports the following models and backends:

| Model | Task | OnnxRuntime | TensorRT | Model config |
| :----- | :-------------- | :---------: | :------: | :---------------------------------------------------------------------: |
| YOLOv5 | ObjectDetection | Y | Y | [config](https://github.com/open-mmlab/mmyolo/tree/main/configs/yolov5) |
| YOLOv6 | ObjectDetection | Y | Y | [config](https://github.com/open-mmlab/mmyolo/tree/main/configs/yolov6) |
| YOLOX | ObjectDetection | Y | Y | [config](https://github.com/open-mmlab/mmyolo/tree/main/configs/yolox) |
| RTMDet | ObjectDetection | Y | Y | [config](https://github.com/open-mmlab/mmyolo/tree/main/configs/rtmdet) |
Note: support for ncnn and other inference backends is coming soon.
## How to Write Config for MMYOLO
All config files related to the deployment are located at [`configs/deploy`](../../../configs/deploy/).
You only need to change the relevant data processing part in the model config file to support either static or dynamic input for your model. Besides, MMDeploy integrates the post-processing parts as customized ops, and you can modify their strategy via the `post_processing` parameter in `codebase_config`.
Here is a detailed description:
```python
codebase_config = dict(
    type='mmyolo',
    task='ObjectDetection',
    model_type='end2end',
    post_processing=dict(
        score_threshold=0.05,
        confidence_threshold=0.005,
        iou_threshold=0.5,
        max_output_boxes_per_class=200,
        pre_top_k=5000,
        keep_top_k=100,
        background_label_id=-1),
    module=['mmyolo.deploy'])
```
- `score_threshold`: the score threshold used to filter candidate bboxes before `nms`
- `confidence_threshold`: the confidence threshold used to filter candidate bboxes before `nms`
- `iou_threshold`: the `iou` threshold for removing duplicates in `nms`
- `max_output_boxes_per_class`: the maximum number of output bboxes per class
- `pre_top_k`: the fixed number of candidate bboxes kept before `nms`, sorted by score
- `keep_top_k`: the number of candidate bboxes kept after `nms`
- `background_label_id`: set to `-1` as MMYOLO has no background class information
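To make the order in which these parameters take effect concrete, here is a minimal, illustrative Python sketch built on `torchvision.ops.nms`. It is not MMDeploy's actual custom op: the real post-processing runs inside the exported model, and the per-class handling of `confidence_threshold` and `max_output_boxes_per_class` is omitted for brevity. It only mirrors the filter, top-k, nms, top-k sequence.
```python
from torchvision.ops import nms  # standard IoU-based NMS


def sketch_post_processing(boxes, scores,
                           score_threshold=0.05,
                           iou_threshold=0.5,
                           pre_top_k=5000,
                           keep_top_k=100):
    """Illustrative only: boxes is a float Tensor (N, 4), scores is (N,)."""
    # 1. `score_threshold` drops low-scoring candidates before nms.
    keep = scores > score_threshold
    boxes, scores = boxes[keep], scores[keep]
    # 2. `pre_top_k` caps how many candidates enter nms, sorted by score.
    if scores.numel() > pre_top_k:
        scores, idx = scores.topk(pre_top_k)
        boxes = boxes[idx]
    # 3. `iou_threshold` controls duplicate removal inside nms.
    # 4. `keep_top_k` caps the final outputs (nms returns indices sorted
    #    by descending score, so slicing keeps the best detections).
    kept = nms(boxes, scores, iou_threshold)[:keep_top_k]
    return boxes[kept], scores[kept]
```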
### Configuration for Static Inputs
#### 1. Model Config
Taking `YOLOv5` of MMYOLO as an example, here are the details:
```python
_base_ = '../../yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py'

test_pipeline = [
    dict(type='LoadImageFromFile', file_client_args=_base_.file_client_args),
    dict(
        type='LetterResize',
        scale=_base_.img_scale,
        allow_scale_up=False,
        use_mini_pad=False,
    ),
    dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),
    dict(
        type='mmdet.PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                   'scale_factor', 'pad_param'))
]

test_dataloader = dict(
    dataset=dict(pipeline=test_pipeline, batch_shapes_cfg=None))
```
`_base_ = '../../yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py'` inherits the model config from the training stage.
`test_pipeline` adds the data processing pipeline for deployment. `LetterResize` controls the size of the input images and hence the input shape of the converted model.
`test_dataloader` adds the dataloader config for deployment. `batch_shapes_cfg` decides whether to use the `batch_shapes` strategy. More details can be found at [yolov5 configs](../user_guides/config.md).
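As an illustration, to export a static model with a different fixed input size, you can replace the inherited `img_scale` with an explicit value in `LetterResize`. The `(1280, 1280)` below is a hypothetical choice; note that list-type fields are overridden wholesale during config inheritance, so the whole `test_pipeline` has to be restated:
```python
_base_ = '../../yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py'

test_pipeline = [
    dict(type='LoadImageFromFile', file_client_args=_base_.file_client_args),
    dict(
        type='LetterResize',
        scale=(1280, 1280),  # hypothetical fixed input size
        allow_scale_up=False,
        use_mini_pad=False,
    ),
    dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),
    dict(
        type='mmdet.PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                   'scale_factor', 'pad_param'))
]

test_dataloader = dict(
    dataset=dict(pipeline=test_pipeline, batch_shapes_cfg=None))
```
When targeting `TensorRT`, remember to change the input shapes in the deployment config to match.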
#### 2. Deployment Config
Here we still use the `YOLOv5` in MMYOLO as the example. We can use [`detection_onnxruntime_static.py`](https://github.com/open-mmlab/mmyolo/blob/main/configs/deploy/detection_onnxruntime_static.py) as the config to deploy `YOLOv5` to `ONNXRuntime` with static inputs.
```python
_base_ = ['./base_static.py']

codebase_config = dict(
    type='mmyolo',
    task='ObjectDetection',
    model_type='end2end',
    post_processing=dict(
        score_threshold=0.05,
        confidence_threshold=0.005,
        iou_threshold=0.5,
        max_output_boxes_per_class=200,
        pre_top_k=5000,
        keep_top_k=100,
        background_label_id=-1),
    module=['mmyolo.deploy'])

backend_config = dict(type='onnxruntime')
```
`backend_config` indicates the deployment backend with `type='onnxruntime'`; the other parameters are described in the section above.
To deploy the `YOLOv5` to `TensorRT`, please refer to the [`detection_tensorrt_static-640x640.py`](https://github.com/open-mmlab/mmyolo/blob/main/configs/deploy/detection_tensorrt_static-640x640.py) as follows.
```python
_base_ = ['./base_static.py']

onnx_config = dict(input_shape=(640, 640))

backend_config = dict(
    type='tensorrt',
    common_config=dict(fp16_mode=False, max_workspace_size=1 << 30),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 640, 640],
                    opt_shape=[1, 3, 640, 640],
                    max_shape=[1, 3, 640, 640])))
    ])

use_efficientnms = False
```
`backend_config` indicates the backend with `type='tensorrt'`.
Different from `ONNXRuntime` deployment configuration, `TensorRT` needs to specify the input image size and the parameters required to build the engine file, including:
- `onnx_config` specifies the input shape as `input_shape=(640, 640)`
- `fp16_mode=False` and `max_workspace_size=1 << 30` in `backend_config['common_config']` indicate whether to build the engine with `fp16` parameters and the maximum GPU memory available to the builder on the current device, respectively. `1 << 30` bytes equals 1 GB. For a detailed `fp16` configuration, please refer to [`detection_tensorrt-fp16_static-640x640.py`](https://github.com/open-mmlab/mmyolo/blob/main/configs/deploy/detection_tensorrt-fp16_static-640x640.py)
- The `min_shape`/`opt_shape`/`max_shape` in `backend_config['model_inputs'][0]['input_shapes']['input']` must remain the same for static inputs; the default is `[1, 3, 640, 640]`.
`use_efficientnms` is a new configuration introduced by the `MMYOLO` series, indicating whether to enable the `Efficient NMS Plugin` to replace the `TRTBatchedNMS` plugin in `MMDeploy` when exporting `onnx`.
You can refer to the official [efficient NMS plugins](https://github.com/NVIDIA/TensorRT/blob/main/plugin/efficientNMSPlugin/README.md) documentation by `TensorRT` for more details.
Note: this out-of-the-box feature is **only available in TensorRT>=8.0**; there is no need to compile it yourself.
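For example, combining the two options just described, a hypothetical static `TensorRT` config with both `fp16` and the `Efficient NMS Plugin` enabled differs from the one above in only two flags (a sketch; verify that your TensorRT version is >= 8.0 before enabling the plugin):
```python
_base_ = ['./base_static.py']

onnx_config = dict(input_shape=(640, 640))

backend_config = dict(
    type='tensorrt',
    # Build the engine with half-precision parameters.
    common_config=dict(fp16_mode=True, max_workspace_size=1 << 30),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 640, 640],
                    opt_shape=[1, 3, 640, 640],
                    max_shape=[1, 3, 640, 640])))
    ])

# Replace TRTBatchedNMS with TensorRT's Efficient NMS Plugin (TensorRT >= 8.0).
use_efficientnms = True
```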
### Configuration for Dynamic Inputs
#### 1. Model Config
When you deploy a model with dynamic inputs, you do not need to modify the model configuration files, only the deployment configuration files.
#### 2. Deployment Config
To deploy the `YOLOv5` in MMYOLO to `ONNXRuntime`, please refer to the [`detection_onnxruntime_dynamic.py`](https://github.com/open-mmlab/mmyolo/blob/main/configs/deploy/detection_onnxruntime_dynamic.py).
```python
_base_ = ['./base_dynamic.py']

codebase_config = dict(
    type='mmyolo',
    task='ObjectDetection',
    model_type='end2end',
    post_processing=dict(
        score_threshold=0.05,
        confidence_threshold=0.005,
        iou_threshold=0.5,
        max_output_boxes_per_class=200,
        pre_top_k=5000,
        keep_top_k=100,
        background_label_id=-1),
    module=['mmyolo.deploy'])

backend_config = dict(type='onnxruntime')
```
`backend_config` indicates the backend with `type='onnxruntime'`. Other parameters stay the same as the static input section.
To deploy the `YOLOv5` to `TensorRT`, please refer to the [`detection_tensorrt_dynamic-192x192-960x960.py`](https://github.com/open-mmlab/mmyolo/blob/main/configs/deploy/detection_tensorrt_dynamic-192x192-960x960.py).
```python
_base_ = ['./base_dynamic.py']

backend_config = dict(
    type='tensorrt',
    common_config=dict(fp16_mode=False, max_workspace_size=1 << 30),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 192, 192],
                    opt_shape=[1, 3, 640, 640],
                    max_shape=[1, 3, 960, 960])))
    ])

use_efficientnms = False
```
`backend_config` indicates the backend with `type='tensorrt'`. Since the dynamic and static inputs are different in `TensorRT`, please check the details at [TensorRT dynamic input official introduction](https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-843/developer-guide/index.html#work_dynamic_shapes).
`TensorRT` deployment requires you to specify `min_shape`, `opt_shape`, and `max_shape`. `TensorRT` restricts the size of the input image to the range between `min_shape` and `max_shape`.
`min_shape` is the minimum size of the input image. `opt_shape` is the most common size of the input image, at which inference performance is best. `max_shape` is the maximum size of the input image.
`use_efficientnms` configuration is the same as the `TensorRT` static input configuration in the previous section.
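If your deployment scenario covers a different resolution range, adjust the three shapes accordingly. The sizes below are hypothetical; the only constraint is `min_shape <= opt_shape <= max_shape` per dimension:
```python
_base_ = ['./base_dynamic.py']

backend_config = dict(
    type='tensorrt',
    common_config=dict(fp16_mode=False, max_workspace_size=1 << 30),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    # Inputs outside [min_shape, max_shape] are rejected;
                    # inference is fastest near opt_shape.
                    min_shape=[1, 3, 320, 320],
                    opt_shape=[1, 3, 640, 640],
                    max_shape=[1, 3, 1280, 1280])))
    ])

use_efficientnms = False
```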
### INT8 Quantization Support
Note: Int8 quantization support will soon be released.
## How to Convert Model
### Usage
Set the root directory of `MMDeploy` as the environment variable `MMDEPLOY_DIR` with the command `export MMDEPLOY_DIR=/the/root/path/of/MMDeploy`.
```shell
python3 ${MMDEPLOY_DIR}/tools/deploy.py \
    ${DEPLOY_CFG_PATH} \
    ${MODEL_CFG_PATH} \
    ${MODEL_CHECKPOINT_PATH} \
    ${INPUT_IMG} \
    --test-img ${TEST_IMG} \
    --work-dir ${WORK_DIR} \
    --calib-dataset-cfg ${CALIB_DATA_CFG} \
    --device ${DEVICE} \
    --log-level INFO \
    --show \
    --dump-info
```
### Parameter Description
- `deploy_cfg`: set the deployment config path of MMDeploy for the model, including the type of inference framework, whether to quantize, whether the input shape is dynamic, etc. Configuration files may reference one another, e.g. `configs/deploy/detection_onnxruntime_static.py`
- `model_cfg`: set the MMYOLO model config path, e.g. `configs/deploy/model/yolov5_s-deploy.py`, independent of the path to MMDeploy
- `checkpoint`: set the torch model path. It can start with `http/https`; more details are available in the `mmengine.fileio` apis
- `img`: set the path to the image or point cloud file used for testing during model conversion
- `--test-img`: set the image file used to test the model. If not specified, it will be set to `None`
- `--work-dir`: set the work directory used to save logs and models
- `--calib-dataset-cfg`: used for calibration in INT8 mode only. If not specified, it will be set to `None` and the `val` dataset in the model config will be used for calibration
- `--device`: set the device used for model conversion. The default is `cpu`; for TensorRT, use `cuda:0`
- `--log-level`: set the log level, one of `'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'`. If not specified, it will be set to `INFO`
- `--show`: show the result on screen or not
- `--dump-info`: output SDK information or not
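Putting the parameters together, a typical invocation converting `YOLOv5-s` to a static ONNXRuntime model might look like the following sketch. The checkpoint and image paths are hypothetical placeholders for your own files:
```shell
export MMDEPLOY_DIR=/the/root/path/of/MMDeploy

# checkpoints/yolov5_s.pth and demo/demo.jpg are hypothetical placeholders.
python3 ${MMDEPLOY_DIR}/tools/deploy.py \
    configs/deploy/detection_onnxruntime_static.py \
    configs/deploy/model/yolov5_s-deploy.py \
    checkpoints/yolov5_s.pth \
    demo/demo.jpg \
    --work-dir work_dirs/yolov5_s_onnx \
    --device cpu \
    --dump-info
```
Run it from the MMYOLO root directory so that the relative config paths resolve.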
## How to Evaluate Model
### Usage
After the model is converted to your backend, you can use `${MMDEPLOY_DIR}/tools/test.py` to evaluate the performance.
```shell
python3 ${MMDEPLOY_DIR}/tools/test.py \
    ${DEPLOY_CFG} \
    ${MODEL_CFG} \
    --model ${BACKEND_MODEL_FILES} \
    [--out ${OUTPUT_PKL_FILE}] \
    [--format-only] \
    [--metrics ${METRICS}] \
    [--show] \
    [--show-dir ${OUTPUT_IMAGE_DIR}] \
    [--show-score-thr ${SHOW_SCORE_THR}] \
    --device ${DEVICE} \
    [--cfg-options ${CFG_OPTIONS}] \
    [--metric-options ${METRIC_OPTIONS}] \
    [--log2file work_dirs/output.txt] \
    [--batch-size ${BATCH_SIZE}] \
    [--speed-test] \
    [--warmup ${WARM_UP}] \
    [--log-interval ${LOG_INTERVAL}]
```
### Parameter Description
- `deploy_cfg`: set the deployment config file path
- `model_cfg`: set the MMYOLO model config file path
- `--model`: set the converted model. For example, if we exported a TensorRT model, we need to pass in the file path with the suffix ".engine"
- `--out`: save the output result in pickle format, use only when you need it
- `--format-only`: format the output without evaluating it. It is useful when you want to format the result into a specific format and submit it to a test server
- `--metrics`: use the specific metric supported in MMYOLO to evaluate, such as "proposal" in COCO format data.
- `--show`: show the evaluation result on screen or not
- `--show-dir`: save the evaluation result to this directory, valid only when specified
- `--show-score-thr`: set the score threshold above which detected bboxes are shown
- `--device`: indicate the device to run the model. Note that some backends limit the running devices. For example, TensorRT must run on CUDA
- `--cfg-options`: pass in additional configs, which will override the current deployment configs
- `--metric-options`: add custom options for metrics. The key-value pairs in `xxx=yyy` format will be the kwargs of the `dataset.evaluate()` method
- `--log2file`: save the evaluation results (with the speed) to a file
- `--batch-size`: set the batch size for inference, which will override the `samples_per_gpu` in data config. The default value is `1`, however, not every model supports `batch_size > 1`
- `--speed-test`: test the inference speed or not
- `--warmup`: warm up before speed test or not, works only when `speed-test` is specified
- `--log-interval`: set the interval between each log, works only when `speed-test` is specified
Note: other parameters in `${MMDEPLOY_DIR}/tools/test.py` are used for speed testing and do not affect the evaluation results.
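For instance, evaluating the static TensorRT model from the conversion step with a speed test could look like the following sketch (the engine path is a placeholder; MMDeploy typically writes the converted model as `end2end.engine` under `--work-dir`):
```shell
python3 ${MMDEPLOY_DIR}/tools/test.py \
    configs/deploy/detection_tensorrt_static-640x640.py \
    configs/deploy/model/yolov5_s-deploy.py \
    --model work_dirs/yolov5_s_trt/end2end.engine \
    --device cuda:0 \
    --speed-test \
    --warmup 10 \
    --log-interval 50
```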