mmdeploy/benchmark.md at d7adf815a0e65303b78b226548da8ccf30f7d103

mirror of https://github.com/open-mmlab/mmdeploy.git synced 2025-01-14 08:09:43 +08:00

* bump version to v0.4.0

* [Enhancement] Make rewriter more powerful (#150)

* Finish function tests

* lint

* resolve comments

* Fix tests

* docstring & fix

* Complement informations

* lint

* Add example

* Fix version

* Remove todo

Co-authored-by: RunningLeon <mnsheng@yeah.net>

* Torchscript support (#159)

* support torchscript

* add nms

* add torchscript configs and update deploy process and dump-info

* typescript -> torchscript

* add torchscript custom extension support

* add ts custom ops again

* support mmseg unet

* [WIP] add optimizer for torchscript (#119)

* add passes

* add python api

* Torchscript optimizer python api (#121)

* add passes

* add python api

* use python api instead of executable

* Merge Master, update optimizer (#151)

* [Feature] add yolox ncnn (#29)

* add yolox ncnn

* add ncnn android performance of yolox

* add ut

* fix lint

* fix None bugs for ncnn

* test codecov

* test codecov

* add device

* fix yapf

* remove if-else for img shape

* use channelshuffle optimize

* change benchmark after channelshuffle

* fix yapf

* fix yapf

* fuse continuous reshape

* fix static shape deploy

* fix code

* drop pad

* only static shape

* fix static

* fix docstring

* Added mask overlay to output image, changed fprintf info messages to … (#55)

* Added mask overlay to output image, changed fprintf info messages to stdout

* Improved box filtering (filter area/score), make sure roi coordinates stay within bounds

* clang-format

* Support UNet in mmseg (#77)

* Repeatdataset in train has no CLASSES & PALETTE

* update result for unet

* update docstring for mmdet

* remove ppl for unet in docs

* fix ort wrap about input type (#81)

* Fix memleak (#86)

* delete []

* fix build error when enble MMDEPLOY_ACTIVE_LEVEL

* fix lint

* [Doc] Nano benchmark and tutorial (#71)

* add cls benchmark

* add nano zh-cn benchmark and en tutorial

* add device row

* add doc path to index.rst

* fix typo

* [Fix] fix missing deploy_core (#80)

* fix missing deploy_core

* mv flag to demo

* target link

* [Docs] Fix links in Chinese doc (#84)

* Fix docs in Chinese link

* Fix links

* Delete symbolic link and add links to html

* delete files

* Fix link

* [Feature] Add docker files (#67)

* add gpu and cpu dockerfile

* fix lint

* fix cpu docker and remove redundant

* use pip instead

* add build arg and readme

* fix grammar

* update readme

* add chinese doc for dockerfile and add docker build to build.md

* grammar

* refine dockerfiles

* add FAQs

* update Dpplcv_DIR for SDK building

* remove mmcls

* add sdk demos

* fix typo and lint

* update FAQs

* [Fix]fix check_env (#101)

* fix check_env

* update

* Replace convert_syncbatchnorm in mmseg (#93)

* replace convert_syncbatchnorm with revert_sync_batchnorm from mmcv

* change logger

* [Doc] Update FAQ for TensorRT (#96)

* update FAQ

* comment

* [Docs]: Update doc for openvino installation (#102)

* fix docs

* fix docs

* fix docs

* fix mmcv version

* fix docs

* rm blank line

* simplify non batch nms (#99)

* [Enhacement] Allow test.py to save evaluation results (#108)

* Add log file

* Delete debug code

* Rename logger

* resolve comments

* [Enhancement] Support mmocr v0.4+ (#115)

* support mmocr v0.4+

* 0.4.0 -> 0.4.1

* fix onnxruntime wrapper for gpu inference (#123)

* fix ncnn wrapper for ort-gpu

* resolve comment

* fix lint

* Fix typo (#132)

* lock mmcls version (#131)

* [Enhancement] upgrade isort in pre-commit config (#141)

* [Enhancement] upgrade isort in pre-commit config by refering to mmflow pr #87

* fix lint

* remove .isort.cfg and put its known_third_party to setup.cfg

* Fix ci for mmocr (#144)

* fix mmocr unittests

* remove useless

* lock mmdet maximum version to 2.20

* pip install -U numpy

* Fix capture_output (#125)

Co-authored-by: hanrui1sensetime <83800577+hanrui1sensetime@users.noreply.github.com>
Co-authored-by: Johannes L <tehkillerbee@users.noreply.github.com>
Co-authored-by: RunningLeon <mnsheng@yeah.net>
Co-authored-by: VVsssssk <88368822+VVsssssk@users.noreply.github.com>
Co-authored-by: lvhan028 <lvhan_028@163.com>
Co-authored-by: AllentDan <41138331+AllentDan@users.noreply.github.com>
Co-authored-by: Yifan Zhou <singlezombie@163.com>
Co-authored-by: 杨培文 (Yang Peiwen) <915505626@qq.com>
Co-authored-by: Semyon Bevzyuk <semen.bevzuk@gmail.com>

* configs for all tasks

* use torchvision roi align

* remote unnecessary code

* fix ut

* fix ut

* export

* det dynamic

* det dynamic

* add ut

* fix ut

* add ut and docs

* fix ut

* skip torchscript ut if no ops available

* add torchscript option to build.md

* update benchmark and resolve comments

* resolve conflicts

* rename configs

* fix mrcnn cuda test

* remove useless

* add version requirements to docs and comments to codes

* enable empty image exporting for torchscript and accelerate ORT inference for MRCNN

* rebase

* update example for torchscript.md

* update FAQs for torchscript.md

* resolve comments

* only use torchvision roi_align for torchscript

* fix ut

* use torchvision roi align when pool model is avg

* resolve comments

Co-authored-by: grimoire <streetyao@live.com>
Co-authored-by: grimoire <yaoqian@sensetime.com>
Co-authored-by: hanrui1sensetime <83800577+hanrui1sensetime@users.noreply.github.com>
Co-authored-by: Johannes L <tehkillerbee@users.noreply.github.com>
Co-authored-by: RunningLeon <mnsheng@yeah.net>
Co-authored-by: VVsssssk <88368822+VVsssssk@users.noreply.github.com>
Co-authored-by: lvhan028 <lvhan_028@163.com>
Co-authored-by: Yifan Zhou <singlezombie@163.com>
Co-authored-by: 杨培文 (Yang Peiwen) <915505626@qq.com>
Co-authored-by: Semyon Bevzyuk <semen.bevzuk@gmail.com>

* Update supported mmseg models (#181)

* fix ocrnet cascade decoder

* update mmseg support models

* update mmseg configs

* support emanet and icnet

* set max K of TopK for tensorrt

* update supported models for mmseg in docs

* add test for emamodule

* add configs and update docs

* Update docs

* update benchmark

* [Features]Support mmdet3d (#103)

* add mmdet3d code

* add code

* update code

* [log]This commit finish pointpillar export and evaluate on onnxruntime.The model is sample with nvidia repo model

* add tensorrt config

* fix config

* update

* support for tensorrt

* add config

* fix config`

* fix apis about torch2onnx

* update

* mmdet3d deploy version1.0

* map is ok

* fix code

* version1.0

* fix code

* fix visual

* fix bug

* tensorrt support success

* add docstring

* add docs

* fix docs

* fix comments

* fix comment

* fix comment

* fix openvino wrapper

* add unit test

* fix device about cpu

* fix comment

* fix show_result

* fix lint

* fix requirments

* remove ci about det3d

* fix ut

* add ut data

* support for new version pointpillars

* fix comment

* fix support_list

* fix comments

* fix config name

* [Enhancement] Update pad logic in detection heads (#168)

* pad with register

* fix lint

Co-authored-by: AllentDan <dongchunyu@sensetime.com>

* [Enhancement] Additional arguments support for OpenVINO Model Optimizer (#178)

* Add mo args.

* [Docs]: update docs and argument descriptions (#196)

* bump version to v0.4.0

* update docs and argument descriptions

* revert version change

* fix unnecessary change of config for dynamic exportation (#199)

* fix mmcls get classes (#215)

* fix mmcls get classes

* resolve comment

* resolve comment

* Add ModelOptimizerOptions.

* Fix merge bugs.

* Update mmpose.md (#224)

* [Dostring]add example in apis docstring (#214)

* add example in apis docstring

* add backend example in docstring

* rm blank line

* Fixed get_mo_options_from_cfg args

* fix l2norm test

Co-authored-by: RunningLeon <mnsheng@yeah.net>
Co-authored-by: Haofan Wang <frankmiracle@outlook.com>
Co-authored-by: VVsssssk <88368822+VVsssssk@users.noreply.github.com>
Co-authored-by: grimoire <yaoqian@sensetime.com>

* [Enhancement] Switch to statically typed Value::Any (#209)

* replace std::any with StaticAny

* fix __compare_typeid

* remove fallback id support

* constraint on traits::TypeId<T>::value

* fix includes

* [Enhancement] TensorRT DCN support (#205)

* add tensorrt dcn support

* fix lint

* remove roi_align plugin for ORT (#258)

* remove roi_align plugin

* remove ut

* skip single_roi_extractor UT for ORT in CI

* move align to symbolic and update docs

* recover UT

* resolve comments

* [Enhancement]: Support fcn_unet deployment with dynamic shape (#251)

* support mmseg fcn+unet dynamic shape

* add test

* fix ci

* fix units

* resolve comments

* [Enhancement] fix-cmake-relocatable (#223)

* require user to specify xxx_dir

* fix line ending

* fix end-of-file-fixer

* try to fix ld cudart cublas

* add ENV var search

* fix CMAKE_CUDA_COMPILER

* cpu, cuda should all work well

* remove commented code

* fix ncnn example find ncnn package (#282)

* table format is wrong (#283)

* update pre-commit (#284)

* update pre-commit

* fix clang-format

* fix mmseg config (#281)

* fix mmseg config

* fix mmpose evaluate outputs

* fix lint

* update pre-commit config

* fix lint

* Revert "update pre-commit config"

This reverts commit c3fd71611f0b79dfa9ad73fc0f4555c1b3563665.

* miss code symbol (#296)

* refactor cmake build (#295)

* add-mmpose-sdk (#259)

* Torchscript support (#159)

* support torchscript

* add nms

* add torchscript configs and update deploy process and dump-info

* typescript -> torchscript

* add torchscript custom extension support

* add ts custom ops again

* support mmseg unet

* [WIP] add optimizer for torchscript (#119)

* add passes

* add python api

* Torchscript optimizer python api (#121)

* add passes

* add python api

* use python api instead of executable

* Merge Master, update optimizer (#151)

* [Feature] add yolox ncnn (#29)

* add yolox ncnn

* add ncnn android performance of yolox

* add ut

* fix lint

* fix None bugs for ncnn

* test codecov

* test codecov

* add device

* fix yapf

* remove if-else for img shape

* use channelshuffle optimize

* change benchmark after channelshuffle

* fix yapf

* fix yapf

* fuse continuous reshape

* fix static shape deploy

* fix code

* drop pad

* only static shape

* fix static

* fix docstring

* Added mask overlay to output image, changed fprintf info messages to … (#55)

* Added mask overlay to output image, changed fprintf info messages to stdout

* Improved box filtering (filter area/score), make sure roi coordinates stay within bounds

* clang-format

* Support UNet in mmseg (#77)

* Repeatdataset in train has no CLASSES & PALETTE

* update result for unet

* update docstring for mmdet

* remove ppl for unet in docs

* fix ort wrap about input type (#81)

* Fix memleak (#86)

* delete []

* fix build error when enble MMDEPLOY_ACTIVE_LEVEL

* fix lint

* [Doc] Nano benchmark and tutorial (#71)

* add cls benchmark

* add nano zh-cn benchmark and en tutorial

* add device row

* add doc path to index.rst

* fix typo

* [Fix] fix missing deploy_core (#80)

* fix missing deploy_core

* mv flag to demo

* target link

* [Docs] Fix links in Chinese doc (#84)

* Fix docs in Chinese link

* Fix links

* Delete symbolic link and add links to html

* delete files

* Fix link

* [Feature] Add docker files (#67)

* add gpu and cpu dockerfile

* fix lint

* fix cpu docker and remove redundant

* use pip instead

* add build arg and readme

* fix grammar

* update readme

* add chinese doc for dockerfile and add docker build to build.md

* grammar

* refine dockerfiles

* add FAQs

* update Dpplcv_DIR for SDK building

* remove mmcls

* add sdk demos

* fix typo and lint

* update FAQs

* [Fix]fix check_env (#101)

* fix check_env

* update

* Replace convert_syncbatchnorm in mmseg (#93)

* replace convert_syncbatchnorm with revert_sync_batchnorm from mmcv

* change logger

* [Doc] Update FAQ for TensorRT (#96)

* update FAQ

* comment

* [Docs]: Update doc for openvino installation (#102)

* fix docs

* fix docs

* fix docs

* fix mmcv version

* fix docs

* rm blank line

* simplify non batch nms (#99)

* [Enhacement] Allow test.py to save evaluation results (#108)

* Add log file

* Delete debug code

* Rename logger

* resolve comments

* [Enhancement] Support mmocr v0.4+ (#115)

* support mmocr v0.4+

* 0.4.0 -> 0.4.1

* fix onnxruntime wrapper for gpu inference (#123)

* fix ncnn wrapper for ort-gpu

* resolve comment

* fix lint

* Fix typo (#132)

* lock mmcls version (#131)

* [Enhancement] upgrade isort in pre-commit config (#141)

* [Enhancement] upgrade isort in pre-commit config by refering to mmflow pr #87

* fix lint

* remove .isort.cfg and put its known_third_party to setup.cfg

* Fix ci for mmocr (#144)

* fix mmocr unittests

* remove useless

* lock mmdet maximum version to 2.20

* pip install -U numpy

* Fix capture_output (#125)

Co-authored-by: hanrui1sensetime <83800577+hanrui1sensetime@users.noreply.github.com>
Co-authored-by: Johannes L <tehkillerbee@users.noreply.github.com>
Co-authored-by: RunningLeon <mnsheng@yeah.net>
Co-authored-by: VVsssssk <88368822+VVsssssk@users.noreply.github.com>
Co-authored-by: lvhan028 <lvhan_028@163.com>
Co-authored-by: AllentDan <41138331+AllentDan@users.noreply.github.com>
Co-authored-by: Yifan Zhou <singlezombie@163.com>
Co-authored-by: 杨培文 (Yang Peiwen) <915505626@qq.com>
Co-authored-by: Semyon Bevzyuk <semen.bevzuk@gmail.com>

* configs for all tasks

* use torchvision roi align

* remote unnecessary code

* fix ut

* fix ut

* export

* det dynamic

* det dynamic

* add ut

* fix ut

* add ut and docs

* fix ut

* skip torchscript ut if no ops available

* add torchscript option to build.md

* update benchmark and resolve comments

* resolve conflicts

* rename configs

* fix mrcnn cuda test

* remove useless

* add version requirements to docs and comments to codes

* enable empty image exporting for torchscript and accelerate ORT inference for MRCNN

* rebase

* update example for torchscript.md

* update FAQs for torchscript.md

* resolve comments

* only use torchvision roi_align for torchscript

* fix ut

* use torchvision roi align when pool model is avg

* resolve comments

Co-authored-by: grimoire <streetyao@live.com>
Co-authored-by: grimoire <yaoqian@sensetime.com>
Co-authored-by: hanrui1sensetime <83800577+hanrui1sensetime@users.noreply.github.com>
Co-authored-by: Johannes L <tehkillerbee@users.noreply.github.com>
Co-authored-by: RunningLeon <mnsheng@yeah.net>
Co-authored-by: VVsssssk <88368822+VVsssssk@users.noreply.github.com>
Co-authored-by: lvhan028 <lvhan_028@163.com>
Co-authored-by: Yifan Zhou <singlezombie@163.com>
Co-authored-by: 杨培文 (Yang Peiwen) <915505626@qq.com>
Co-authored-by: Semyon Bevzyuk <semen.bevzuk@gmail.com>

* Update supported mmseg models (#181)

* fix ocrnet cascade decoder

* update mmseg support models

* update mmseg configs

* support emanet and icnet

* set max K of TopK for tensorrt

* update supported models for mmseg in docs

* add test for emamodule

* add configs and update docs

* Update docs

* update benchmark

* [Features]Support mmdet3d (#103)

* add mmdet3d code

* add code

* update code

* [log]This commit finish pointpillar export and evaluate on onnxruntime.The model is sample with nvidia repo model

* add tensorrt config

* fix config

* update

* support for tensorrt

* add config

* fix config`

* fix apis about torch2onnx

* update

* mmdet3d deploy version1.0

* map is ok

* fix code

* version1.0

* fix code

* fix visual

* fix bug

* tensorrt support success

* add docstring

* add docs

* fix docs

* fix comments

* fix comment

* fix comment

* fix openvino wrapper

* add unit test

* fix device about cpu

* fix comment

* fix show_result

* fix lint

* fix requirments

* remove ci about det3d

* fix ut

* add ut data

* support for new version pointpillars

* fix comment

* fix support_list

* fix comments

* fix config name

* [Enhancement] Additional arguments support for OpenVINO Model Optimizer (#178)

* Add mo args.

* [Docs]: update docs and argument descriptions (#196)

* bump version to v0.4.0

* update docs and argument descriptions

* revert version change

* fix unnecessary change of config for dynamic exportation (#199)

* fix mmcls get classes (#215)

* fix mmcls get classes

* resolve comment

* resolve comment

* Add ModelOptimizerOptions.

* Fix merge bugs.

* Update mmpose.md (#224)

* [Dostring]add example in apis docstring (#214)

* add example in apis docstring

* add backend example in docstring

* rm blank line

* Fixed get_mo_options_from_cfg args

* fix l2norm test

Co-authored-by: RunningLeon <mnsheng@yeah.net>
Co-authored-by: Haofan Wang <frankmiracle@outlook.com>
Co-authored-by: VVsssssk <88368822+VVsssssk@users.noreply.github.com>
Co-authored-by: grimoire <yaoqian@sensetime.com>

* add-mmpose-codebase

* fix ci

* fix img_shape after TopDownAffine

* rename TopDown module -> XheadDecode & implement regression decode

* align keypoints_from_heatmap

* remove hardcode keypoint_head, need refactor, current only support topdown config

* add mmpose python api

* update mmpose-python code

* can't clip fake box

* fix rebase error

* fix rebase error

* link mspn decoder to base decoder

* fix ci

* compile with gcc7.5

* remove no use code

* fix

* fix prompt

* remove unnecessary cv::parallel_for_

* rewrite TopdownHeatmapMultiStageHead.inference_model

* add comment

* add more detail docstring why use _cs2xyxy in sdk backend

* fix Registry name

* remove no use param & add comment of output result

Co-authored-by: AllentDan <41138331+AllentDan@users.noreply.github.com>
Co-authored-by: grimoire <streetyao@live.com>
Co-authored-by: grimoire <yaoqian@sensetime.com>
Co-authored-by: hanrui1sensetime <83800577+hanrui1sensetime@users.noreply.github.com>
Co-authored-by: Johannes L <tehkillerbee@users.noreply.github.com>
Co-authored-by: RunningLeon <mnsheng@yeah.net>
Co-authored-by: VVsssssk <88368822+VVsssssk@users.noreply.github.com>
Co-authored-by: lvhan028 <lvhan_028@163.com>
Co-authored-by: Yifan Zhou <singlezombie@163.com>
Co-authored-by: 杨培文 (Yang Peiwen) <915505626@qq.com>
Co-authored-by: Semyon Bevzyuk <semen.bevzuk@gmail.com>
Co-authored-by: Haofan Wang <frankmiracle@outlook.com>

* update faq about WinError 1455 (#297)

* update faq about WinError 1455

* Update faq.md

* Update faq.md

* fix ci

Co-authored-by: chenxin2 <chenxin2@sensetime.com>

* [Feature]Support centerpoint (#252)

* bump version to v0.4.0

* [Enhancement] Make rewriter more powerful (#150)

* Finish function tests

* lint

* resolve comments

* Fix tests

* docstring & fix

* Complement informations

* lint

* Add example

* Fix version

* Remove todo

Co-authored-by: RunningLeon <mnsheng@yeah.net>

* Torchscript support (#159)

* support torchscript

* add nms

* add torchscript configs and update deploy process and dump-info

* typescript -> torchscript

* add torchscript custom extension support

* add ts custom ops again

* support mmseg unet

* [WIP] add optimizer for torchscript (#119)

* add passes

* add python api

* Torchscript optimizer python api (#121)

* add passes

* add python api

* use python api instead of executable

* Merge Master, update optimizer (#151)

* [Feature] add yolox ncnn (#29)

* add yolox ncnn

* add ncnn android performance of yolox

* add ut

* fix lint

* fix None bugs for ncnn

* test codecov

* test codecov

* add device

* fix yapf

* remove if-else for img shape

* use channelshuffle optimize

* change benchmark after channelshuffle

* fix yapf

* fix yapf

* fuse continuous reshape

* fix static shape deploy

* fix code

* drop pad

* only static shape

* fix static

* fix docstring

* Added mask overlay to output image, changed fprintf info messages to … (#55)

* Added mask overlay to output image, changed fprintf info messages to stdout

* Improved box filtering (filter area/score), make sure roi coordinates stay within bounds

* clang-format

* Support UNet in mmseg (#77)

* Repeatdataset in train has no CLASSES & PALETTE

* update result for unet

* update docstring for mmdet

* remove ppl for unet in docs

* fix ort wrap about input type (#81)

* Fix memleak (#86)

* delete []

* fix build error when enble MMDEPLOY_ACTIVE_LEVEL

* fix lint

* [Doc] Nano benchmark and tutorial (#71)

* add cls benchmark

* add nano zh-cn benchmark and en tutorial

* add device row

* add doc path to index.rst

* fix typo

* [Fix] fix missing deploy_core (#80)

* fix missing deploy_core

* mv flag to demo

* target link

* [Docs] Fix links in Chinese doc (#84)

* Fix docs in Chinese link

* Fix links

* Delete symbolic link and add links to html

* delete files

* Fix link

* [Feature] Add docker files (#67)

* add gpu and cpu dockerfile

* fix lint

* fix cpu docker and remove redundant

* use pip instead

* add build arg and readme

* fix grammar

* update readme

* add chinese doc for dockerfile and add docker build to build.md

* grammar

* refine dockerfiles

* add FAQs

* update Dpplcv_DIR for SDK building

* remove mmcls

* add sdk demos

* fix typo and lint

* update FAQs

* [Fix]fix check_env (#101)

* fix check_env

* update

* Replace convert_syncbatchnorm in mmseg (#93)

* replace convert_syncbatchnorm with revert_sync_batchnorm from mmcv

* change logger

* [Doc] Update FAQ for TensorRT (#96)

* update FAQ

* comment

* [Docs]: Update doc for openvino installation (#102)

* fix docs

* fix docs

* fix docs

* fix mmcv version

* fix docs

* rm blank line

* simplify non batch nms (#99)

* [Enhacement] Allow test.py to save evaluation results (#108)

* Add log file

* Delete debug code

* Rename logger

* resolve comments

* [Enhancement] Support mmocr v0.4+ (#115)

* support mmocr v0.4+

* 0.4.0 -> 0.4.1

* fix onnxruntime wrapper for gpu inference (#123)

* fix ncnn wrapper for ort-gpu

* resolve comment

* fix lint

* Fix typo (#132)

* lock mmcls version (#131)

* [Enhancement] upgrade isort in pre-commit config (#141)

* [Enhancement] upgrade isort in pre-commit config by refering to mmflow pr #87

* fix lint

* remove .isort.cfg and put its known_third_party to setup.cfg

* Fix ci for mmocr (#144)

* fix mmocr unittests

* remove useless

* lock mmdet maximum version to 2.20

* pip install -U numpy

* Fix capture_output (#125)

Co-authored-by: hanrui1sensetime <83800577+hanrui1sensetime@users.noreply.github.com>
Co-authored-by: Johannes L <tehkillerbee@users.noreply.github.com>
Co-authored-by: RunningLeon <mnsheng@yeah.net>
Co-authored-by: VVsssssk <88368822+VVsssssk@users.noreply.github.com>
Co-authored-by: lvhan028 <lvhan_028@163.com>
Co-authored-by: AllentDan <41138331+AllentDan@users.noreply.github.com>
Co-authored-by: Yifan Zhou <singlezombie@163.com>
Co-authored-by: 杨培文 (Yang Peiwen) <915505626@qq.com>
Co-authored-by: Semyon Bevzyuk <semen.bevzuk@gmail.com>

* configs for all tasks

* use torchvision roi align

* remote unnecessary code

* fix ut

* fix ut

* export

* det dynamic

* det dynamic

* add ut

* fix ut

* add ut and docs

* fix ut

* skip torchscript ut if no ops available

* add torchscript option to build.md

* update benchmark and resolve comments

* resolve conflicts

* rename configs

* fix mrcnn cuda test

* remove useless

* add version requirements to docs and comments to codes

* enable empty image exporting for torchscript and accelerate ORT inference for MRCNN

* rebase

* update example for torchscript.md

* update FAQs for torchscript.md

* resolve comments

* only use torchvision roi_align for torchscript

* fix ut

* use torchvision roi align when pool model is avg

* resolve comments

Co-authored-by: grimoire <streetyao@live.com>
Co-authored-by: grimoire <yaoqian@sensetime.com>
Co-authored-by: hanrui1sensetime <83800577+hanrui1sensetime@users.noreply.github.com>
Co-authored-by: Johannes L <tehkillerbee@users.noreply.github.com>
Co-authored-by: RunningLeon <mnsheng@yeah.net>
Co-authored-by: VVsssssk <88368822+VVsssssk@users.noreply.github.com>
Co-authored-by: lvhan028 <lvhan_028@163.com>
Co-authored-by: Yifan Zhou <singlezombie@163.com>
Co-authored-by: 杨培文 (Yang Peiwen) <915505626@qq.com>
Co-authored-by: Semyon Bevzyuk <semen.bevzuk@gmail.com>

* Update supported mmseg models (#181)

* fix ocrnet cascade decoder

* update mmseg support models

* update mmseg configs

* support emanet and icnet

* set max K of TopK for tensorrt

* update supported models for mmseg in docs

* add test for emamodule

* add configs and update docs

* Update docs

* update benchmark

* [Features]Support mmdet3d (#103)

* add mmdet3d code

* add code

* update code

* [log]This commit finish pointpillar export and evaluate on onnxruntime.The model is sample with nvidia repo model

* add tensorrt config

* fix config

* update

* support for tensorrt

* add config

* fix config`

* fix apis about torch2onnx

* update

* mmdet3d deploy version1.0

* map is ok

* fix code

* version1.0

* fix code

* fix visual

* fix bug

* tensorrt support success

* add docstring

* add docs

* fix docs

* fix comments

* fix comment

* fix comment

* fix openvino wrapper

* add unit test

* fix device about cpu

* fix comment

* fix show_result

* fix lint

* fix requirments

* remove ci about det3d

* fix ut

* add ut data

* support for new version pointpillars

* fix comment

* fix support_list

* fix comments

* fix config name

* [Enhancement] Update pad logic in detection heads (#168)

* pad with register

* fix lint

Co-authored-by: AllentDan <dongchunyu@sensetime.com>

* [Enhancement] Additional arguments support for OpenVINO Model Optimizer (#178)

* Add mo args.

* [Docs]: update docs and argument descriptions (#196)

* bump version to v0.4.0

* update docs and argument descriptions

* revert version change

* fix unnecessary change of config for dynamic exportation (#199)

* fix mmcls get classes (#215)

* fix mmcls get classes

* resolve comment

* resolve comment

* Add ModelOptimizerOptions.

* Fix merge bugs.

* Update mmpose.md (#224)

* [Dostring]add example in apis docstring (#214)

* add example in apis docstring

* add backend example in docstring

* rm blank line

* Fixed get_mo_options_from_cfg args

* fix l2norm test

Co-authored-by: RunningLeon <mnsheng@yeah.net>
Co-authored-by: Haofan Wang <frankmiracle@outlook.com>
Co-authored-by: VVsssssk <88368822+VVsssssk@users.noreply.github.com>
Co-authored-by: grimoire <yaoqian@sensetime.com>

* [Enhancement] Switch to statically typed Value::Any (#209)

* replace std::any with StaticAny

* fix __compare_typeid

* remove fallback id support

* constraint on traits::TypeId<T>::value

* fix includes

* support for centerpoint

* [Enhancement] TensorRT DCN support (#205)

* add tensorrt dcn support

* fix lint

* add docstring and dcn model support

* add centerpoint ut and docs

* add config and fix input rank

* fix merge error

* fix a bug

* fix comment

* [Doc] update benchmark add supported-model-list (#286)

* update benchmark add supported-model-list

* fix lint

* fix lint

* loc mmocr maximum version

* fix ut

Co-authored-by: maningsheng <mnsheng@yeah.net>
Co-authored-by: Yifan Zhou <singlezombie@163.com>
Co-authored-by: AllentDan <41138331+AllentDan@users.noreply.github.com>
Co-authored-by: grimoire <streetyao@live.com>
Co-authored-by: grimoire <yaoqian@sensetime.com>
Co-authored-by: hanrui1sensetime <83800577+hanrui1sensetime@users.noreply.github.com>
Co-authored-by: Johannes L <tehkillerbee@users.noreply.github.com>
Co-authored-by: lvhan028 <lvhan_028@163.com>
Co-authored-by: 杨培文 (Yang Peiwen) <915505626@qq.com>
Co-authored-by: Semyon Bevzyuk <semen.bevzuk@gmail.com>
Co-authored-by: AllentDan <dongchunyu@sensetime.com>
Co-authored-by: Haofan Wang <frankmiracle@outlook.com>
Co-authored-by: lzhangzz <lzhang329@gmail.com>

Co-authored-by: maningsheng <mnsheng@yeah.net>
Co-authored-by: Yifan Zhou <singlezombie@163.com>
Co-authored-by: AllentDan <41138331+AllentDan@users.noreply.github.com>
Co-authored-by: grimoire <streetyao@live.com>
Co-authored-by: grimoire <yaoqian@sensetime.com>
Co-authored-by: hanrui1sensetime <83800577+hanrui1sensetime@users.noreply.github.com>
Co-authored-by: Johannes L <tehkillerbee@users.noreply.github.com>
Co-authored-by: VVsssssk <88368822+VVsssssk@users.noreply.github.com>
Co-authored-by: 杨培文 (Yang Peiwen) <915505626@qq.com>
Co-authored-by: Semyon Bevzyuk <semen.bevzuk@gmail.com>
Co-authored-by: AllentDan <dongchunyu@sensetime.com>
Co-authored-by: Haofan Wang <frankmiracle@outlook.com>
Co-authored-by: lzhangzz <lzhang329@gmail.com>
Co-authored-by: Chen Xin <xinchen.tju@gmail.com>
Co-authored-by: chenxin2 <chenxin2@sensetime.com>

2022-04-01 18:14:23 +08:00

58 KiB

Raw Blame History

Benchmark

Backends

CPU: ncnn, ONNXRuntime, OpenVINO

GPU: ncnn, TensorRT, PPLNN

Latency benchmark

Platform

Ubuntu 18.04
ncnn 20211208
Cuda 11.3
TensorRT 7.2.3.4
Docker 20.10.8
NVIDIA tesla T4 tensor core GPU for TensorRT.

Other settings

Static graph
Batch size 1
Synchronize devices after each inference.
We count the average inference performance of 100 images of the dataset.
Warm up. For ncnn, we warm up 30 iters for all codebases. As for other backends: for classification, we warm up 1010 iters; for other codebases, we warm up 10 iters.
Input resolution varies for different datasets of different codebases. All inputs are real images except for mmediting because the dataset is not large enough.

Users can directly test the speed through how_to_measure_performance_of_models.md. And here is the benchmark in our environment.

MMCls

MMCls			TensorRT												PPLNN		NCNN
Model	Dataset	Input	T4						JetsonNano2GB				Jetson TX2		T4		SnapDragon888		Adreno660		model config file
			fp32		fp16		int8		fp32		fp16		fp32		fp16		fp32		fp32
			latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS
ResNet	ImageNet	1x3x224x224	2.97	336.90	1.26	791.89	1.21	829.66	59.32	16.86	30.54	32.75	24.13	41.44	1.30	768.28	33.91	29.49	25.93	38.57	$MMCLS_DIR/configs/resnet/resnet50_b32x8_imagenet.py
ResNeXt	ImageNet	1x3x224x224	4.31	231.93	1.42	703.42	1.37	727.42	88.10	11.35	49.18	20.13	37.45	26.70	1.36	737.67	133.44	7.49	69.38	14.41	$MMCLS_DIR/configs/resnext/resnext50_32x4d_b32x8_imagenet.py
SE-ResNet	ImageNet	1x3x224x224	3.41	293.64	1.66	600.73	1.51	662.90	74.59	13.41	48.78	20.50	29.62	33.76	1.91	524.07	107.84	9.27	80.85	12.37	$MMCLS_DIR/configs/seresnet/seresnet50_b32x8_imagenet.py
ShuffleNetV2	ImageNet	1x3x224x224	1.37	727.94	1.19	841.36	1.13	883.47	15.26	65.54	10.23	97.77	7.37	135.73	4.69	213.33	9.55	104.71	10.66	93.81	$MMCLS_DIR/configs/shufflenet_v2/shufflenet_v2_1x_b64x16_linearlr_bn_nowd_imagenet.py

MMDet

MMDet			TensorRT								PPLNN
Model	Dataset	Input	T4						Jetson TX2		T4		model config file
			fp32		fp16		int8		fp32		fp16
			latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS
YOLOv3	COCO	1x3x320x320	14.76	67.76	24.92	40.13	24.92	40.13	-	-	18.07	55.35	$MMDET_DIR/configs/yolo/yolov3_d53_320_273e_coco.py
SSD-Lite	COCO	1x3x320x320	8.84	113.12	9.21	108.56	8.04	124.38	1.28	1.28	19.72	50.71	$MMDET_DIR/configs/ssd/ssdlite_mobilenetv2_scratch_600e_coco.py
RetinaNet	COCO	1x3x800x1344	97.09	10.30	25.79	38.78	16.88	59.23	780.48	1.28	38.34	26.08	$MMDET_DIR/configs/retinanet/retinanet_r50_fpn_1x_coco.py
FCOS	COCO	1x3x800x1344	84.06	11.90	23.15	43.20	17.68	56.57	-	-	-	-	$MMDET_DIR/configs/fcos/fcos_r50_caffe_fpn_gn-head_1x_coco.py
FSAF	COCO	1x3x800x1344	82.96	12.05	21.02	47.58	13.50	74.08	-	-	30.41	32.89	$MMDET_DIR/configs/fsaf/fsaf_r50_fpn_1x_coco.py
Faster-RCNN	COCO	1x3x800x1344	88.08	11.35	26.52	37.70	19.14	52.23	733.81	1.36	65.40	15.29	$MMDET_DIR/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py
Mask-RCNN	COCO	1x3x800x1344	104.83	9.54	58.27	17.16	-	-	-	-	86.80	11.52	$MMDET_DIR/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py

MMDet			NCNN
Model	Dataset	Input	SnapDragon888		Adreno660		model config file
			fp32		fp32
			latency (ms)	FPS	latency (ms)	FPS
MobileNetv2-YOLOv3	COCO	1x3x320x320	48.57	20.59	66.55	15.03	$MMDET_DIR/configs/yolo/yolov3_mobilenetv2_mstrain-416_300e_coco.py
SSD-Lite	COCO	1x3x320x320	44.91	22.27	66.19	15.11	$MMDET_DIR/configs/ssd/ssdlite_mobilenetv2_scratch_600e_coco.py
YOLOX	COCO	1x3x416x416	111.60	8.96	134.50	7.43	$MMDET_DIR/configs/yolox/yolox_tiny_8x8_300e_coco.py

MMEdit

MMEdit		TensorRT								PPLNN
Model	Input	T4						Jetson TX2		T4		model config file
		fp32		fp16		int8		fp32		fp16
		latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS
ESRGAN	1x3x32x32	12.64	79.14	12.42	80.50	12.45	80.35	-	-	7.67	130.39	$MMEDIT_DIR/configs/restorers/esrgan/esrgan_psnr_x4c64b23g32_g1_1000k_div2k.py
SRCNN	1x3x32x32	0.70	1436.47	0.35	2836.62	0.26	3850.45	58.86	16.99	0.56	1775.11	$MMEDIT_DIR/configs/restorers/srcnn/srcnn_x4k915_g1_1000k_div2k.py

MMOCR

MMOCR			TensorRT						PPLNN		NCNN
Model	Dataset	Input	T4						T4		SnapDragon888		Adreno660		model config file
			fp32		fp16		int8		fp16		fp32		fp32
			latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS
DBNet	ICDAR2015	1x3x640x640	10.70	93.43	5.62	177.78	5.00	199.85	34.84	28.70	-	-	-	-	$MMOCR_DIR/configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py
CRNN	IIIT5K	1x1x32x32	1.93	518.28	1.40	713.88	1.36	736.79	-	-	10.57	94.64	20.00	50.00	$MMOCR_DIR/configs/textrecog/crnn/crnn_academic_dataset.py

MMSeg

MMSeg			TensorRT								PPLNN
Model	Dataset	Input	T4						Jetson TX2		T4		model config file
			fp32		fp16		int8		fp32		fp16
			latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS
FCN	Cityscapes	1x3x512x1024	128.42	7.79	23.97	41.72	18.13	55.15	1682.54	0.59	27.00	37.04	$MMSEG_DIR/configs/fcn/fcn_r50-d8_512x1024_40k_cityscapes.py
PSPNet	Cityscapes	1x3x512x1024	119.77	8.35	24.10	41.49	16.33	61.23	1586.19	0.63	27.26	36.69	$MMSEG_DIR/configs/pspnet/pspnet_r50-d8_512x1024_80k_cityscapes.py
DeepLabV3	Cityscapes	1x3x512x1024	226.75	4.41	31.80	31.45	19.85	50.38	-	-	36.01	27.77	$MMSEG_DIR/configs/deeplabv3/deeplabv3_r50-d8_512x1024_80k_cityscapes.py
DeepLabV3+	Cityscapes	1x3x512x1024	151.25	6.61	47.03	21.26	50.38	26.67	2534.96	0.39	34.80	28.74	$MMSEG_DIR/configs/deeplabv3plus/deeplabv3plus_r50-d8_512x1024_80k_cityscapes.py

Performance benchmark

Users can directly test the performance through how_to_evaluate_a_model.md. And here is the benchmark in our environment.