mmdeploy/benchmark.md at fce37d4594fff4bce49781ff8edaaf42c72dbbc7

mirror of https://github.com/open-mmlab/mmdeploy.git synced 2025-01-14 08:09:43 +08:00

AllentDan 199253ce94

2021-12-23 11:25:45 +08:00


Benchmark

Backends

CPU: ncnn, ONNXRuntime, OpenVINO

GPU: TensorRT, PPLNN

Latency benchmark

Platform

Ubuntu 18.04
CUDA 11.3
TensorRT 7.2.3.4
Docker 20.10.8
NVIDIA Tesla T4 Tensor Core GPU for TensorRT

Other settings

Static graph
Batch size 1
Devices are synchronized after each inference.
Reported numbers are averaged over the inference of 100 images from the dataset.
Warm-up: 1010 iterations for classification, 10 iterations for the other codebases.
Input resolution varies across codebases and datasets. All inputs are real images, except for MMEditing, whose dataset is not large enough.
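
The settings above amount to a simple timing loop. The sketch below is a minimal illustration of that loop, not the script used to produce the numbers in this document: `measure_latency`, `infer`, and `sync` are hypothetical names, and `sync` stands in for whatever device synchronization call the backend provides.

```python
import time
from statistics import mean


def measure_latency(infer, inputs, warmup_iters=10, sync=None):
    """Return (mean latency in ms, FPS) for batch-size-1 inference.

    infer: hypothetical callable that runs the model on one input.
    sync:  optional callable that blocks until the device is idle.
    """
    inputs = list(inputs)
    if not inputs:
        raise ValueError("inputs must not be empty")

    def run_once(x):
        infer(x)
        if sync is not None:
            sync()  # wait for queued device work before reading the clock

    # Warm-up iterations are executed but never timed.
    for i in range(warmup_iters):
        run_once(inputs[i % len(inputs)])

    # Time every input individually, synchronizing after each inference.
    timings_ms = []
    for x in inputs:
        start = time.perf_counter()
        run_once(x)
        timings_ms.append((time.perf_counter() - start) * 1000.0)

    latency_ms = mean(timings_ms)
    fps = 1000.0 / latency_ms if latency_ms > 0 else float("inf")
    return latency_ms, fps
```

With a PyTorch CUDA model, for example, `sync` could be `torch.cuda.synchronize`. To mirror the settings listed here, one would pass 100 images and `warmup_iters=10` (or `1010` for classification).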

Users can measure inference speed themselves by following how_to_measure_performance_of_models.md. The results below were obtained in our environment.

MMCls

| Model | Dataset | Input | TensorRT fp32 latency (ms) | TensorRT fp32 FPS | TensorRT fp16 latency (ms) | TensorRT fp16 FPS | TensorRT int8 latency (ms) | TensorRT int8 FPS | PPLNN fp16 latency (ms) | PPLNN fp16 FPS | Model config file |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet | ImageNet | 1x3x224x224 | 2.97 | 336.90 | 1.26 | 791.89 | 1.21 | 829.66 | 1.30 | 768.28 | $MMCLS_DIR/configs/resnet/resnet50_b32x8_imagenet.py |
| ResNeXt | ImageNet | 1x3x224x224 | 4.31 | 231.93 | 1.42 | 703.42 | 1.37 | 727.42 | 1.36 | 737.67 | $MMCLS_DIR/configs/resnext/resnext50_32x4d_b32x8_imagenet.py |
| SE-ResNet | ImageNet | 1x3x224x224 | 3.41 | 293.64 | 1.66 | 600.73 | 1.51 | 662.90 | 1.91 | 524.07 | $MMCLS_DIR/configs/seresnet/seresnet50_b32x8_imagenet.py |
| ShuffleNetV2 | ImageNet | 1x3x224x224 | 1.37 | 727.94 | 1.19 | 841.36 | 1.13 | 883.47 | 4.69 | 213.33 | $MMCLS_DIR/configs/shufflenet_v2/shufflenet_v2_1x_b64x16_linearlr_bn_nowd_imagenet.py |
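
At batch size 1, the latency and FPS columns appear to be two views of the same measurement: FPS is roughly 1000 divided by the latency in milliseconds. The helper below is a small sketch of that conversion. The function name and the `batch_size` parameter are illustrative assumptions. Because the published latencies are rounded to two decimals, recomputed FPS values match the table only approximately.

```python
def fps_from_latency_ms(latency_ms, batch_size=1):
    """Convert a per-batch latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return batch_size * 1000.0 / latency_ms


# ResNet, TensorRT fp32: the table lists 2.97 ms and 336.90 FPS.
resnet_fps = fps_from_latency_ms(2.97)
print(f"{resnet_fps:.2f} FPS recomputed from the rounded latency")
```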

MMDet

| Model | Dataset | Input | TensorRT fp32 latency (ms) | TensorRT fp32 FPS | TensorRT fp16 latency (ms) | TensorRT fp16 FPS | TensorRT int8 latency (ms) | TensorRT int8 FPS | PPLNN fp16 latency (ms) | PPLNN fp16 FPS | Model config file |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| YOLOv3 | COCO | 1x3x800x1344 | 94.08 | 10.63 | 24.90 | 40.17 | 24.87 | 40.21 | 47.64 | 20.99 | $MMDET_DIR/configs/yolo/yolov3_d53_320_273e_coco.py |
| SSD-Lite | COCO | 1x3x800x1344 | 14.91 | 67.06 | 8.92 | 112.13 | 8.65 | 115.63 | 30.13 | 33.19 | $MMDET_DIR/configs/ssd/ssdlite_mobilenetv2_scratch_600e_coco.py |
| RetinaNet | COCO | 1x3x800x1344 | 97.09 | 10.30 | 25.79 | 38.78 | 16.88 | 59.23 | 38.34 | 26.08 | $MMDET_DIR/configs/retinanet/retinanet_r50_fpn_1x_coco.py |
| FCOS | COCO | 1x3x800x1344 | 84.06 | 11.90 | 23.15 | 43.20 | 17.68 | 56.57 | - | - | $MMDET_DIR/configs/fcos/fcos_r50_caffe_fpn_gn-head_1x_coco.py |
| FSAF | COCO | 1x3x800x1344 | 82.96 | 12.05 | 21.02 | 47.58 | 13.50 | 74.08 | 30.41 | 32.89 | $MMDET_DIR/configs/fsaf/fsaf_r50_fpn_1x_coco.py |
| Faster-RCNN | COCO | 1x3x800x1344 | 88.08 | 11.35 | 26.52 | 37.70 | 19.14 | 52.23 | 65.40 | 15.29 | $MMDET_DIR/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py |
| Mask-RCNN | COCO | 1x3x800x1344 | 320.86 | 3.12 | 241.32 | 4.14 | - | - | 86.80 | 11.52 | $MMDET_DIR/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py |
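
When comparing precisions or backends, it can help to turn two latency entries into a speedup ratio while skipping the entries marked `-`. The snippet below is an illustrative sketch only: `parse_latency` and `speedup` are hypothetical helpers, and the sample values are copied from the FCOS row above.

```python
def parse_latency(cell):
    """Parse a latency cell; the table uses '-' for missing results."""
    cell = cell.strip()
    return None if cell == "-" else float(cell)


def speedup(baseline_cell, candidate_cell):
    """Return baseline / candidate latency, or None if either is missing."""
    baseline = parse_latency(baseline_cell)
    candidate = parse_latency(candidate_cell)
    if baseline is None or candidate is None:
        return None
    return baseline / candidate


# FCOS row: TensorRT fp32 = 84.06 ms, TensorRT int8 = 17.68 ms, PPLNN fp16 = '-'.
print(speedup("84.06", "17.68"))  # ratio above 1 means int8 is faster
print(speedup("84.06", "-"))      # missing PPLNN result, so no ratio
```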

MMEdit

| Model | Input | TensorRT fp32 latency (ms) | TensorRT fp32 FPS | TensorRT fp16 latency (ms) | TensorRT fp16 FPS | TensorRT int8 latency (ms) | TensorRT int8 FPS | PPLNN fp16 latency (ms) | PPLNN fp16 FPS | Model config file |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ESRGAN | 1x3x32x32 | 12.64 | 79.14 | 12.42 | 80.50 | 12.45 | 80.35 | 7.67 | 130.39 | $MMEDIT_DIR/configs/restorers/esrgan/esrgan_psnr_x4c64b23g32_g1_1000k_div2k.py |
| SRCNN | 1x3x32x32 | 0.70 | 1436.47 | 0.35 | 2836.62 | 0.26 | 3850.45 | 0.56 | 1775.11 | $MMEDIT_DIR/configs/restorers/srcnn/srcnn_x4k915_g1_1000k_div2k.py |

MMOCR

| Model | Dataset | Input | TensorRT fp32 latency (ms) | TensorRT fp32 FPS | TensorRT fp16 latency (ms) | TensorRT fp16 FPS | TensorRT int8 latency (ms) | TensorRT int8 FPS | PPLNN fp16 latency (ms) | PPLNN fp16 FPS | Model config file |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| YOLOv3 | COCO | 1x3x800x1344 | 94.08 | 10.63 | 24.90 | 40.17 | 24.87 | 40.21 | 47.64 | 20.99 | $MMDET_DIR/configs/yolo/yolov3_d53_320_273e_coco.py |
| SSD-Lite | COCO | 1x3x800x1344 | 14.91 | 67.06 | 8.92 | 112.13 | 8.65 | 115.63 | 30.13 | 33.19 | $MMDET_DIR/configs/ssd/ssdlite_mobilenetv2_scratch_600e_coco.py |
| RetinaNet | COCO | 1x3x800x1344 | 97.09 | 10.30 | 25.79 | 38.78 | 16.88 | 59.23 | 38.34 | 26.08 | $MMDET_DIR/configs/retinanet/retinanet_r50_fpn_1x_coco.py |
| FCOS | COCO | 1x3x800x1344 | 84.06 | 11.90 | 23.15 | 43.20 | 17.68 | 56.57 | - | - | $MMDET_DIR/configs/fcos/fcos_r50_caffe_fpn_gn-head_1x_coco.py |
| FSAF | COCO | 1x3x800x1344 | 82.96 | 12.05 | 21.02 | 47.58 | 13.50 | 74.08 | 30.41 | 32.89 | $MMDET_DIR/configs/fsaf/fsaf_r50_fpn_1x_coco.py |
| Faster-RCNN | COCO | 1x3x800x1344 | 88.08 | 11.35 | 26.52 | 37.70 | 19.14 | 52.23 | 65.40 | 15.29 | $MMDET_DIR/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py |
| Mask-RCNN | COCO | 1x3x800x1344 | 320.86 | 3.12 | 241.32 | 4.14 | - | - | 86.80 | 11.52 | $MMDET_DIR/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py |

MMSeg

| Model | Dataset | Input | TensorRT fp32 latency (ms) | TensorRT fp32 FPS | TensorRT fp16 latency (ms) | TensorRT fp16 FPS | TensorRT int8 latency (ms) | TensorRT int8 FPS | PPLNN fp16 latency (ms) | PPLNN fp16 FPS | Model config file |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FCN | Cityscapes | 1x3x512x1024 | 128.42 | 7.79 | 23.97 | 41.72 | 18.13 | 55.15 | 27.00 | 37.04 | $MMSEG_DIR/configs/fcn/fcn_r50-d8_512x1024_40k_cityscapes.py |
| PSPNet | Cityscapes | 1x3x512x1024 | 119.77 | 8.35 | 24.10 | 41.49 | 16.33 | 61.23 | 27.26 | 36.69 | $MMSEG_DIR/configs/pspnet/pspnet_r50-d8_512x1024_80k_cityscapes.py |
| DeepLabV3 | Cityscapes | 1x3x512x1024 | 226.75 | 4.41 | 31.80 | 31.45 | 19.85 | 50.38 | 36.01 | 27.77 | $MMSEG_DIR/configs/deeplabv3/deeplabv3_r50-d8_512x1024_80k_cityscapes.py |
| DeepLabV3+ | Cityscapes | 1x3x512x1024 | 151.25 | 6.61 | 47.03 | 21.26 | 50.38 | 26.67 | 34.80 | 28.74 | $MMSEG_DIR/configs/deeplabv3plus/deeplabv3plus_r50-d8_512x1024_80k_cityscapes.py |

Performance benchmark

Users can evaluate model performance themselves by following how_to_evaluate_a_model.md. The results below were obtained in our environment.