# Benchmark
## Backends

- CPU: ncnn, ONNXRuntime
- GPU: TensorRT, PPLNN
## Platform

- Ubuntu 18.04
- CUDA 11.3
- TensorRT 7.2.3.4
- Docker 20.10.8
- NVIDIA Tesla T4 Tensor Core GPU for TensorRT
## Other settings

- Static graph
- Batch size 1
- Synchronize devices after each inference.
- We report the average inference performance over 100 images of the dataset.
- Warm-up. For classification, we warm up 1010 iters; for other codebases, we warm up 10 iters.
- Input resolution varies across the datasets of the different codebases. All inputs are real images except for MMEditing, whose dataset is not large enough.
## Latency benchmark

Users can directly test the speed through how_to_measure_performance_of_models.md. Here is the benchmark in our environment.
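For reference, the timing protocol described in the settings above boils down to a warm-up phase followed by a synchronized timing loop. The following is a minimal Python sketch of that idea, assuming a generic `model` callable and a list of preprocessed `images`; it is only an illustration, not the MMDeploy measurement tool itself.

```python
import time

import torch


def measure_latency(model, images, warmup=10, num_samples=100):
    """Minimal latency sketch: warm up, then time `num_samples` batch-size-1
    inferences, synchronizing the device after each one."""
    model.eval()
    with torch.no_grad():
        # Warm-up iterations are excluded from the measurement.
        for img in images[:warmup]:
            model(img.unsqueeze(0))
        torch.cuda.synchronize()

        elapsed = 0.0
        for img in images[:num_samples]:
            start = time.perf_counter()
            model(img.unsqueeze(0))
            torch.cuda.synchronize()  # wait for the GPU before stopping the timer
            elapsed += time.perf_counter() - start

    latency_ms = elapsed / num_samples * 1000
    return latency_ms, 1000.0 / latency_ms  # (latency in ms, FPS)
```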
### MMCls with 1x3x224x224 input

| Model | Dataset | Input | TensorRT fp32 latency (ms) | TensorRT fp32 FPS | TensorRT fp16 latency (ms) | TensorRT fp16 FPS | TensorRT int8 latency (ms) | TensorRT int8 FPS | PPLNN fp16 latency (ms) | PPLNN fp16 FPS | model config file |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet | ImageNet | 1x3x224x224 | 2.97 | 336.90 | 1.26 | 791.89 | 1.21 | 829.66 | 1.30 | 768.28 | $MMCLS_DIR/configs/resnet/resnet50_b32x8_imagenet.py |
| ResNeXt | ImageNet | 1x3x224x224 | 4.31 | 231.93 | 1.42 | 703.42 | 1.37 | 727.42 | 1.36 | 737.67 | $MMCLS_DIR/configs/resnext/resnext50_32x4d_b32x8_imagenet.py |
| SE-ResNet | ImageNet | 1x3x224x224 | 3.41 | 293.64 | 1.66 | 600.73 | 1.51 | 662.90 | 1.91 | 524.07 | $MMCLS_DIR/configs/seresnet/seresnet50_b32x8_imagenet.py |
| ShuffleNetV2 | ImageNet | 1x3x224x224 | 1.37 | 727.94 | 1.19 | 841.36 | 1.13 | 883.47 | 4.69 | 213.33 | $MMCLS_DIR/configs/shufflenet_v2/shufflenet_v2_1x_b64x16_linearlr_bn_nowd_imagenet.py |
### MMEditing with 1x3x32x32 input

| Model | Input | TensorRT fp32 latency (ms) | TensorRT fp32 FPS | TensorRT fp16 latency (ms) | TensorRT fp16 FPS | TensorRT int8 latency (ms) | TensorRT int8 FPS | PPLNN fp16 latency (ms) | PPLNN fp16 FPS | model config file |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ESRGAN | 1x3x32x32 | 12.64 | 79.14 | 12.42 | 80.50 | 12.45 | 80.35 | 7.67 | 130.39 | $MMEDIT_DIR/configs/restorers/esrgan/esrgan_psnr_x4c64b23g32_g1_1000k_div2k.py |
| SRCNN | 1x3x32x32 | 0.70 | 1436.47 | 0.35 | 2836.62 | 0.26 | 3850.45 | 0.56 | 1775.11 | $MMEDIT_DIR/configs/restorers/srcnn/srcnn_x4k915_g1_1000k_div2k.py |
### MMSeg with 1x3x512x1024 input

| Model | Dataset | Input | TensorRT fp32 latency (ms) | TensorRT fp32 FPS | TensorRT fp16 latency (ms) | TensorRT fp16 FPS | TensorRT int8 latency (ms) | TensorRT int8 FPS | PPLNN fp16 latency (ms) | PPLNN fp16 FPS | model config file |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FCN | Cityscapes | 1x3x512x1024 | 128.42 | 7.79 | 23.97 | 41.72 | 18.13 | 55.15 | 27.00 | 37.04 | $MMSEG_DIR/configs/fcn/fcn_r50-d8_512x1024_40k_cityscapes.py |
| PSPNet | Cityscapes | 1x3x512x1024 | 119.77 | 8.35 | 24.10 | 41.49 | 16.33 | 61.23 | 27.26 | 36.69 | $MMSEG_DIR/configs/pspnet/pspnet_r50-d8_512x1024_80k_cityscapes.py |
| DeepLabV3 | Cityscapes | 1x3x512x1024 | 226.75 | 4.41 | 31.80 | 31.45 | 19.85 | 50.38 | 36.01 | 27.77 | $MMSEG_DIR/configs/deeplabv3/deeplabv3_r50-d8_512x1024_80k_cityscapes.py |
| DeepLabV3+ | Cityscapes | 1x3x512x1024 | 151.25 | 6.61 | 47.03 | 21.26 | 50.38 | 26.67 | 34.80 | 28.74 | $MMSEG_DIR/configs/deeplabv3plus/deeplabv3plus_r50-d8_512x1024_80k_cityscapes.py |
### MMDet with 1x3x800x1344 input

| Model | Dataset | Input | TensorRT fp32 latency (ms) | TensorRT fp32 FPS | TensorRT fp16 latency (ms) | TensorRT fp16 FPS | TensorRT int8 latency (ms) | TensorRT int8 FPS | PPLNN fp16 latency (ms) | PPLNN fp16 FPS | model config file |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| YOLOv3 | COCO | 1x3x800x1344 | 94.08 | 10.63 | 24.90 | 40.17 | 24.87 | 40.21 | 47.64 | 20.99 | $MMDET_DIR/configs/yolo/yolov3_d53_320_273e_coco.py |
| SSD-Lite | COCO | 1x3x800x1344 | 14.91 | 67.06 | 8.92 | 112.13 | 8.65 | 115.63 | 30.13 | 33.19 | $MMDET_DIR/configs/ssd/ssdlite_mobilenetv2_scratch_600e_coco.py |
| RetinaNet | COCO | 1x3x800x1344 | 97.09 | 10.30 | 25.79 | 38.78 | 16.88 | 59.23 | 38.34 | 26.08 | $MMDET_DIR/configs/retinanet/retinanet_r50_fpn_1x_coco.py |
| FCOS | COCO | 1x3x800x1344 | 84.06 | 11.90 | 23.15 | 43.20 | 17.68 | 56.57 | - | - | $MMDET_DIR/configs/fcos/fcos_r50_caffe_fpn_gn-head_1x_coco.py |
| FSAF | COCO | 1x3x800x1344 | 82.96 | 12.05 | 21.02 | 47.58 | 13.50 | 74.08 | 30.41 | 32.89 | $MMDET_DIR/configs/fsaf/fsaf_r50_fpn_1x_coco.py |
| Faster-RCNN | COCO | 1x3x800x1344 | 88.08 | 11.35 | 26.52 | 37.70 | 19.14 | 52.23 | 65.40 | 15.29 | $MMDET_DIR/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py |
| Mask-RCNN | COCO | 1x3x800x1344 | 320.86 | 3.12 | 241.32 | 4.14 | - | - | 86.80 | 11.52 | $MMDET_DIR/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py |
### MMOCR

| Model | Dataset | Input | TensorRT fp32 latency (ms) | TensorRT fp32 FPS | TensorRT fp16 latency (ms) | TensorRT fp16 FPS | TensorRT int8 latency (ms) | TensorRT int8 FPS | PPLNN fp16 latency (ms) | PPLNN fp16 FPS | model config file |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DBNet | ICDAR2015 | 1x3x640x640 | 10.70 | 93.43 | 5.62 | 177.78 | 5.00 | 199.85 | 34.84 | 28.70 | $MMOCR_DIR/configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py |
| CRNN | IIIT5K | 1x1x32x32 | 1.93 | 518.28 | 1.40 | 713.88 | 1.36 | 736.79 | - | - | $MMOCR_DIR/configs/textrecog/crnn/crnn_academic_dataset.py |
## Performance benchmark

Users can directly test the performance through how_to_evaluate_a_model.md. Here is the benchmark in our environment.
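As a rough illustration of how the accuracy numbers below are produced (not the MMDeploy evaluation tool itself), a top-k metric for the classification backends can be computed as in the following sketch, which assumes a hypothetical `predict` callable that returns per-class scores for one image:

```python
import numpy as np


def topk_accuracy(predict, samples, ks=(1, 5)):
    """Sketch of top-k accuracy: `samples` yields (image, label) pairs and
    `predict(image)` returns a 1-D array of class scores."""
    hits = {k: 0 for k in ks}
    total = 0
    for image, label in samples:
        ranked = np.argsort(np.asarray(predict(image)))[::-1]  # best class first
        for k in ks:
            hits[k] += int(label in ranked[:k])
        total += 1
    return {f'top-{k}': 100.0 * hits[k] / total for k in ks}
```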
### MMClassification

| Model | Task | Metrics | PyTorch fp32 | ONNX Runtime fp32 | TensorRT fp32 | TensorRT fp16 | TensorRT int8 | PPLNN fp16 | model config file |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-18 | Classification | top-1 | 69.90 | 69.88 | 69.88 | 69.86 | 69.86 | 69.86 | $MMCLS_DIR/configs/resnet/resnet18_b32x8_imagenet.py |
| | | top-5 | 89.43 | 89.34 | 89.34 | 89.33 | 89.38 | 89.34 | |
| ResNeXt-50 | Classification | top-1 | 77.90 | 77.90 | 77.90 | - | 77.78 | 77.89 | $MMCLS_DIR/configs/resnext/resnext50_32x4d_b32x8_imagenet.py |
| | | top-5 | 93.66 | 93.66 | 93.66 | - | 93.64 | 93.65 | |
| SE-ResNet-50 | Classification | top-1 | 77.74 | 77.74 | 77.74 | 77.75 | 77.63 | 77.73 | $MMCLS_DIR/configs/seresnet/seresnet50_b32x8_imagenet.py |
| | | top-5 | 93.84 | 93.84 | 93.84 | 93.83 | 93.72 | 93.84 | |
| ShuffleNetV1 1.0x | Classification | top-1 | 68.13 | 68.13 | 68.13 | 68.13 | 67.71 | 68.11 | $MMCLS_DIR/configs/shufflenet_v1/shufflenet_v1_1x_b64x16_linearlr_bn_nowd_imagenet.py |
| | | top-5 | 87.81 | 87.81 | 87.81 | 87.81 | 87.58 | 87.80 | |
| ShuffleNetV2 1.0x | Classification | top-1 | 69.55 | 69.55 | 69.55 | 69.54 | 69.10 | 69.54 | $MMCLS_DIR/configs/shufflenet_v2/shufflenet_v2_1x_b64x16_linearlr_bn_nowd_imagenet.py |
| | | top-5 | 88.92 | 88.92 | 88.92 | 88.91 | 88.58 | 88.92 | |
| MobileNet V2 | Classification | top-1 | 71.86 | 71.86 | 71.86 | 71.87 | 70.91 | 71.84 | $MMCLS_DIR/configs/mobilenet_v2/mobilenet_v2_b32x8_imagenet.py |
| | | top-5 | 90.42 | 90.42 | 90.42 | 90.40 | 89.85 | 90.41 | |
### MMEditing

| Model | Task | Dataset | Metrics | PyTorch fp32 | ONNX Runtime fp32 | TensorRT fp32 | TensorRT fp16 | TensorRT int8 | PPLNN fp16 | model config file |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SRCNN | Super Resolution | Set5 | PSNR | 28.4316 | 28.4323 | 28.4323 | 28.4286 | 28.1995 | 28.4311 | $MMEDIT_DIR/configs/restorers/srcnn/srcnn_x4k915_g1_1000k_div2k.py |
| | | | SSIM | 0.8099 | 0.8097 | 0.8097 | 0.8096 | 0.7934 | 0.8096 | |
| ESRGAN | Super Resolution | Set5 | PSNR | 28.2700 | 28.2592 | 28.2592 | - | - | 28.2624 | $MMEDIT_DIR/configs/restorers/esrgan/esrgan_x4c64b23g32_g1_400k_div2k.py |
| | | | SSIM | 0.7778 | 0.7764 | 0.7774 | - | - | 0.7765 | |
| ESRGAN-PSNR | Super Resolution | Set5 | PSNR | 30.6428 | 30.6444 | 30.6430 | - | - | 27.0426 | $MMEDIT_DIR/configs/restorers/esrgan/esrgan_psnr_x4c64b23g32_g1_1000k_div2k.py |
| | | | SSIM | 0.8559 | 0.8558 | 0.8558 | - | - | 0.8557 | |
| SRGAN | Super Resolution | Set5 | PSNR | 27.9499 | 27.9408 | 27.9408 | - | - | 27.9388 | $MMEDIT_DIR/configs/restorers/srresnet_srgan/srgan_x4c64b16_g1_1000k_div2k.py |
| | | | SSIM | 0.7846 | 0.7839 | 0.7839 | - | - | 0.7839 | |
| SRResNet | Super Resolution | Set5 | PSNR | 30.2252 | 30.2300 | 30.2300 | - | - | 30.2294 | $MMEDIT_DIR/configs/restorers/srresnet_srgan/msrresnet_x4c64b16_g1_1000k_div2k.py |
| | | | SSIM | 0.8491 | 0.8488 | 0.8488 | - | - | 0.8488 | |
| Real-ESRNet | Super Resolution | Set5 | PSNR | 28.0297 | 27.7016 | 27.7016 | - | - | 27.7049 | $MMEDIT_DIR/configs/restorers/real_esrgan/realesrnet_c64b23g32_12x4_lr2e-4_1000k_df2k_ost.py |
| | | | SSIM | 0.8236 | 0.8122 | 0.8122 | - | - | 0.8123 | |
| EDSR | Super Resolution | Set5 | PSNR | 30.2223 | 30.2214 | 30.2214 | 30.2211 | 30.1383 | - | $MMEDIT_DIR/configs/restorers/edsr/edsr_x4c64b16_g1_300k_div2k.py |
| | | | SSIM | 0.8500 | 0.8497 | 0.8497 | 0.8497 | 0.8469 | - | |
### MMOCR

| Model | Task | Dataset | Metrics | PyTorch fp32 | ONNX Runtime fp32 | TensorRT fp32 | TensorRT fp16 | TensorRT int8 | PPLNN fp16 | OpenVINO fp32 | model config file |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DBNet* | TextDetection | ICDAR2015 | recall | 0.7310 | 0.7304 | 0.7198 | 0.7179 | 0.7111 | 0.7304 | 0.7309 | $MMOCR_DIR/configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py |
| | | | precision | 0.8714 | 0.8718 | 0.8677 | 0.8674 | 0.8688 | 0.8718 | 0.8714 | |
| | | | hmean | 0.7950 | 0.7949 | 0.7868 | 0.7856 | 0.7821 | 0.7949 | 0.7950 | |
| CRNN | TextRecognition | IIIT5K | acc | 0.8067 | 0.8067 | 0.8067 | 0.8063 | 0.8067 | 0.8067 | - | $MMOCR_DIR/configs/textrecog/crnn/crnn_academic_dataset.py |
| SAR | TextRecognition | IIIT5K | acc | 0.9517 | 0.9287 | - | - | - | - | - | $MMOCR_DIR/configs/textrecog/sar/sar_r31_parallel_decoder_academic.py |
### MMSeg

| Model | Dataset | Metrics | PyTorch fp32 | ONNX Runtime fp32 | TensorRT fp32 | TensorRT fp16 | TensorRT int8 | PPLNN fp16 | model config file |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FCN | Cityscapes | mIoU | 72.25 | - | 72.36 | 72.35 | 74.19 | 72.35 | $MMSEG_DIR/configs/fcn/fcn_r50-d8_512x1024_40k_cityscapes.py |
| PSPNet | Cityscapes | mIoU | 78.55 | - | 78.26 | 78.24 | 77.97 | 78.09 | $MMSEG_DIR/configs/pspnet/pspnet_r50-d8_512x1024_80k_cityscapes.py |
| DeepLabV3 | Cityscapes | mIoU | 79.09 | - | 79.12 | 79.12 | 78.96 | 79.12 | $MMSEG_DIR/configs/deeplabv3/deeplabv3_r50-d8_512x1024_40k_cityscapes.py |
| DeepLabV3+ | Cityscapes | mIoU | 79.61 | - | 79.6 | 79.6 | 79.43 | 79.6 | $MMSEG_DIR/configs/deeplabv3plus/deeplabv3plus_r50-d8_512x1024_40k_cityscapes.py |
| Fast-SCNN | Cityscapes | mIoU | 70.96 | - | 70.93 | 70.92 | 66.0 | 70.92 | $MMSEG_DIR/configs/fastscnn/fast_scnn_lr0.12_8x4_160k_cityscapes.py |
## Notes

- As some datasets contain images with various resolutions (e.g. in MMDet), the speed benchmark is obtained with static configs in MMDeploy, while the performance benchmark is obtained with dynamic ones.
- Some TensorRT int8 performance benchmarks require NVIDIA cards with Tensor Cores; otherwise, performance drops heavily.
- DBNet uses the `nearest` interpolate mode in the neck of the model, for which TensorRT 7 applies a strategy quite different from PyTorch. To keep the repository compatible with TensorRT 7, we rewrite the neck to use the `bilinear` interpolate mode, which improves the final detection performance (see the sketch below). To match the performance of PyTorch, TensorRT 8+ is recommended, since its interpolate methods behave the same as PyTorch's.
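A minimal sketch of that interpolation-mode swap (illustrative only; the actual rewrite lives in MMDeploy's model rewriters):

```python
import torch
import torch.nn.functional as F

feature_map = torch.randn(1, 256, 160, 160)  # example neck feature map

# Nearest-neighbor upsampling, as used by the original neck: TensorRT 7
# implements this resize differently from PyTorch, which hurts accuracy.
up_nearest = F.interpolate(feature_map, scale_factor=2, mode='nearest')

# Bilinear upsampling, as used by the rewritten neck for TensorRT 7:
# this mode behaves consistently between PyTorch and TensorRT 7.
up_bilinear = F.interpolate(
    feature_map, scale_factor=2, mode='bilinear', align_corners=False)
```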