diff --git a/docs/benchmark.md b/docs/benchmark.md
new file mode 100644
index 000000000..a580fd6f8
--- /dev/null
+++ b/docs/benchmark.md
@@ -0,0 +1,397 @@
+## Benchmark
+
+### Backends
+CPU: ncnn, ONNXRuntime
+GPU: TensorRT, ppl.nn
+
+### Platform
+- Ubuntu 18.04
+- CUDA 11.3
+- TensorRT 7.2.3.4
+- Docker 20.10.8
+- NVIDIA Tesla T4 Tensor Core GPU for TensorRT.
+
+### Other settings
+- Static graph
+- Batch size 1
+- Synchronize devices after each inference.
+- Latency and FPS are averaged over 100 images from the dataset.
+- Warm up: 1010 iterations for classification, 10 iterations for other codebases.
+- Input resolution varies across datasets and codebases. All inputs are real images except for MMEditing, because its dataset is not large enough.
+
+### Latency benchmark
+Users can measure performance themselves by following [how_to_measure_performance_of_models.md](docs/tutorials/how_to_measure_performance_of_models.md). The results below were measured in our environment.
+
+MMCls with 1x3x224x224 input (TensorRT)
+
+| Model | Input | fp32 latency (ms) | fp32 FPS | fp16 latency (ms) | fp16 FPS | int8 latency (ms) | int8 FPS | model config file |
+|:------|:------|------:|------:|------:|------:|------:|------:|:------|
+| ResNet | 1x3x224x224 | 2.97 | 336.90 | 1.26 | 791.89 | 1.21 | 829.66 | $MMCLS_DIR/configs/resnet/resnet50_b32x8_imagenet.py |
+| ResNeXt | 1x3x224x224 | 4.31 | 231.93 | 1.42 | 703.42 | 1.37 | 727.42 | $MMCLS_DIR/configs/resnext/resnext50_32x4d_b32x8_imagenet.py |
+| SE-ResNet | 1x3x224x224 | 3.41 | 293.64 | 1.66 | 600.73 | 1.51 | 662.90 | $MMCLS_DIR/configs/seresnet/seresnet50_b32x8_imagenet.py |
+| ShuffleNetV2 | 1x3x224x224 | 1.37 | 727.94 | 1.19 | 841.36 | 1.13 | 883.47 | $MMCLS_DIR/configs/shufflenet_v2/shufflenet_v2_1x_b64x16_linearlr_bn_nowd_imagenet.py |
+
+MMEditing with 1x3x32x32 input (TensorRT)
+
+| Model | Input | fp32 latency (ms) | fp32 FPS | fp16 latency (ms) | fp16 FPS | int8 latency (ms) | int8 FPS | model config file |
+|:------|:------|------:|------:|------:|------:|------:|------:|:------|
+| ESRGAN | 1x3x32x32 | 12.64 | 79.14 | 12.42 | 80.50 | 12.45 | 80.35 | $MMEDIT_DIR/configs/restorers/esrgan/esrgan_psnr_x4c64b23g32_g1_1000k_div2k.py |
+| SRCNN | 1x3x32x32 | 0.70 | 1436.47 | 0.35 | 2836.62 | 0.26 | 3850.45 | $MMEDIT_DIR/configs/restorers/srcnn/srcnn_x4k915_g1_1000k_div2k.py |
+
+MMSeg with 1x3x512x1024 input (TensorRT)
+
+| Model | Input | fp32 latency (ms) | fp32 FPS | fp16 latency (ms) | fp16 FPS | int8 latency (ms) | int8 FPS | model config file |
+|:------|:------|------:|------:|------:|------:|------:|------:|:------|
+| FCN | 1x3x512x1024 | 128.42 | 7.79 | 23.97 | 41.72 | 18.13 | 55.15 | $MMSEG_DIR/configs/fcn/fcn_r50-d8_512x1024_40k_cityscapes.py |
+| PSPNet | 1x3x512x1024 | 119.77 | 8.35 | 24.10 | 41.49 | 16.33 | 61.23 | $MMSEG_DIR/configs/pspnet/pspnet_r50-d8_512x1024_80k_cityscapes.py |
+| DeepLabV3 | 1x3x512x1024 | 226.75 | 4.41 | 31.80 | 31.45 | 19.85 | 50.38 | $MMSEG_DIR/configs/deeplabv3/deeplabv3_r50-d8_512x1024_80k_cityscapes.py |
+| DeepLabV3+ | 1x3x512x1024 | 151.25 | 6.61 | 47.03 | 21.26 | 50.38 | 26.67 | $MMSEG_DIR/configs/deeplabv3plus/deeplabv3plus_r50-d8_512x1024_80k_cityscapes.py |
+
+MMDet with 1x3x800x1344 input (TensorRT)
+
+| Model | Input | fp32 latency (ms) | fp32 FPS | fp16 latency (ms) | fp16 FPS | int8 latency (ms) | int8 FPS | model config file |
+|:------|:------|------:|------:|------:|------:|------:|------:|:------|
+| YOLOv3 | 1x3x800x1344 | 94.08 | 10.63 | 24.90 | 40.17 | 24.87 | 40.21 | $MMDET_DIR/configs/yolo/yolov3_d53_320_273e_coco.py |
+| SSD-Lite | 1x3x800x1344 | 14.91 | 67.06 | 8.92 | 112.13 | 8.65 | 115.63 | $MMDET_DIR/configs/ssd/ssdlite_mobilenetv2_scratch_600e_coco.py |
+| RetinaNet | 1x3x800x1344 | 97.09 | 10.30 | 25.79 | 38.78 | 16.88 | 59.23 | $MMDET_DIR/configs/retinanet/retinanet_r50_fpn_1x_coco.py |
+| FCOS | 1x3x800x1344 | 84.06 | 11.90 | 23.15 | 43.20 | 17.68 | 56.57 | $MMDET_DIR/configs/fcos/fcos_r50_caffe_fpn_gn-head_1x_coco.py |
+| FSAF | 1x3x800x1344 | 82.96 | 12.05 | 21.02 | 47.58 | 13.50 | 74.08 | $MMDET_DIR/configs/fsaf/fsaf_r50_fpn_1x_coco.py |
+| Faster-RCNN | 1x3x800x1344 | 88.08 | 11.35 | 26.52 | 37.70 | 19.14 | 52.23 | $MMDET_DIR/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py |
+| Mask-RCNN | 1x3x800x1344 | 320.86 | 3.12 | 241.32 | 4.14 | - | - | $MMDET_DIR/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py |
+
+MMOCR (TensorRT)
+
+| Model | Input | fp32 latency (ms) | fp32 FPS | fp16 latency (ms) | fp16 FPS | int8 latency (ms) | int8 FPS | model config file |
+|:------|:------|------:|------:|------:|------:|------:|------:|:------|
+| DBNet | 1x3x640x640 | 10.70 | 93.43 | 5.62 | 177.78 | 5.00 | 199.85 | $MMOCR_DIR/configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py |
+| CRNN | 1x1x32x32 | 1.93 | 518.28 | 1.40 | 713.88 | 1.36 | 736.79 | $MMOCR_DIR/configs/textrecog/crnn/crnn_academic_dataset.py |
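
The measurement procedure listed under "Other settings" (warm up, synchronize after each inference, average over the remaining images) can be sketched as a small timing loop. This is a minimal illustration, not the tool used to produce the tables. `measure_latency`, `infer`, `sync`, and `clock` are hypothetical names introduced here, and computing FPS as 1000 divided by the mean latency in milliseconds is an assumption about how the FPS columns were derived.

```python
import time
from typing import Callable, Iterable, Optional, Tuple


def measure_latency(
    infer: Callable[[object], object],
    inputs: Iterable[object],
    warmup: int = 10,
    sync: Optional[Callable[[], None]] = None,
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, float]:
    """Return (mean latency in ms, FPS) over the inputs after warm-up.

    `infer` runs one batch-size-1 inference. `sync` is a hypothetical hook
    that blocks until the device has finished (for example, a GPU
    synchronization call), so asynchronous work is included in the timing.
    """
    total_seconds = 0.0
    counted = 0
    for i, sample in enumerate(inputs):
        start = clock()
        infer(sample)
        if sync is not None:
            sync()  # synchronize the device after each inference
        elapsed = clock() - start
        if i >= warmup:  # the first `warmup` iterations are not counted
            total_seconds += elapsed
            counted += 1
    if counted == 0:
        raise ValueError("need more inputs than warm-up iterations")
    latency_ms = 1000.0 * total_seconds / counted
    # Assumption: FPS is the reciprocal of the mean latency.
    fps = 1000.0 / latency_ms if latency_ms > 0 else float("inf")
    return latency_ms, fps
```

With `warmup=10` and 110 inputs, the function averages over 100 timed inferences, matching the settings described for the non-classification codebases. Passing a custom `clock` is only there to make the sketch easy to test without real hardware.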