mmsegmentation/configs/bisenetv1
MengzhangLI bd1097ac02
[Fix] Fix several config file errors in 2.0 (New) (#1994)
* [Fix] Fix several config file errors in 2.0

* change _base_ config file name in configs
2022-08-30 20:20:05 +08:00
README.md [Refactor] Update config names (#1964) 2022-08-26 18:48:56 +08:00
bisenetv1.yml [Refactor] Update config names (#1964) 2022-08-26 18:48:56 +08:00
bisenetv1_r18-d32-in1k-pre_4xb4-160k_cityscapes-1024x1024.py [Refactor] Update config names (#1964) 2022-08-26 18:48:56 +08:00
bisenetv1_r18-d32-in1k-pre_4xb4-160k_coco-stuff164k-512x512.py [Refactor] Update config names (#1964) 2022-08-26 18:48:56 +08:00
bisenetv1_r18-d32-in1k-pre_4xb8-160k_cityscapes-1024x1024.py [Refactor] Update config names (#1964) 2022-08-26 18:48:56 +08:00
bisenetv1_r18-d32_4xb4-160k_cityscapes-1024x1024.py [Refactor] Update config names (#1964) 2022-08-26 18:48:56 +08:00
bisenetv1_r18-d32_4xb4-160k_coco-stuff164k-512x512.py [Fix] Fix several config file errors in 2.0 (New) (#1994) 2022-08-30 20:20:05 +08:00
bisenetv1_r50-d32-in1k-pre_4xb4-160k_cityscapes-1024x1024.py [Refactor] Update config names (#1964) 2022-08-26 18:48:56 +08:00
bisenetv1_r50-d32-in1k-pre_4xb4-160k_coco-stuff164k-512x512.py [Refactor] Update config names (#1964) 2022-08-26 18:48:56 +08:00
bisenetv1_r50-d32_4xb4-160k_cityscapes-1024x1024.py [Refactor] Update config names (#1964) 2022-08-26 18:48:56 +08:00
bisenetv1_r50-d32_4xb4-160k_coco-stuff164k-512x512.py [Fix] Fix several config file errors in 2.0 (New) (#1994) 2022-08-30 20:20:05 +08:00
bisenetv1_r101-d32-in1k-pre_4xb4-160k_coco-stuff164k-512x512.py [Refactor] Update config names (#1964) 2022-08-26 18:48:56 +08:00
bisenetv1_r101-d32_4xb4-160k_coco-stuff164k-512x512.py [Fix] Fix several config file errors in 2.0 (New) (#1994) 2022-08-30 20:20:05 +08:00

README.md

BiSeNetV1

BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation

Introduction

Official Repo

Code Snippet

Abstract

Semantic segmentation requires both rich spatial information and sizeable receptive field. However, modern approaches usually compromise spatial resolution to achieve real-time inference speed, which leads to poor performance. In this paper, we address this dilemma with a novel Bilateral Segmentation Network (BiSeNet). We first design a Spatial Path with a small stride to preserve the spatial information and generate high-resolution features. Meanwhile, a Context Path with a fast downsampling strategy is employed to obtain sufficient receptive field. On top of the two paths, we introduce a new Feature Fusion Module to combine features efficiently. The proposed architecture makes a right balance between the speed and segmentation performance on Cityscapes, CamVid, and COCO-Stuff datasets. Specifically, for a 2048x1024 input, we achieve 68.4% Mean IOU on the Cityscapes test dataset with speed of 105 FPS on one NVIDIA Titan XP card, which is significantly faster than the existing methods with comparable performance.
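
As a rough illustration of the two paths described in the abstract, the sketch below computes the feature map sizes each path would produce for the 2048x1024 input quoted there. It assumes the Spatial Path downsamples by a factor of 8 (three stride-2 convolutions, as in the paper) and that the Context Path reaches an output stride of 32, which is how the `D32` backbone suffix is read here. `feature_size` is a hypothetical helper for this example, not part of MMSegmentation.

```python
def feature_size(height, width, stride):
    """Return the (height, width) of a feature map downsampled by `stride`.

    Assumes the input size is divisible by the stride.
    """
    return height // stride, width // stride


# Assumption: Spatial Path = three stride-2 convolutions -> overall stride 8.
SPATIAL_PATH_STRIDE = 8
# Assumption: "D32" in the backbone name means an output stride of 32.
CONTEXT_PATH_STRIDE = 32

# The abstract quotes a 2048x1024 (width x height) input.
height, width = 1024, 2048

spatial = feature_size(height, width, SPATIAL_PATH_STRIDE)
context = feature_size(height, width, CONTEXT_PATH_STRIDE)

print("Spatial Path output:", spatial)   # (128, 256)
print("Context Path output:", context)   # (32, 64)
```

Under these assumptions the Spatial Path keeps 16 times more spatial positions than the deepest Context Path feature, which is the gap the Feature Fusion Module has to bridge.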

Citation

@inproceedings{yu2018bisenet,
  title={Bisenet: Bilateral segmentation network for real-time semantic segmentation},
  author={Yu, Changqian and Wang, Jingbo and Peng, Chao and Gao, Changxin and Yu, Gang and Sang, Nong},
  booktitle={Proceedings of the European conference on computer vision (ECCV)},
  pages={325--341},
  year={2018}
}

Results and models

Cityscapes

| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | config | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BiSeNetV1 (No Pretrain) | R-18-D32 | 1024x1024 | 160000 | 5.69 | 31.77 | 74.44 | 77.05 | config | model \| log |
| BiSeNetV1 | R-18-D32 | 1024x1024 | 160000 | 5.69 | 31.77 | 74.37 | 76.91 | config | model \| log |
| BiSeNetV1 (4x8) | R-18-D32 | 1024x1024 | 160000 | 11.17 | 31.77 | 75.16 | 77.24 | config | model \| log |
| BiSeNetV1 (No Pretrain) | R-50-D32 | 1024x1024 | 160000 | 15.39 | 7.71 | 76.92 | 78.87 | config | model \| log |
| BiSeNetV1 | R-50-D32 | 1024x1024 | 160000 | 15.39 | 7.71 | 77.68 | 79.57 | config | model \| log |

COCO-Stuff 164k

| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | config | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BiSeNetV1 (No Pretrain) | R-18-D32 | 512x512 | 160000 | - | - | 25.45 | 26.15 | config | model \| log |
| BiSeNetV1 | R-18-D32 | 512x512 | 160000 | 6.33 | 74.24 | 28.55 | 29.26 | config | model \| log |
| BiSeNetV1 (No Pretrain) | R-50-D32 | 512x512 | 160000 | - | - | 29.82 | 30.33 | config | model \| log |
| BiSeNetV1 | R-50-D32 | 512x512 | 160000 | 9.28 | 32.60 | 34.88 | 35.37 | config | model \| log |
| BiSeNetV1 (No Pretrain) | R-101-D32 | 512x512 | 160000 | - | - | 31.14 | 31.76 | config | model \| log |
| BiSeNetV1 | R-101-D32 | 512x512 | 160000 | 10.36 | 25.25 | 37.38 | 37.99 | config | model \| log |
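
To make the effect of pretraining in the COCO-Stuff 164k table easier to read, the following sketch computes the single-scale mIoU difference between each pretrained model and its "No Pretrain" counterpart. The numbers are copied from the table above. The dictionary layout and the `pretrain_gain` helper are illustrative choices for this example only.

```python
# Single-scale mIoU on COCO-Stuff 164k, copied from the table above:
# backbone -> (trained from scratch, pretrained)
coco_stuff_miou = {
    "R-18-D32": (25.45, 28.55),
    "R-50-D32": (29.82, 34.88),
    "R-101-D32": (31.14, 37.38),
}


def pretrain_gain(scratch, pretrained):
    """mIoU points gained by the pretrained model, rounded to 2 decimals."""
    return round(pretrained - scratch, 2)


gains = {
    backbone: pretrain_gain(scratch, pretrained)
    for backbone, (scratch, pretrained) in coco_stuff_miou.items()
}

for backbone, gain in gains.items():
    print(f"{backbone}: +{gain:.2f} mIoU")
```

On these reported numbers the gap grows with backbone depth. Each figure comes from a single reported run, so the differences are indicative only.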

Note:

  • 4x8: Using 4 GPUs with 8 samples per GPU in training.
  • For BiSeNetV1 on the Cityscapes dataset, the default setting is 4 GPUs with 4 samples per GPU in training.
  • No Pretrain means the model is trained from scratch.
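
The notes above map onto the config file names in this directory. As a minimal sketch, the parser below splits a file name into its parts and derives the total batch size. It assumes the pattern `{method}_{backbone}[-in1k-pre]_{gpus}xb{samples_per_gpu}-{iters}k_{dataset}-{crop}.py`, inferred from the file names listed here, and it reads `in1k-pre` as ImageNet-1k pretraining, so names without it would be the "No Pretrain" variants. `parse_config_name` is a hypothetical helper, not an MMSegmentation API.

```python
import re


def parse_config_name(filename):
    """Split a BiSeNetV1 config file name into its components.

    The field layout is an assumption inferred from the file names in
    this directory, not an official specification.
    """
    stem = filename[:-3] if filename.endswith(".py") else filename
    method, backbone, schedule, data = stem.split("_")

    # e.g. "4xb8-160k" -> 4 GPUs, 8 samples per GPU, 160k iterations
    match = re.fullmatch(r"(\d+)xb(\d+)-(\d+)k", schedule)
    if match is None:
        raise ValueError(f"unexpected schedule field: {schedule!r}")
    gpus, samples_per_gpu, kilo_iters = (int(g) for g in match.groups())

    # e.g. "coco-stuff164k-512x512" -> dataset "coco-stuff164k", crop "512x512"
    dataset, crop = data.rsplit("-", 1)

    # Assumption: the "-in1k-pre" suffix marks an ImageNet-1k pretrained backbone.
    pretrained = backbone.endswith("-in1k-pre")
    if pretrained:
        backbone = backbone[: -len("-in1k-pre")]

    return {
        "method": method,
        "backbone": backbone,
        "pretrained": pretrained,
        "gpus": gpus,
        "samples_per_gpu": samples_per_gpu,
        "total_batch_size": gpus * samples_per_gpu,
        "iterations": kilo_iters * 1000,
        "dataset": dataset,
        "crop_size": crop,
    }


info = parse_config_name(
    "bisenetv1_r18-d32-in1k-pre_4xb8-160k_cityscapes-1024x1024.py"
)
print(info["total_batch_size"])  # 32, i.e. the "4x8" setting
```

With the default 4x4 setting the same calculation gives a total batch size of 16.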