MengzhangLI 933e4d3cb6
[Feature] Support MaskFormer(NeurIPS'2021) in MMSeg 1.x (#2215)
* add mmdet try except logic

* refactor config files

* add readme

* fix config

* update models & logs

* add MMDET installation and fix info

* fix comments

* fix

* fix config norm optimizer setting

* update models & logs & unittest

* add docstring of MaskFormerHead

* wait for mmdet 3.0.0rc4

* replace seg_mask with seg_logits & add docstring for batch_input_shape

* use mmdet3.0.0rc4

* fix readme and modify config comments

* add mmdet installation in pr_stage_test.yml

* update mmcv version in pr_stage_test.yml

* add mmdet in build_cpu of pr_stage_test.yml

* modify mmdet & mmcv installation in merge_stage_test.yml

* fix typo

* update test.yml

* update test.yml
2022-12-01 19:03:10 +08:00

Collections:
- Name: MaskFormer
  Metadata:
    Training Data:
    - Usage
    - ADE20K
  Paper:
    URL: https://arxiv.org/abs/2107.06278
    Title: 'MaskFormer: Per-Pixel Classification is Not All You Need for Semantic
      Segmentation'
  README: configs/maskformer/README.md
  Code:
    URL: https://github.com/open-mmlab/mmdetection/blob/dev-3.x/mmdet/models/dense_heads/maskformer_head.py#L21
    Version: dev-3.x
  Converted From:
    Code: https://github.com/facebookresearch/MaskFormer/
Models:
- Name: maskformer_r50-d32_8xb2-160k_ade20k-512x512
  In Collection: MaskFormer
  Metadata:
    backbone: R-50-D32
    crop size: (512,512)
    lr schd: 160000
    inference time (ms/im):
    - value: 23.7
      hardware: V100
      backend: PyTorch
      batch size: 1
      mode: FP32
      resolution: (512,512)
    Training Memory (GB): 3.29
  Results:
  - Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 44.29
  Config: configs/maskformer/maskformer_r50-d32_8xb2-160k_ade20k-512x512.py
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_r50-d32_8xb2-160k_ade20k-512x512/maskformer_r50-d32_8xb2-160k_ade20k-512x512_20221030_182724-cbd39cc1.pth
- Name: maskformer_r101-d32_8xb2-160k_ade20k-512x512
  In Collection: MaskFormer
  Metadata:
    backbone: R-101-D32
    crop size: (512,512)
    lr schd: 160000
    inference time (ms/im):
    - value: 28.65
      hardware: V100
      backend: PyTorch
      batch size: 1
      mode: FP32
      resolution: (512,512)
    Training Memory (GB): 4.12
  Results:
  - Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 45.11
  Config: configs/maskformer/maskformer_r101-d32_8xb2-160k_ade20k-512x512.py
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_r101-d32_8xb2-160k_ade20k-512x512/maskformer_r101-d32_8xb2-160k_ade20k-512x512_20221031_223053-c8e0931d.pth
- Name: maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512
  In Collection: MaskFormer
  Metadata:
    backbone: Swin-T
    crop size: (512,512)
    lr schd: 160000
    inference time (ms/im):
    - value: 24.67
      hardware: V100
      backend: PyTorch
      batch size: 1
      mode: FP32
      resolution: (512,512)
    Training Memory (GB): 3.73
  Results:
  - Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 46.69
  Config: configs/maskformer/maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512.py
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512/maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512_20221114_232813-03550716.pth
- Name: maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512
  In Collection: MaskFormer
  Metadata:
    backbone: Swin-S
    crop size: (512,512)
    lr schd: 160000
    inference time (ms/im):
    - value: 37.06
      hardware: V100
      backend: PyTorch
      batch size: 1
      mode: FP32
      resolution: (512,512)
    Training Memory (GB): 5.33
  Results:
  - Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 49.36
  Config: configs/maskformer/maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512.py
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512/maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512_20221115_114710-5ab67e58.pth