Collections:
- Name: Mask2Former
  License: Apache License 2.0
  Metadata:
    Training Data:
    - Usage
    - Cityscapes
    - ADE20K
  Paper:
    Title: Masked-attention Mask Transformer for Universal Image Segmentation
    URL: https://arxiv.org/abs/2112.01527
  README: configs/mask2former/README.md
  Frameworks:
  - PyTorch
Models:
- Name: mask2former_r50_8xb2-90k_cityscapes-512x1024
  In Collection: Mask2Former
  Results:
    Task: Semantic Segmentation
    Dataset: Cityscapes
    Metrics:
      mIoU: 80.44
  Config: configs/mask2former/mask2former_r50_8xb2-90k_cityscapes-512x1024.py
  Metadata:
    Training Data: Cityscapes
    Batch Size: 16
    Architecture:
    - R-50-D32
    - Mask2Former
    Training Resources: 8x A100 GPUS
    Memory (GB): 5.67
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_r50_8xb2-90k_cityscapes-512x1024/mask2former_r50_8xb2-90k_cityscapes-512x1024_20221202_140802-ffd9d750.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_r50_8xb2-90k_cityscapes-512x1024/mask2former_r50_8xb2-90k_cityscapes-512x1024_20221202_140802.json
  Paper:
    Title: Masked-attention Mask Transformer for Universal Image Segmentation
    URL: https://arxiv.org/abs/2112.01527
  Code: https://github.com/open-mmlab/mmdetection/blob/3.x/mmdet/models/dense_heads/mask2former_head.py
  Framework: PyTorch
- Name: mask2former_r101_8xb2-90k_cityscapes-512x1024
  In Collection: Mask2Former
  Results:
    Task: Semantic Segmentation
    Dataset: Cityscapes
    Metrics:
      mIoU: 80.8
  Config: configs/mask2former/mask2former_r101_8xb2-90k_cityscapes-512x1024.py
  Metadata:
    Training Data: Cityscapes
    Batch Size: 16
    Architecture:
    - R-101-D32
    - Mask2Former
    Training Resources: 8x A100 GPUS
    Memory (GB): 6.81
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_r101_8xb2-90k_cityscapes-512x1024/mask2former_r101_8xb2-90k_cityscapes-512x1024_20221130_031628-43e68666.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_r101_8xb2-90k_cityscapes-512x1024/mask2former_r101_8xb2-90k_cityscapes-512x1024_20221130_031628.json
  Paper:
    Title: Masked-attention Mask Transformer for Universal Image Segmentation
    URL: https://arxiv.org/abs/2112.01527
  Code: https://github.com/open-mmlab/mmdetection/blob/3.x/mmdet/models/dense_heads/mask2former_head.py
  Framework: PyTorch
- Name: mask2former_swin-t_8xb2-90k_cityscapes-512x1024
  In Collection: Mask2Former
  Results:
    Task: Semantic Segmentation
    Dataset: Cityscapes
    Metrics:
      mIoU: 81.71
  Config: configs/mask2former/mask2former_swin-t_8xb2-90k_cityscapes-512x1024.py
  Metadata:
    Training Data: Cityscapes
    Batch Size: 16
    Architecture:
    - Swin-T
    - Mask2Former
    Training Resources: 8x A100 GPUS
    Memory (GB): 6.36
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_swin-t_8xb2-90k_cityscapes-512x1024/mask2former_swin-t_8xb2-90k_cityscapes-512x1024_20221127_144501-36c59341.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_swin-t_8xb2-90k_cityscapes-512x1024/mask2former_swin-t_8xb2-90k_cityscapes-512x1024_20221127_144501.json
  Paper:
    Title: Masked-attention Mask Transformer for Universal Image Segmentation
    URL: https://arxiv.org/abs/2112.01527
  Code: https://github.com/open-mmlab/mmdetection/blob/3.x/mmdet/models/dense_heads/mask2former_head.py
  Framework: PyTorch
- Name: mask2former_swin-s_8xb2-90k_cityscapes-512x1024
  In Collection: Mask2Former
  Results:
    Task: Semantic Segmentation
    Dataset: Cityscapes
    Metrics:
      mIoU: 82.57
  Config: configs/mask2former/mask2former_swin-s_8xb2-90k_cityscapes-512x1024.py
  Metadata:
    Training Data: Cityscapes
    Batch Size: 16
    Architecture:
    - Swin-S
    - Mask2Former
    Training Resources: 8x A100 GPUS
    Memory (GB): 8.09
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_swin-s_8xb2-90k_cityscapes-512x1024/mask2former_swin-s_8xb2-90k_cityscapes-512x1024_20221127_143802-9ab177f6.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_swin-s_8xb2-90k_cityscapes-512x1024/mask2former_swin-s_8xb2-90k_cityscapes-512x1024_20221127_143802.json
  Paper:
    Title: Masked-attention Mask Transformer for Universal Image Segmentation
    URL: https://arxiv.org/abs/2112.01527
  Code: https://github.com/open-mmlab/mmdetection/blob/3.x/mmdet/models/dense_heads/mask2former_head.py
  Framework: PyTorch
- Name: mask2former_swin-b-in22k-384x384-pre_8xb2-90k_cityscapes-512x1024
  In Collection: Mask2Former
  Results:
    Task: Semantic Segmentation
    Dataset: Cityscapes
    Metrics:
      mIoU: 83.52
  Config: configs/mask2former/mask2former_swin-b-in22k-384x384-pre_8xb2-90k_cityscapes-512x1024.py
  Metadata:
    Training Data: Cityscapes
    Batch Size: 16
    Architecture:
    - Swin-B
    - Mask2Former
    Training Resources: 8x A100 GPUS
    Memory (GB): 10.89
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_swin-b-in22k-384x384-pre_8xb2-90k_cityscapes-512x1024/mask2former_swin-b-in22k-384x384-pre_8xb2-90k_cityscapes-512x1024_20221203_045030-9a86a225.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_swin-b-in22k-384x384-pre_8xb2-90k_cityscapes-512x1024/mask2former_swin-b-in22k-384x384-pre_8xb2-90k_cityscapes-512x1024_20221203_045030.json
  Paper:
    Title: Masked-attention Mask Transformer for Universal Image Segmentation
    URL: https://arxiv.org/abs/2112.01527
  Code: https://github.com/open-mmlab/mmdetection/blob/3.x/mmdet/models/dense_heads/mask2former_head.py
  Framework: PyTorch
- Name: mask2former_swin-l-in22k-384x384-pre_8xb2-90k_cityscapes-512x1024
  In Collection: Mask2Former
  Results:
    Task: Semantic Segmentation
    Dataset: Cityscapes
    Metrics:
      mIoU: 83.65
  Config: configs/mask2former/mask2former_swin-l-in22k-384x384-pre_8xb2-90k_cityscapes-512x1024.py
  Metadata:
    Training Data: Cityscapes
    Batch Size: 16
    Architecture:
    - Swin-L
    - Mask2Former
    Training Resources: 8x A100 GPUS
    Memory (GB): 15.83
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_swin-l-in22k-384x384-pre_8xb2-90k_cityscapes-512x1024/mask2former_swin-l-in22k-384x384-pre_8xb2-90k_cityscapes-512x1024_20221202_141901-28ad20f1.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_swin-l-in22k-384x384-pre_8xb2-90k_cityscapes-512x1024/mask2former_swin-l-in22k-384x384-pre_8xb2-90k_cityscapes-512x1024_20221202_141901.json
  Paper:
    Title: Masked-attention Mask Transformer for Universal Image Segmentation
    URL: https://arxiv.org/abs/2112.01527
  Code: https://github.com/open-mmlab/mmdetection/blob/3.x/mmdet/models/dense_heads/mask2former_head.py
  Framework: PyTorch
- Name: mask2former_r50_8xb2-160k_ade20k-512x512
  In Collection: Mask2Former
  Results:
    Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 47.87
  Config: configs/mask2former/mask2former_r50_8xb2-160k_ade20k-512x512.py
  Metadata:
    Training Data: ADE20K
    Batch Size: 16
    Architecture:
    - R-50-D32
    - Mask2Former
    Training Resources: 8x A100 GPUS
    Memory (GB): 3.31
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_r50_8xb2-160k_ade20k-512x512/mask2former_r50_8xb2-160k_ade20k-512x512_20221204_000055-2d1f55f1.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_r50_8xb2-160k_ade20k-512x512/mask2former_r50_8xb2-160k_ade20k-512x512_20221204_000055.json
  Paper:
    Title: Masked-attention Mask Transformer for Universal Image Segmentation
    URL: https://arxiv.org/abs/2112.01527
  Code: https://github.com/open-mmlab/mmdetection/blob/3.x/mmdet/models/dense_heads/mask2former_head.py
  Framework: PyTorch
- Name: mask2former_r101_8xb2-160k_ade20k-512x512
  In Collection: Mask2Former
  Results:
    Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 48.6
  Config: configs/mask2former/mask2former_r101_8xb2-160k_ade20k-512x512.py
  Metadata:
    Training Data: ADE20K
    Batch Size: 16
    Architecture:
    - R-101-D32
    - Mask2Former
    Training Resources: 8x A100 GPUS
    Memory (GB): 4.09
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_r101_8xb2-160k_ade20k-512x512/mask2former_r101_8xb2-160k_ade20k-512x512_20221203_233905-b7135890.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_r101_8xb2-160k_ade20k-512x512/mask2former_r101_8xb2-160k_ade20k-512x512_20221203_233905.json
  Paper:
    Title: Masked-attention Mask Transformer for Universal Image Segmentation
    URL: https://arxiv.org/abs/2112.01527
  Code: https://github.com/open-mmlab/mmdetection/blob/3.x/mmdet/models/dense_heads/mask2former_head.py
  Framework: PyTorch
- Name: mask2former_swin-t_8xb2-160k_ade20k-512x512
  In Collection: Mask2Former
  Results:
    Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 48.66
  Config: configs/mask2former/mask2former_swin-t_8xb2-160k_ade20k-512x512.py
  Metadata:
    Training Data: ADE20K
    Batch Size: 16
    Architecture:
    - Swin-T
    - Mask2Former
    Training Resources: 8x A100 GPUS
    Memory (GB): 3826.0
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_swin-t_8xb2-160k_ade20k-512x512/mask2former_swin-t_8xb2-160k_ade20k-512x512_20221203_234230-7d64e5dd.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_swin-t_8xb2-160k_ade20k-512x512/mask2former_swin-t_8xb2-160k_ade20k-512x512_20221203_234230.json
  Paper:
    Title: Masked-attention Mask Transformer for Universal Image Segmentation
    URL: https://arxiv.org/abs/2112.01527
  Code: https://github.com/open-mmlab/mmdetection/blob/3.x/mmdet/models/dense_heads/mask2former_head.py
  Framework: PyTorch
- Name: mask2former_swin-s_8xb2-160k_ade20k-512x512
  In Collection: Mask2Former
  Results:
    Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 51.24
  Config: configs/mask2former/mask2former_swin-s_8xb2-160k_ade20k-512x512.py
  Metadata:
    Training Data: ADE20K
    Batch Size: 16
    Architecture:
    - Swin-S
    - Mask2Former
    Training Resources: 8x A100 GPUS
    Memory (GB): 3.74
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_swin-s_8xb2-160k_ade20k-512x512/mask2former_swin-s_8xb2-160k_ade20k-512x512_20221204_143905-e715144e.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_swin-s_8xb2-160k_ade20k-512x512/mask2former_swin-s_8xb2-160k_ade20k-512x512_20221204_143905.json
  Paper:
    Title: Masked-attention Mask Transformer for Universal Image Segmentation
    URL: https://arxiv.org/abs/2112.01527
  Code: https://github.com/open-mmlab/mmdetection/blob/3.x/mmdet/models/dense_heads/mask2former_head.py
  Framework: PyTorch
- Name: mask2former_swin-b-in1k-384x384-pre_8xb2-160k_ade20k-640x640
  In Collection: Mask2Former
  Results:
    Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 52.44
  Config: configs/mask2former/mask2former_swin-b-in1k-384x384-pre_8xb2-160k_ade20k-640x640.py
  Metadata:
    Training Data: ADE20K
    Batch Size: 16
    Architecture:
    - Swin-B
    - Mask2Former
    Training Resources: 8x A100 GPUS
    Memory (GB): 5.66
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_swin-b-in1k-384x384-pre_8xb2-160k_ade20k-640x640/mask2former_swin-b-in1k-384x384-pre_8xb2-160k_ade20k-640x640_20221129_125118-a4a086d2.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_swin-b-in1k-384x384-pre_8xb2-160k_ade20k-640x640/mask2former_swin-b-in1k-384x384-pre_8xb2-160k_ade20k-640x640_20221129_125118.json
  Paper:
    Title: Masked-attention Mask Transformer for Universal Image Segmentation
    URL: https://arxiv.org/abs/2112.01527
  Code: https://github.com/open-mmlab/mmdetection/blob/3.x/mmdet/models/dense_heads/mask2former_head.py
  Framework: PyTorch
- Name: mask2former_swin-b-in22k-384x384-pre_8xb2-160k_ade20k-640x640
  In Collection: Mask2Former
  Results:
    Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 53.9
  Config: configs/mask2former/mask2former_swin-b-in22k-384x384-pre_8xb2-160k_ade20k-640x640.py
  Metadata:
    Training Data: ADE20K
    Batch Size: 16
    Architecture:
    - Swin-B
    - Mask2Former
    Training Resources: 8x A100 GPUS
    Memory (GB): 5.66
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_swin-b-in22k-384x384-pre_8xb2-160k_ade20k-640x640/mask2former_swin-b-in22k-384x384-pre_8xb2-160k_ade20k-640x640_20221203_235230-7ec0f569.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_swin-b-in22k-384x384-pre_8xb2-160k_ade20k-640x640/mask2former_swin-b-in22k-384x384-pre_8xb2-160k_ade20k-640x640_20221203_235230.json
  Paper:
    Title: Masked-attention Mask Transformer for Universal Image Segmentation
    URL: https://arxiv.org/abs/2112.01527
  Code: https://github.com/open-mmlab/mmdetection/blob/3.x/mmdet/models/dense_heads/mask2former_head.py
  Framework: PyTorch
- Name: mask2former_swin-l-in22k-384x384-pre_8xb2-160k_ade20k-640x640
  In Collection: Mask2Former
  Results:
    Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 56.01
  Config: configs/mask2former/mask2former_swin-l-in22k-384x384-pre_8xb2-160k_ade20k-640x640.py
  Metadata:
    Training Data: ADE20K
    Batch Size: 16
    Architecture:
    - Swin-L
    - Mask2Former
    Training Resources: 8x A100 GPUS
    Memory (GB): 8.86
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_swin-l-in22k-384x384-pre_8xb2-160k_ade20k-640x640/mask2former_swin-l-in22k-384x384-pre_8xb2-160k_ade20k-640x640_20221203_235933-7120c214.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/mask2former/mask2former_swin-l-in22k-384x384-pre_8xb2-160k_ade20k-640x640/mask2former_swin-l-in22k-384x384-pre_8xb2-160k_ade20k-640x640_20221203_235933.json
  Paper:
    Title: Masked-attention Mask Transformer for Universal Image Segmentation
    URL: https://arxiv.org/abs/2112.01527
  Code: https://github.com/open-mmlab/mmdetection/blob/3.x/mmdet/models/dense_heads/mask2former_head.py
  Framework: PyTorch