## Motivation
The DETR-related modules were refactored in
open-mmlab/mmdetection#8763, which broke MaskFormer and
Mask2Former in both MMDetection (fixed in
open-mmlab/mmdetection#9515) and MMSegmentation. This PR fixes the bugs in
MMSegmentation.
### TO-DO List
- [x] update configs
- [x] check and modify data flow
- [x] fix unit test
- [x] align inference
- [x] write a ckpt converter
- [x] write ckpt update script
- [x] update model zoo
- [x] update model link in readme
- [x] update
[faq.md](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/notes/faq.md#installation)
## Tips for fixing other implementations based on MaskXFormer in mmseg
1. The Transformer modules should now be built directly. The previous
approach of building them through the registry has been refactored.
2. The config needs to be modified. Delete `type` and modify several
keys, following the modifications in this PR.
3. `batch_first` is uniformly set to `True` in the new implementations.
Hence the data flow needs to be transposed and the `batch_first`
config needs to be updated.
4. A checkpoint trained with the old implementation must be converted
before it can be used with the new one.
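To make tip 3 concrete, here is a minimal sketch of what the layout change means for the data flow. It uses plain `torch.nn.MultiheadAttention` as a stand-in rather than the actual MMDetection/MMSegmentation layers, and the tensor sizes are arbitrary assumptions chosen for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)
num_queries, batch_size, embed_dims = 100, 2, 256

# Old layout: sequence-first, shape (num_queries, batch_size, embed_dims).
query_seq_first = torch.rand(num_queries, batch_size, embed_dims)
# New layout: batch-first, shape (batch_size, num_queries, embed_dims).
query_batch_first = query_seq_first.transpose(0, 1)

attn_seq_first = nn.MultiheadAttention(
    embed_dims, num_heads=8, batch_first=False)
attn_batch_first = nn.MultiheadAttention(
    embed_dims, num_heads=8, batch_first=True)
# Share the weights so the two modules differ only in the expected layout.
attn_batch_first.load_state_dict(attn_seq_first.state_dict())

out_seq_first, _ = attn_seq_first(
    query_seq_first, query_seq_first, query_seq_first)
out_batch_first, _ = attn_batch_first(
    query_batch_first, query_batch_first, query_batch_first)

# The outputs match once the sequence-first result is transposed.
print(out_batch_first.shape)
print(torch.allclose(out_seq_first.transpose(0, 1), out_batch_first, atol=1e-5))
```

Any code that still produces sequence-first tensors needs an equivalent `transpose(0, 1)` before calling a batch-first module.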
### Convert script
```Python
import argparse
from copy import deepcopy
from collections import OrderedDict

import torch
from mmengine.config import Config

from mmseg.models import build_segmentor
from mmseg.utils import register_all_modules

register_all_modules(init_default_scope=True)


def parse_args():
    parser = argparse.ArgumentParser(
        description='MMSeg convert MaskXFormer model, by Li-Qingyun')
    parser.add_argument('Mask_what_former', type=int,
                        help='Mask what former, can be a `1` or `2`',
                        choices=[1, 2])
    parser.add_argument('CFG_FILE', help='config file path')
    parser.add_argument('OLD_CKPT_FILEPATH', help='old ckpt file path')
    parser.add_argument('NEW_CKPT_FILEPATH', help='new ckpt file path')
    args = parser.parse_args()
    return args


# Parsed at module level because get_new_name() reads it as a global.
args = parse_args()


def get_new_name(old_name: str):
    """Map a parameter name of the old implementation to the new one."""
    new_name = old_name
    if 'encoder.layers' in new_name:
        new_name = new_name.replace('attentions.0', 'self_attn')
        new_name = new_name.replace('ffns.0', 'ffn')
    if 'decoder.layers' in new_name:
        if args.Mask_what_former == 2:
            # for Mask2Former
            new_name = new_name.replace('attentions.0', 'cross_attn')
            new_name = new_name.replace('attentions.1', 'self_attn')
        else:
            # for MaskFormer
            new_name = new_name.replace('attentions.0', 'self_attn')
            new_name = new_name.replace('attentions.1', 'cross_attn')
    return new_name


def cvt_sd(old_sd: OrderedDict):
    """Rename every key of the old state dict, keeping the parameters."""
    new_sd = OrderedDict()
    for name, param in old_sd.items():
        new_name = get_new_name(name)
        # Two old keys must never collapse into the same new key.
        assert new_name not in new_sd
        new_sd[new_name] = param
    assert len(new_sd) == len(old_sd)
    return new_sd


if __name__ == '__main__':
    cfg = Config.fromfile(args.CFG_FILE)
    model_cfg = cfg.model

    segmentor = build_segmentor(model_cfg)

    refer_sd = segmentor.state_dict()
    old_ckpt = torch.load(args.OLD_CKPT_FILEPATH)
    old_sd = old_ckpt['state_dict']

    new_sd = cvt_sd(old_sd)
    # Prints the missing/unexpected keys, if any, as a sanity check.
    print(segmentor.load_state_dict(new_sd))

    new_ckpt = deepcopy(old_ckpt)
    new_ckpt['state_dict'] = new_sd
    torch.save(new_ckpt, args.NEW_CKPT_FILEPATH)
    print(f'{args.NEW_CKPT_FILEPATH} has been saved!')
```
Usage:
```bash
# for example
python ckpt4pr2532.py 1 configs/maskformer/maskformer_r50-d32_8xb2-160k_ade20k-512x512.py original_ckpts/maskformer_r50-d32_8xb2-160k_ade20k-512x512_20221030_182724-cbd39cc1.pth cvt_outputs/maskformer_r50-d32_8xb2-160k_ade20k-512x512_20221030_182724.pth
python ckpt4pr2532.py 2 configs/mask2former/mask2former_r50_8xb2-160k_ade20k-512x512.py original_ckpts/mask2former_r50_8xb2-160k_ade20k-512x512_20221204_000055-4c62652d.pth cvt_outputs/mask2former_r50_8xb2-160k_ade20k-512x512_20221204_000055.pth
```
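The renaming rules of the convert script can also be tried without MMSegmentation or a real checkpoint. The following is a minimal, dependency-free sketch: `rename_key` is a hypothetical helper that mirrors `get_new_name` but takes the model version as an argument, and the parameter names in `old_keys` are made up for illustration and may not match the keys of an actual checkpoint.

```python
def rename_key(old_name: str, mask_former_version: int) -> str:
    """Apply the same renaming rules as the convert script above."""
    new_name = old_name
    if 'encoder.layers' in new_name:
        new_name = new_name.replace('attentions.0', 'self_attn')
        new_name = new_name.replace('ffns.0', 'ffn')
    if 'decoder.layers' in new_name:
        if mask_former_version == 2:
            # Mask2Former: cross-attention comes first in the old layer.
            new_name = new_name.replace('attentions.0', 'cross_attn')
            new_name = new_name.replace('attentions.1', 'self_attn')
        else:
            # MaskFormer: self-attention comes first in the old layer.
            new_name = new_name.replace('attentions.0', 'self_attn')
            new_name = new_name.replace('attentions.1', 'cross_attn')
    return new_name


# Illustrative (hypothetical) parameter names, not taken from a checkpoint.
old_keys = [
    'backbone.conv1.weight',
    'decode_head.pixel_decoder.encoder.layers.0.attentions.0.attn.weight',
    'decode_head.pixel_decoder.encoder.layers.0.ffns.0.layers.1.weight',
    'decode_head.transformer_decoder.layers.0.attentions.0.attn.weight',
    'decode_head.transformer_decoder.layers.0.attentions.1.attn.weight',
]

renamed_v1 = [rename_key(key, 1) for key in old_keys]
renamed_v2 = [rename_key(key, 2) for key in old_keys]

for old, new_v1, new_v2 in zip(old_keys, renamed_v1, renamed_v2):
    print(f'{old}\n  MaskFormer:  {new_v1}\n  Mask2Former: {new_v2}')
```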
---------
Co-authored-by: MeowZheng <meowzheng@outlook.com>
* knet first commit
* fix import error in knet
* remove kernel update head from decoder head
* [Feature] Add kernel update for some decoder heads.
* [Feature] Add kernel update for some decoder heads.
* directly use forward_feature && modify other 3 decoder heads
* remove kernel_update attr
* delete unnecessary variables in forward function
* delete kernel update function
* delete kernel update function
* delete kernel_generate_head
* add unit test & comments in knet.py
* add copyright to fix lint error
* modify config names of knet
* rename swin-l 640
* upload models&logs and refactor knet_head.py
* modify docstrings and add some ut
* add url, modify docstring and add loss ut
* modify docstrings
* fix single loss type
* fix error in ohem & point_head
* fix coverage miss
* fix coverage error of PointHead loss
* fix coverage miss
* fix coverage error of PointHead loss
* nn.modules.container.ModuleList to nn.ModuleList
* more simple format
* merge unittest def
* [Feature]Segformer re-implementation
* Using act_cfg and norm_cfg to control activation and normalization
* Split this PR into several little PRs
* Fix lint error
* Remove SegFormerHead
* [Feature] Add segformer decode head and related train config
* Add ade20K trainval support for segformer
1. Add related train and val configs;
2. Add AlignedResize;
* Set arg: find_unused_parameters = True
* parameters init refactor
* 1. Refactor segformer backbone parameters init;
2. Remove redundant functions and unit tests;
* Remove redundant code
* Replace Linear Layer to 1X1 Conv
* Use nn.ModuleList to refactor segformer head.
* Remove local to_xtuple
* 1. Remove redundant code;
2. Modify module name;
* Refactor the backbone of segformer using mmcv.cnn.bricks.transformer.py
* Fix some code logic bugs.
* Add mit_convert.py to match pretrain keys of segformer.
* Resolve some comments.
* 1. Add some assert to ensure right params;
2. Support flexible peconv position;
* Add pe_index assert and fix unit test.
* 1. Add doc string for MixVisionTransformer;
2. Add some unit tests for MixVisionTransformer;
* Use hw_shape to pass shape of feature map.
* 1. Fix doc string of MixVisionTransformer;
2. Simplify MixFFN;
3. Modify H, W to hw_shape;
* Add more unit tests.
* Add doc string for shape conversion functions.
* Add some unit tests to improve code coverage.
* Fix Segformer backbone pretrain weights match bug.
* Modify configs of segformer.
* resolve the shape conversion functions doc string.
* Add pad_to_patch_size arg.
* Support progressive test with lower memory cost.
* Modify default value of pad_to_patch_size arg.
* Temp code
* Using processor to refactor evaluation workflow.
* refactor eval hook.
* Fix process bar.
* Fix middle save argument.
* Modify some variable name of dataset evaluate api.
* Modify some variable names of eval hook.
* Fix some priority bugs of eval hook.
* Fix some bugs about model loading and eval hook.
* Add ade20k 640x640 dataset.
* Fix related segformer configs.
* Deprecated efficient_test.
* Fix training progress blocked by eval hook.
* Deprecated old test api.
* Modify error patch size.
* Fix pretrain of mit_b0
* Fix the test api error.
* Modify dataset base config.
* Fix test api error.
* Modify outer api.
* Build a sampler test api.
* TODO: Refactor format_results.
* Modify variable names.
* Fix num_classes bug.
* Fix sampler index bug.
* Fix grammar bug.
* Add part of benchmark results.
* Support batch sampler.
* More readable test api.
* Remove some command arg and fix eval hook bug.
* Support format-only arg.
* Modify format_results of datasets.
* Modify tool which use test apis.
* Update readme.
* Update readme of segformer.
* Update readme of segformer.
* Update segformer readme and fix segformer mit_b4.
* Update readme of segformer.
* Clean AlignedResize related config.
* Clean code from pr #709
* Clean code from pr #709
* Add 512x512 segformer_mit-b5.
* Fix lint.
* Fix some segformer head bugs.
* Add segformer unit tests.
* Replace AlignedResize to ResizeToMultiple.
* Modify readme of segformer.
* Fix bug of ResizeToMultiple.
* Add ResizeToMultiple unit tests.
* Resolve conflict.
* Simplify the implementation of ResizeToMultiple.
* Update test results.
* Fix multi-scale test error when resize_ratio=1.75 and input size=640x640.
* Update segformer results.
* Update Segformer results.
* Fix some url bugs and pipelines bug.
* Move ckpt conversion to tools.
* Add segformer official pretrain weights usage.
* Clean redundant codes.
* Remove redundant codes.
* Unified format.
* Add description for segformer converter.
* Update workers.
* Adjust vision transformer backbone architectures;
* Add DropPath, trunc_normal_ for VisionTransformer implementation;
* Add class token during intermediate period and remove it during final period;
* Fix some parameters loss bug;
* * Store intermediate token features and impose no processes on them;
* Remove class token and reshape entire token feature from NLC to NCHW;
* Fix some doc error
* Add an arg for VisionTransformer backbone to control whether the class token is input into the transformer;
* Add stochastic depth decay rule for DropPath;
* * Fix output bug when input_cls_token=False;
* Add related unit test;
* Re-implement of SETR
* Add two head -- SETRUPHead (Naive, PUP) & SETRMLAHead (MLA);
* * Modify some docs of heads of SETR;
* Add MLA auxiliary head of SETR;
* * Modify some arg of setr heads;
* Add unit test for setr heads;
* * Add 768x768 cityscapes dataset config;
* Add Backbone: SETR -- Backbone: MLA, PUP, Naive;
* Add SETR cityscapes training & testing config;
* * Fix the low code coverage of unit test about heads of setr;
* Remove some redundant error capture;
* * Add pascal context dataset & ade20k dataset config;
* Modify auxiliary head relative config;
* Modify folder structure.
* add setr
* modify vit
* Fix the test_cfg arg position;
* Fix some learning schedule bug;
* optimize setr code
* Add arg: final_reshape to control if converting output feature information from NLC to NCHW;
* Fix the default value of final_reshape;
* Modify arg: final_reshape to arg: out_shape;
* Fix some unit test bug;
* Add MLA neck;
* Modify setr configs to add MLA neck;
* Modify MLA decode head to remove redundant structure;
* Remove some redundant files.
* * Fix the code style bug;
* Remove some redundant files;
* Modify some unit tests of SETR;
* Ignoring CityscapesCoarseDataset and MapillaryDataset.
* Fix the activation function loss bug;
* Fix the img_size bug of SETR_PUP_ADE20K
* * Fix the lint bug of transformers.py;
* Add mla neck unit test;
* Convert vit of setr out shape from NLC to NCHW.
* * Modify Resize action of data pipeline;
* Fix deit related bug;
* Set find_unused_parameters=False for pascal context dataset;
* Remove arg: find_unused_parameters which is False by default.
* Error auxiliary head of PUP deit
* Remove the minimum restriction of slide inference.
* Modify doc string of Resize
* Separate this part of code to a new PR #544
* * Remove some redundant code;
* Modify unit tests of SETR heads;
* Fix the tuple in_channels of mla_deit.
* Modify code style
* Move detailed definition of auxiliary head into model config dict;
* Add some setr config for default cityscapes.py;
* Fix the doc string of SETR head;
* Modify implementation of SETR Heads
* Remove setr aux head and use fcn head to replace it;
* Remove arg: img_size and remove last interpolate op of heads;
* Rename arg: conv3x3_conv1x1 to kernel_size of SETRUPHead;
* non-square input support for setr heads
* Modify config argument for above commits
* Remove norm_layer argument of SETRMLAHead
* Add mla_align_corners for MLAModule interpolate
* [Refactor]Refactor of SETRMLAHead
* Modify Head implementation;
* Modify Head unit test;
* Modify related config file;
* [Refactor]MLA Neck
* Fix config bug
* [Refactor]SETR Naive Head and SETR PUP Head
* [Fix]Fix the lack of arg: act_cfg and arg: norm_cfg
* Fix config error
* Refactor of SETR MLA, Naive, PUP heads.
* Modify some attribute name of SETR Heads.
* Modify setr configs to adapt new vit code.
* Fix trunc_normal_ bug
* Parameters init adjustment.
* Remove redundant doc string of SETRUPHead
* Fix pretrained bug
* [Fix] Fix vit init bug
* Add some vit unit tests
* Modify module import
* Remove norm from PatchEmbed
* Fix pretrain weights bug
* Modify pretrained judge
* Fix some gradient backward bugs.
* Add some unit tests to improve code cov
* Fix init_weights of setr up head
* Add DropPath in FFN
* Finish benchmark of SETR
1. Add benchmark information into README.MD of SETR;
2. Fix some name bugs of vit;
* Remove DropPath implementation and use DropPath from mmcv.
* Modify out_indices arg
* Fix out_indices bug.
* Remove cityscapes base dataset config.
Co-authored-by: sennnnn <201730271412@mail.scut.edu.cn>
Co-authored-by: CuttlefishXuan <zhaoxinxuan1997@gmail.com>