sennnnn b4fd32d049 [Feature] Add segformer decode head and related train config (#599)
* [Feature]Segformer re-implementation

* Using act_cfg and norm_cfg to control activation and normalization

* Split this PR into several little PRs

* Fix lint error

* Remove SegFormerHead

* [Feature] Add segformer decode head and related train config

* Add ade20K trainval support for segformer

1. Add related train and val configs;

2. Add AlignedResize;

* Set arg: find_unused_parameters = True

* parameters init refactor

* 1. Refactor segformer backbone parameters init;

2. Remove redundant functions and unit tests;

* Remove redundant code

* Replace Linear Layer to 1X1 Conv

* Use nn.ModuleList to refactor segformer head.

* Remove local to_xtuple

* 1. Remove redundant code;

2. Modify module name;

* Refactor the backbone of segformer using mmcv.cnn.bricks.transformer.py

* Fix some code logic bugs.

* Add mit_convert.py to match pretrain keys of segformer.

* Resolve some comments.

* 1. Add some assert to ensure right params;

2. Support flexible peconv position;

* Add pe_index assert and fix unit test.

* 1. Add doc string for MixVisionTransformer;

2. Add some unit tests for MixVisionTransformer;

* Use hw_shape to pass shape of feature map.

* 1. Fix doc string of MixVisionTransformer;

2. Simplify MixFFN;

3. Modify H, W to hw_shape;

* Add more unit tests.

* Add doc string for shape conversion functions.

* Add some unit tests to improve code coverage.

* Fix Segformer backbone pretrain weights match bug.

* Modify configs of segformer.

* Resolve the shape conversion functions doc string.

* Add pad_to_patch_size arg.

* Support progressive test with lower memory cost.

* Modify default value of pad_to_patch_size arg.

* Temp code

* Using processor to refactor evaluation workflow.

* refactor eval hook.

* Fix process bar.

* Fix middle save argument.

* Modify some variable name of dataset evaluate api.

* Modify some variable names of eval hook.

* Fix some priority bugs of eval hook.

* Fix some bugs about model loading and eval hook.

* Add ade20k 640x640 dataset.

* Fix related segformer configs.

* Deprecate efficient_test.

* Fix training progress blocked by eval hook.

* Deprecate old test api.

* Modify error patch size.

* Fix pretrain of mit_b0

* Fix the test api error.

* Modify dataset base config.

* Fix test api error.

* Modify outer api.

* Build a sampler test api.

* TODO: Refactor format_results.

* Modify variable names.

* Fix num_classes bug.

* Fix sampler index bug.

* Fix grammar bug.

* Add part of benchmark results.

* Support batch sampler.

* More readable test api.

* Remove some command arg and fix eval hook bug.

* Support format-only arg.

* Modify format_results of datasets.

* Modify tool which use test apis.

* Update readme.

* Update readme of segformer.

* Update readme of segformer.

* Update segformer readme and fix segformer mit_b4.

* Update readme of segformer.

* Clean AlignedResize related config.

* Clean code from pr #709

* Clean code from pr #709

* Add 512x512 segformer_mit-b5.

* Fix lint.

* Fix some segformer head bugs.

* Add segformer unit tests.

* Replace AlignedResize to ResizeToMultiple.

* Modify readme of segformer.

* Fix bug of ResizeToMultiple.

* Add ResizeToMultiple unit tests.

* Resolve conflict.

* Simplify the implementation of ResizeToMultiple.

* Update test results.

* Fix multi-scale test error when resize_ratio=1.75 and input size=640x640.

* Update segformer results.

* Update Segformer results.

* Fix some url bugs and pipelines bug.

* Move ckpt conversion to tools.

* Add segformer official pretrain weights usage.

* Clean redundant codes.

* Remove redundant codes.

* Unified format.

* Add description for segformer converter.

* Update workers.
2021-08-13 13:31:19 +08:00

import argparse
from collections import OrderedDict

import torch


def mit_convert(ckpt):
    """Rename official SegFormer (MiT) backbone keys to mmseg style."""
    new_ckpt = OrderedDict()
    # Process the concat between q linear weights and kv linear weights
    for k, v in ckpt.items():
        if k.startswith('head'):
            # the classification head is not part of the backbone
            continue
        # patch embedding conversion
        elif k.startswith('patch_embed'):
            stage_i = int(k.split('.')[0].replace('patch_embed', ''))
            new_k = k.replace(f'patch_embed{stage_i}', f'layers.{stage_i-1}.0')
            new_v = v
            if 'proj.' in new_k:
                new_k = new_k.replace('proj.', 'projection.')
        # transformer encoder layer conversion
        elif k.startswith('block'):
            stage_i = int(k.split('.')[0].replace('block', ''))
            new_k = k.replace(f'block{stage_i}', f'layers.{stage_i-1}.1')
            new_v = v
            if 'attn.q.' in new_k:
                # q and kv are separate tensors in the source checkpoint;
                # stack them along dim 0 into a single in_proj tensor
                sub_item_k = k.replace('q.', 'kv.')
                new_k = new_k.replace('q.', 'attn.in_proj_')
                new_v = torch.cat([v, ckpt[sub_item_k]], dim=0)
            elif 'attn.kv.' in new_k:
                # already merged into in_proj together with q
                continue
            elif 'attn.proj.' in new_k:
                new_k = new_k.replace('proj.', 'attn.out_proj.')
            elif 'attn.sr.' in new_k:
                # spatial-reduction conv keys keep their name
                pass
            elif 'mlp.' in new_k:
                new_k = new_k.replace('mlp.', 'ffn.layers.')
                if 'fc1.weight' in new_k or 'fc2.weight' in new_k:
                    # Linear weight (out, in) -> 1x1 conv weight (out, in, 1, 1)
                    new_v = v.reshape((*v.shape, 1, 1))
                new_k = new_k.replace('fc1.', '0.')
                new_k = new_k.replace('dwconv.dwconv.', '1.')
                new_k = new_k.replace('fc2.', '4.')
        # norm layer conversion
        elif k.startswith('norm'):
            stage_i = int(k.split('.')[0].replace('norm', ''))
            new_k = k.replace(f'norm{stage_i}', f'layers.{stage_i-1}.2')
            new_v = v
        else:
            new_k = k
            new_v = v
        new_ckpt[new_k] = new_v
    return new_ckpt


def parse_args():
    parser = argparse.ArgumentParser(
        'Convert official segformer backbone weights to mmseg style.')
    parser.add_argument(
        'src', help='Source path of official segformer backbone weights.')
    parser.add_argument(
        'dst',
        help='Destination path of converted segformer backbone weights.')
    return parser.parse_args()


if __name__ == '__main__':
    args = parse_args()
    src_path = args.src
    dst_path = args.dst

    ckpt = torch.load(src_path, map_location='cpu')

    ckpt = mit_convert(ckpt)
    torch.save(ckpt, dst_path)
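
The renaming rules in mit_convert can be checked without loading a real checkpoint. The sketch below is not part of the script: rename_key is a hypothetical helper that mirrors the key mapping on names only (no tensors, no torch) and returns None for keys the converter skips. The sample keys are illustrative, following the patch_embed{N} / block{N} / norm{N} prefixes the script expects. Unlike the script, it passes through a key that starts with one of those prefixes but has no stage number, instead of raising.

```python
import re

# Stage prefix -> position inside `layers.{stage}` used by the converter.
SLOTS = {'patch_embed': 0, 'block': 1, 'norm': 2}


def rename_key(k):
    """Hypothetical helper: mirror mit_convert's renaming on key names only."""
    if k.startswith('head'):
        return None  # classification head is dropped
    m = re.match(r'(patch_embed|block|norm)(\d+)\.', k)
    if m is None:
        return k  # any other key is copied unchanged
    kind, stage = m.group(1), int(m.group(2))
    new_k = k.replace(f'{kind}{stage}', f'layers.{stage - 1}.{SLOTS[kind]}')
    if kind == 'patch_embed':
        return new_k.replace('proj.', 'projection.')
    if kind == 'norm':
        return new_k
    # kind == 'block': transformer encoder layer
    if 'attn.q.' in new_k:
        return new_k.replace('q.', 'attn.in_proj_')
    if 'attn.kv.' in new_k:
        return None  # merged into in_proj together with q
    if 'attn.proj.' in new_k:
        return new_k.replace('proj.', 'attn.out_proj.')
    if 'mlp.' in new_k:
        new_k = new_k.replace('mlp.', 'ffn.layers.')
        new_k = new_k.replace('fc1.', '0.')
        new_k = new_k.replace('dwconv.dwconv.', '1.')
        new_k = new_k.replace('fc2.', '4.')
    return new_k


if __name__ == '__main__':
    samples = [
        'head.weight',
        'patch_embed1.proj.weight',
        'block2.0.attn.q.weight',
        'block2.0.attn.kv.weight',
        'block1.0.mlp.fc2.bias',
        'norm4.bias',
    ]
    for key in samples:
        print(f'{key} -> {rename_key(key)}')
```

A name-only mirror like this is a cheap way to spot unexpected keys before running the real conversion. It does not check tensor shapes, so the q/kv concatenation and the 1x1 conv reshape still need to be verified against the actual checkpoint.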