mmsegmentation

Commit Graph

Author	SHA1	Message	Date
angiecao	608e319eb6	[Feature] Support Side Adapter Network (#3232 ) ## Motivation Support SAN for Open-Vocabulary Semantic Segmentation Paper: [Side Adapter Network for Open-Vocabulary Semantic Segmentation](https://arxiv.org/abs/2302.12242) official Code: [SAN](https://github.com/MendelXu/SAN) ## Modification - Added the parameters of backbone vit for implementing the image encoder of CLIP. - Added text encoder code. - Added segmentor multimodel encoder-decoder code for open-vocabulary semantic segmentation. - Added SideAdapterNetwork decode head code. - Added config files for train and inference. - Added tools for converting pretrained models. - Added loss implementation for mask classification model, such as SAN, Maskformer and remove dependency on mmdetection. - Added test units for text encoder, multimodel encoder-decoder, san decode head and hungarian_assigner. ## Use cases ### Convert Models pretrained SAN model The official pretrained model can be downloaded from [san_clip_vit_b_16.pth](https://huggingface.co/Mendel192/san/blob/main/san_vit_b_16.pth) and [san_clip_vit_large_14.pth](https://huggingface.co/Mendel192/san/blob/main/san_vit_large_14.pth). Use tools/model_converters/san2mmseg.py to convert official model into mmseg style. `python tools/model_converters/san2mmseg.py <MODEL_PATH> <OUTPUT_PATH>` pretrained CLIP model Use the CLIP model provided by openai to train SAN. The CLIP model can be downloaded from [ViT-B-16.pt](https://openaipublic.azureedge.net/clip/models/5806e77cd80f8b59890b7e101eabd078d9fb84e6937f9e85e4ecb61988df416f/ViT-B-16.pt) and [ViT-L-14-336px.pt](https://openaipublic.azureedge.net/clip/models/3035c92b350959924f9f00213499208652fc7ea050643e8b385c2dac08641f02/ViT-L-14-336px.pt). Use tools/model_converters/clip2mmseg.py to convert model into mmseg style. 
`python tools/model_converters/clip2mmseg.py <MODEL_PATH> <OUTPUT_PATH>` ### Inference test san_vit-base-16 model on coco-stuff164k dataset `python tools/test.py ./configs/san/san-vit-b16_coco-stuff164k-640x640.py <TRAINED_MODEL_PATH>` ### Train test san_vit-base-16 model on coco-stuff164k dataset `python tools/train.py ./configs/san/san-vit-b16_coco-stuff164k-640x640.py --cfg-options model.pretrained=<PRETRAINED_MODEL_PATH>` ## Comparison Results ### Train on COCO-Stuff164k \| \| \| mIoU \| mAcc \| pAcc \| \| --------------- \| ----- \| ----- \| ----- \| ----- \| \| san-vit-base16 \| official \| 41.93 \| 56.73 \| 67.69 \| \| \| mmseg \| 41.93 \| 56.84 \| 67.84 \| \| san-vit-large14 \| official \| 45.57 \| 59.52 \| 69.76 \| \| \| mmseg \| 45.78 \| 59.61 \| 69.21 \| ### Evaluate on Pascal Context \| \| \| mIoU \| mAcc \| pAcc \| \| --------------- \| ----- \| ----- \| ----- \| ----- \| \| san-vit-base16 \| official \| 54.05 \| 72.96 \| 77.77 \| \| \| mmseg \| 54.04 \| 73.74 \| 77.71 \| \| san-vit-large14 \| official \| 57.53 \| 77.56 \| 78.89 \| \| \| mmseg \| 56.89 \| 76.96 \| 78.74 \| ### Evaluate on Voc12Aug \| \| \| mIoU \| mAcc \| pAcc \| \| --------------- \| ----- \| ----- \| ----- \| ----- \| \| san-vit-base16 \| official \| 93.86 \| 96.61 \| 97.11 \| \| \| mmseg \| 94.58 \| 97.01 \| 97.38 \| \| san-vit-large14 \| official \| 95.17 \| 97.61 \| 97.63 \| \| \| mmseg \| 95.58 \| 97.75 \| 97.79 \| --------- Co-authored-by: CastleDream <35064479+CastleDream@users.noreply.github.com> Co-authored-by: yeedrag <46050186+yeedrag@users.noreply.github.com> Co-authored-by: Yang-ChangHui <71805205+Yang-Changhui@users.noreply.github.com> Co-authored-by: Xu CAO <49406546+SheffieldCao@users.noreply.github.com> Co-authored-by: xiexinch <xiexinch@outlook.com> Co-authored-by: 小飞猪 <106524776+ooooo-create@users.noreply.github.com>	2023-09-20 21:20:26 +08:00
谢昕辰	dd47cef801	[Feature] Support PIDNet (#2609 ) ## Motivation Support SOTA real-time semantic segmentation method in [Paper with code](https://paperswithcode.com/task/real-time-semantic-segmentation) Paper: https://arxiv.org/pdf/2206.02066.pdf Official repo: https://github.com/XuJiacong/PIDNet ## Current results Cityscapes \|Model\|Ref mIoU\|mIoU (ours)\| \|---\|---\|---\| \|PIDNet-S\|78.8\|78.74\| \|PIDNet-M\|79.9\|80.22\| \|PIDNet-L\|80.9\|80.89\| ## TODO - [x] Support inference with official weights - [x] Support training on Cityscapes - [x] Update docstring - [x] Add unit test	2023-03-15 14:55:30 +08:00
Miao Zheng	e0499d5a77	[Fix] Fix repo based on refactoring standard (#1869 ) * [Fix] Fix repo based on refactoring standard * fix ut	2022-08-19 20:50:02 +08:00
Rockey	42df28c7bd	[Feature] add nlc2nchw2nlc and nchw2nlc2nchw (#1249 ) * [Feature] add nlc2nchw2nlc and nchw2nlc2nchw * add example * add test, add **kwargs to make it more universal	2022-03-10 20:27:28 +08:00
Miao Zheng	b97cfa77d2	[Enhancement] Revise pre-commit-hooks (#1315 )	2022-02-23 23:44:27 +08:00
谢昕辰	119bbd838d	[Enhancement] Delete convert function and add instruction to ViT/Swin README.md (#791 ) * delete convert function and add instruction to README.md * unified model convert and README * remove url * fix import error * fix unittest * rename pretrain * rename vit and deit pretrain * Update upernet_deit-b16_512x512_160k_ade20k.py * Update upernet_deit-b16_512x512_80k_ade20k.py * Update upernet_deit-b16_ln_mln_512x512_160k_ade20k.py * Update upernet_deit-b16_mln_512x512_160k_ade20k.py * Update upernet_deit-s16_512x512_160k_ade20k.py * Update upernet_deit-s16_512x512_80k_ade20k.py * Update upernet_deit-s16_ln_mln_512x512_160k_ade20k.py * Update upernet_deit-s16_mln_512x512_160k_ade20k.py Co-authored-by: Jiarui XU <xvjiarui0826@gmail.com> Co-authored-by: Junjun2016 <hejunjun@sjtu.edu.cn>	2021-08-25 15:00:41 -07:00
Junjun2016	2fe0bddf5e	[Docs] Add header for files (#796 ) * Add header for files * Delete header in config files	2021-08-16 23:16:55 -07:00
sennnnn	b4fd32d049	[Feature] Add segformer decode head and related train config (#599 ) * [Feature]Segformer re-implementation * Using act_cfg and norm_cfg to control activation and normalization * Split this PR into several little PRs * Fix lint error * Remove SegFormerHead * [Feature] Add segformer decode head and related train config * Add ade20K trainval support for segformer 1. Add related train and val configs; 2. Add AlignedResize; * Set arg: find_unused_parameters = True * parameters init refactor * 1. Refactor segformer backbone parameters init; 2. Remove redundant functions and unit tests; * Remove redundant codes * Replace Linear Layer to 1X1 Conv * Use nn.ModuleList to refactor segformer head. * Remove local to_xtuple * 1. Remove redundant codes; 2. Modify module name; * Refactor the backbone of segformer using mmcv.cnn.bricks.transformer.py * Fix some code logic bugs. * Add mit_convert.py to match pretrain keys of segformer. * Resolve some comments. * 1. Add some assert to ensure right params; 2. Support flexible peconv position; * Add pe_index assert and fix unit test. * 1. Add doc string for MixVisionTransformer; 2. Add some unit tests for MixVisionTransformer; * Use hw_shape to pass shape of feature map. * 1. Fix doc string of MixVisionTransformer; 2. Simplify MixFFN; 3. Modify H, W to hw_shape; * Add more unit tests. * Add doc string for shape conversion functions. * Add some unit tests to improve code coverage. * Fix Segformer backbone pretrain weights match bug. * Modify configs of segformer. * resolve the shape conversion functions doc string. * Add pad_to_patch_size arg. * Support progressive test with fewer memory cost. * Modify default value of pad_to_patch_size arg. * Temp code * Using processor to refactor evaluation workflow. * refactor eval hook. * Fix process bar. * Fix middle save argument. * Modify some variable name of dataset evaluate api. * Modify some variable name of eval hook. * Fix some priority bugs of eval hook. 
* Fix some bugs about model loading and eval hook. * Add ade20k 640x640 dataset. * Fix related segformer configs. * Deprecated efficient_test. * Fix training progress blocked by eval hook. * Deprecated old test api. * Modify error patch size. * Fix pretrain of mit_b0 * Fix the test api error. * Modify dataset base config. * Fix test api error. * Modify outer api. * Build a sampler test api. * TODO: Refactor format_results. * Modify variable names. * Fix num_classes bug. * Fix sampler index bug. * Fix grammar bug. * Add part of benchmark results. * Support batch sampler. * More readable test api. * Remove some command arg and fix eval hook bug. * Support format-only arg. * Modify format_results of datasets. * Modify tool which use test apis. * Update readme. * Update readme of segformer. * Update readme of segformer. * Update segformer readme and fix segformer mit_b4. * Update readme of segformer. * Clean AlignedResize related config. * Clean code from pr #709 * Clean code from pr #709 * Add 512x512 segformer_mit-b5. * Fix lint. * Fix some segformer head bugs. * Add segformer unit tests. * Replace AlignedResize to ResizeToMultiple. * Modify readme of segformer. * Fix bug of ResizeToMultiple. * Add ResizeToMultiple unit tests. * Resolve conflict. * Simplify the implementation of ResizeToMultiple. * Update test results. * Fix multi-scale test error when resize_ratio=1.75 and input size=640x640. * Update segformer results. * Update Segformer results. * Fix some url bugs and pipelines bug. * Move ckpt conversion to tools. * Add segformer official pretrain weights usage. * Clean redundant codes. * Remove redundant codes. * Unified format. * Add description for segformer converter. * Update workers.	2021-08-13 13:31:19 +08:00
sennnnn	095ed243c0	[Feature] Segformer backbone re-implementation (#594 ) * [Feature]Segformer re-implementation * Using act_cfg and norm_cfg to control activation and normalization * Split this PR into several little PRs * Fix lint error * Remove SegFormerHead * parameters init refactor * 1. Refactor segformer backbone parameters init; 2. Remove redundant functions and unit tests; * Remove redundant codes * 1. Remove redundant codes; 2. Modify module name; * Refactor the backbone of segformer using mmcv.cnn.bricks.transformer.py * Fix some code logic bugs. * Add mit_convert.py to match pretrain keys of segformer. * Resolve some comments. * 1. Add some assert to ensure right params; 2. Support flexible peconv position; * Add pe_index assert and fix unit test. * 1. Add doc string for MixVisionTransformer; 2. Add some unit tests for MixVisionTransformer; * Use hw_shape to pass shape of feature map. * 1. Fix doc string of MixVisionTransformer; 2. Simplify MixFFN; 3. Modify H, W to hw_shape; * Add more unit tests. * Add doc string for shape conversion functions. * Add some unit tests to improve code coverage. * Fix Segformer backbone pretrain weights match bug. * resolve the shape conversion functions doc string. * Add pad_to_patch_size arg. * Modify default value of pad_to_patch_size arg.	2021-07-19 09:40:40 -07:00
Ze Liu	214d083cce	[WIP] Add Swin Transformer (#511 ) * add Swin Transformer * add Swin Transformer * fixed import * Add some swin training settings. * Fix some filename error. * Fix attribute name: pretrain -> pretrained * Upload mmcls implementation of swin transformer. * Refactor Swin Transformer to follow mmcls style. * Refactor init_weigths of swin_transformer.py * Fix lint * Match inference precision * Add some comments * Add swin_convert to load official style ckpt * Remove arg: auto_pad * 1. Complete comments for each block; 2. Correct weight convert function; 3. Fix the pad of Patch Merging; * Clean function args. * Fix vit unit test. * 1. Add swin transformer unit tests; 2. Fix some pad bug; 3. Modify config to adapt new swin implementation; * Modify config arg * Update readme.md of swin * Fix config arg error and Add some swin benchmark msg. * Add MeM and ms test content for readme.md of swin transformer. * Fix doc string of swin module * 1. Register swin transformer to model list; 2. Modify pth url which keep meta attribute; * Update swin.py * Merge config settings. * Modify config style. * Update README.md Add ViT link * Modify main readme.md Co-authored-by: Jiarui XU <xvjiarui0826@gmail.com> Co-authored-by: sennnnn <201730271412@mail.scut.edu.cn> Co-authored-by: Junjun2016 <hejunjun@sjtu.edu.cn>	2021-07-01 23:41:55 +08:00
Sixiao Zheng	5876868a48	[Feature] Official implementation of SETR (#531 ) * Adjust vision transformer backbone architectures; * Add DropPath, trunc_normal_ for VisionTransformer implementation; * Add class token during intermediate period and remove it during final period; * Fix some parameters loss bug; * * Store intermediate token features and impose no processes on them; * Remove class token and reshape entire token feature from NLC to NCHW; * Fix some doc error * Add an arg for VisionTransformer backbone to control if input class token into transformer; * Add stochastic depth decay rule for DropPath; * * Fix output bug when input_cls_token=False; * Add related unit test; * Re-implement of SETR * Add two head -- SETRUPHead (Naive, PUP) & SETRMLAHead (MLA); * * Modify some docs of heads of SETR; * Add MLA auxiliary head of SETR; * * Modify some arg of setr heads; * Add unit test for setr heads; * * Add 768x768 cityscapes dataset config; * Add Backbone: SETR -- Backbone: MLA, PUP, Naive; * Add SETR cityscapes training & testing config; * * Fix the low code coverage of unit test about heads of setr; * Remove some redundant error capture; * * Add pascal context dataset & ade20k dataset config; * Modify auxiliary head relative config; * Modify folder structure. * add setr * modify vit * Fix the test_cfg arg position; * Fix some learning schedule bug; * optimize setr code * Add arg: final_reshape to control if converting output feature information from NLC to NCHW; * Fix the default value of final_reshape; * Modify arg: final_reshape to arg: out_shape; * Fix some unit test bug; * Add MLA neck; * Modify setr configs to add MLA neck; * Modify MLA decode head to remove redundant structure; * Remove some redundant files. * * Fix the code style bug; * Remove some redundant files; * Modify some unit tests of SETR; * Ignoring CityscapesCoarseDataset and MapillaryDataset. 
* Fix the activation function loss bug; * Fix the img_size bug of SETR_PUP_ADE20K * * Fix the lint bug of transformers.py; * Add mla neck unit test; * Convert vit of setr out shape from NLC to NCHW. * * Modify Resize action of data pipeline; * Fix deit related bug; * Set find_unused_parameters=False for pascal context dataset; * Remove arg: find_unused_parameters which is False by default. * Error auxiliary head of PUP deit * Remove the minimal restrict of slide inference. * Modify doc string of Resize * Separate this part of code to a new PR #544 * * Remove some redundant codes; * Modify unit tests of SETR heads; * Fix the tuple in_channels of mla_deit. * Modify code style * Move detailed definition of auxiliary head into model config dict; * Add some setr config for default cityscapes.py; * Fix the doc string of SETR head; * Modify implementation of SETR Heads * Remove setr aux head and use fcn head to replace it; * Remove arg: img_size and remove last interpolate op of heads; * Rename arg: conv3x3_conv1x1 to kernel_size of SETRUPHead; * non-square input support for setr heads * Modify config argument for above commits * Remove norm_layer argument of SETRMLAHead * Add mla_align_corners for MLAModule interpolate * [Refactor]Refactor of SETRMLAHead * Modify Head implementation; * Modify Head unit test; * Modify related config file; * [Refactor]MLA Neck * Fix config bug * [Refactor]SETR Naive Head and SETR PUP Head * [Fix]Fix the lack of arg: act_cfg and arg: norm_cfg * Fix config error * Refactor of SETR MLA, Naive, PUP heads. * Modify some attribute name of SETR Heads. * Modify setr configs to adapt new vit code. * Fix trunc_normal_ bug * Parameters init adjustment. * Remove redundant doc string of SETRUPHead * Fix pretrained bug * [Fix] Fix vit init bug * Add some vit unit tests * Modify module import * Remove norm from PatchEmbed * Fix pretrain weights bug * Modify pretrained judge * Fix some gradient backward bugs. 
* Add some unit tests to improve code cov * Fix init_weights of setr up head * Add DropPath in FFN * Finish benchmark of SETR 1. Add benchmark information into README.MD of SETR; 2. Fix some name bugs of vit; * Remove DropPath implementation and use DropPath from mmcv. * Modify out_indices arg * Fix out_indices bug. * Remove cityscapes base dataset config. Co-authored-by: sennnnn <201730271412@mail.scut.edu.cn> Co-authored-by: CuttlefishXuan <zhaoxinxuan1997@gmail.com>	2021-06-23 09:39:29 -07:00
sennnnn	c01abb4f30	[Refactor] Using mmcv transformer bricks to refactor vit. (#571 ) * [Refactor] Using mmcv bricks to refactor vit * Follow the vit code structure from mmclassification * Add MMCV install into CI system. * Add to 'Install MMCV' CI item * Add 'Install MMCV_CPU' and 'Install MMCV_GPU CI' items * Fix & Add 1. Fix low code coverage of vit.py; 2. Remove HybirdEmbed; 3. Fix doc string of VisionTransformer; * Add helpers unit test. * Add converter to convert vit pretrain weights from timm style to mmcls style. * Clean some redundant code and refactor init 1. Use timm style init_weights; 2. Remove to_xtuple and trunc_norm_; * Add comments for VisionTransformer.init_weights() * Add arg: pretrain_style to choose timm or mmcls vit pretrain weights.	2021-06-17 10:41:25 -07:00
sennnnn	c27ef91942	Adjust vision transformer backbone architectures (#524 ) * Adjust vision transformer backbone architectures; * Add DropPath, trunc_normal_ for VisionTransformer implementation; * Add class token during intermediate period and remove it during final period; * Fix some parameters loss bug; * * Store intermediate token features and impose no processes on them; * Remove class token and reshape entire token feature from NLC to NCHW; * Fix some doc error * Add an arg for VisionTransformer backbone to control if input class token into transformer; * Add stochastic depth decay rule for DropPath; * * Fix output bug when input_cls_token=False; * Add related unit test; * * Add arg: out_indices to control model output; * Add unit test for DropPath; * Apply suggestions from code review Co-authored-by: Jerry Jiarui XU <xvjiarui0826@gmail.com>	2021-04-30 10:37:47 -07:00
Jerry Jiarui XU	3150dd0ce4	refactor test organization (#440 ) * refactor test organization * fixed se layer * update mmcv upper bound	2021-03-30 17:55:09 -07:00
yamengxi	25d8d77fab	[New model] Support MobileNetV3 (#268 ) * delete markdownlint * Support MobileNetV3 * fix import * add mobilenetv3 head and configs * Modify MobileNetV3 to semantic segmentation version * modify mobilenetv3 configs * add std configs * fix Conv2dAdaptivePadding bug * add configs * add unitest and fix bugs * fix lraspp unitest bugs * restore * fix unitest * add MobileNetV3 docstring * add mmcv * add mmcv * fix syntax bug * fix unitest bug * fix unitest bug * fix unitest bugs * fix docstring * add configs * restore * delete unnecessary assert * modify unitest * delete benchmark	2020-12-26 00:02:50 -08:00
Junjun2016	5956451014	add unet (#161 ) * add unet * add unet * add unet * update test_unet * update test_unet * update test_unet * update test_unet * fix bugs * add init method for unet * add test of UNet init_weights method * add registry * merge upsample * fix test * Update mmseg/models/backbones/unet.py Co-authored-by: Jerry Jiarui XU <xvjiarui0826@gmail.com> * Update mmseg/models/backbones/unet.py Co-authored-by: Jerry Jiarui XU <xvjiarui0826@gmail.com> * split UpConvBlock from UNet * use reversed * rename upsample module * rename upsample module * rename upsample module * rename upsample module Co-authored-by: Jerry Jiarui XU <xvjiarui0826@gmail.com>	2020-10-21 11:24:38 -07:00
Jerry Jiarui XU	2610a11981	[Enhance] Refactor inverted residual (#164 ) * [Enhance] Unified InvertedResidual in MobileNetV2 and FastSCNN * [Enhance] Unified InvertedResidual in MobileNetV2 and FastSCNN	2020-09-28 00:33:51 +08:00
Jerry Jiarui XU	1fbb537958	[Feature] Support MobileNetV2 backbone (#86 ) * [Feature] Support MobileNetV2 backbone * Fixed import * Fixed test * Fixed test * Fixed dilate * upload model * update table * update table * update bibtex * update MMCV requirement	2020-09-04 15:35:52 +08:00
Jiarui XU	b2724da80b	init commit	2020-07-10 02:39:01 +08:00

19 Commits (608e319eb6864f393b26ae43189fd3415d195873)