Commit Graph

19 Commits (608e319eb6864f393b26ae43189fd3415d195873)

Author SHA1 Message Date
angiecao 608e319eb6
[Feature] Support Side Adapter Network (#3232)
## Motivation
Support SAN for Open-Vocabulary Semantic Segmentation
Paper: [Side Adapter Network for Open-Vocabulary Semantic
Segmentation](https://arxiv.org/abs/2302.12242)
official Code: [SAN](https://github.com/MendelXu/SAN)

## Modification
- Added the parameters of backbone vit for implementing the image
encoder of CLIP.
- Added text encoder code.
- Added segmentor multimodel encoder-decoder code for open-vocabulary
semantic segmentation.
- Added SideAdapterNetwork decode head code.
- Added config files for train and inference.
- Added tools for converting pretrained models.
- Added loss implementation for mask classification model, such as SAN,
Maskformer and remove dependency on mmdetection.
- Added test units for text encoder, multimodel encoder-decoder, san
decode head and hungarian_assigner.

## Use cases
### Convert Models
**pretrained SAN model**
The official pretrained model can be downloaded from
[san_clip_vit_b_16.pth](https://huggingface.co/Mendel192/san/blob/main/san_vit_b_16.pth)
and
[san_clip_vit_large_14.pth](https://huggingface.co/Mendel192/san/blob/main/san_vit_large_14.pth).
Use tools/model_converters/san2mmseg.py to convert offcial model into
mmseg style.
`python tools/model_converters/san2mmseg.py <MODEL_PATH> <OUTPUT_PATH>`

**pretrained CLIP model**
Use the CLIP model provided by openai to train SAN. The CLIP model can
be download from
[ViT-B-16.pt](https://openaipublic.azureedge.net/clip/models/5806e77cd80f8b59890b7e101eabd078d9fb84e6937f9e85e4ecb61988df416f/ViT-B-16.pt)
and
[ViT-L-14-336px.pt](https://openaipublic.azureedge.net/clip/models/3035c92b350959924f9f00213499208652fc7ea050643e8b385c2dac08641f02/ViT-L-14-336px.pt).
Use tools/model_converters/clip2mmseg.py to convert model into mmseg
style.
`python tools/model_converters/clip2mmseg.py <MODEL_PATH> <OUTPUT_PATH>`

### Inference
test san_vit-base-16 model on coco-stuff164k dataset
`python tools/test.py
./configs/san/san-vit-b16_coco-stuff164k-640x640.py
<TRAINED_MODEL_PATH>`

### Train
test san_vit-base-16 model on coco-stuff164k dataset
`python tools/train.py
./configs/san/san-vit-b16_coco-stuff164k-640x640.py --cfg-options
model.pretrained=<PRETRAINED_MODEL_PATH>`

## Comparision Results
### Train on COCO-Stuff164k
|                 |       | mIoU  | mAcc  | pAcc  |
| --------------- | ----- | ----- | ----- | ----- |
| san-vit-base16  | official  | 41.93 | 56.73 | 67.69 |
|                 | mmseg | 41.93 | 56.84 | 67.84 |
| san-vit-large14 | official  | 45.57 | 59.52 | 69.76 |
|                 | mmseg | 45.78 | 59.61 | 69.21 |

### Evaluate on Pascal Context
|                 |       | mIoU  | mAcc  | pAcc  |
| --------------- | ----- | ----- | ----- | ----- |
| san-vit-base16  | official  | 54.05 | 72.96 | 77.77 |
|                 | mmseg | 54.04 | 73.74 | 77.71 |
| san-vit-large14 | official  | 57.53 | 77.56 | 78.89 |
|                 | mmseg | 56.89 | 76.96 | 78.74 |

### Evaluate on Voc12Aug
|                 |       | mIoU  | mAcc  | pAcc  |
| --------------- | ----- | ----- | ----- | ----- |
| san-vit-base16  | official  | 93.86 | 96.61 | 97.11 |
|                 | mmseg | 94.58 | 97.01 | 97.38 |
| san-vit-large14 | official  | 95.17 | 97.61 | 97.63 |
|                 | mmseg | 95.58 | 97.75 | 97.79 |

---------

Co-authored-by: CastleDream <35064479+CastleDream@users.noreply.github.com>
Co-authored-by: yeedrag <46050186+yeedrag@users.noreply.github.com>
Co-authored-by: Yang-ChangHui <71805205+Yang-Changhui@users.noreply.github.com>
Co-authored-by: Xu CAO <49406546+SheffieldCao@users.noreply.github.com>
Co-authored-by: xiexinch <xiexinch@outlook.com>
Co-authored-by: 小飞猪 <106524776+ooooo-create@users.noreply.github.com>
2023-09-20 21:20:26 +08:00
谢昕辰 dd47cef801
[Feature] Support PIDNet (#2609)
## Motivation

Support SOTA real-time semantic segmentation method in [Paper with
code](https://paperswithcode.com/task/real-time-semantic-segmentation)

Paper: https://arxiv.org/pdf/2206.02066.pdf
Official repo: https://github.com/XuJiacong/PIDNet

## Current results

**Cityscapes**

|Model|Ref mIoU|mIoU (ours)|
|---|---|---|
|PIDNet-S|78.8|78.74|
|PIDNet-M|79.9|80.22|
|PIDNet-L|80.9|80.89|

## TODO

- [x] Support inference with official weights
- [x] Support training on Cityscapes
- [x] Update docstring
- [x] Add unit test
2023-03-15 14:55:30 +08:00
Miao Zheng e0499d5a77 [Fix] Fix repo based on refactoring standard (#1869)
* [Fix] Fix repo based on refactory standard

* fix ut
2022-08-19 20:50:02 +08:00
Rockey 42df28c7bd [Feature] add nlc2nchw2nlc and nchw2nlc2nchw (#1249)
* [Feature] add nlc2nchw2nlc and nchw2nlc2nchw

* add example

* add test, add **kwargs to make it more universal
2022-03-10 20:27:28 +08:00
Miao Zheng b97cfa77d2 [Enhancement] Revise pre-commit-hooks (#1315) 2022-02-23 23:44:27 +08:00
谢昕辰 119bbd838d [Enhancement] Delete convert function and add instruction to ViT/Swin README.md (#791)
* delete convert function and add instruction to README.md

* unified model convert and README

* remove url

* fix import error

* fix unittest

* rename pretrain

* rename vit and deit pretrain

* Update upernet_deit-b16_512x512_160k_ade20k.py

* Update upernet_deit-b16_512x512_80k_ade20k.py

* Update upernet_deit-b16_ln_mln_512x512_160k_ade20k.py

* Update upernet_deit-b16_mln_512x512_160k_ade20k.py

* Update upernet_deit-s16_512x512_160k_ade20k.py

* Update upernet_deit-s16_512x512_80k_ade20k.py

* Update upernet_deit-s16_ln_mln_512x512_160k_ade20k.py

* Update upernet_deit-s16_mln_512x512_160k_ade20k.py

Co-authored-by: Jiarui XU <xvjiarui0826@gmail.com>
Co-authored-by: Junjun2016 <hejunjun@sjtu.edu.cn>
2021-08-25 15:00:41 -07:00
Junjun2016 2fe0bddf5e [Dcos] Add header for files (#796)
* Add header for files

* Delete header in config files
2021-08-16 23:16:55 -07:00
sennnnn b4fd32d049 [Feature] Add segformer decode head and related train config (#599)
* [Feature]Segformer re-implementation

* Using act_cfg and norm_cfg to control activation and normalization

* Split this PR into several little PRs

* Fix lint error

* Remove SegFormerHead

* [Feature] Add segformer decode head and related train config

* Add ade20K trainval support for segformer

1. Add related train and val configs;

2. Add AlignedResize;

* Set arg: find_unused_parameters = True

* parameters init refactor

* 1. Refactor segformer backbone parameters init;

2. Remove rebundant functions and unit tests;

* Remove rebundant codes

* Replace Linear Layer to 1X1 Conv

* Use nn.ModuleList to refactor segformer head.

* Remove local to_xtuple

* 1. Remove rebundant codes;

2. Modify module name;

* Refactor the backbone of segformer using mmcv.cnn.bricks.transformer.py

* Fix some code logic bugs.

* Add mit_convert.py to match pretrain keys of segformer.

* Resolve some comments.

* 1. Add some assert to ensure right params;

2. Support flexible peconv position;

* Add pe_index assert and fix unit test.

* 1. Add doc string for MixVisionTransformer;

2. Add some unit tests for MixVisionTransformer;

* Use hw_shape to pass shape of feature map.

* 1. Fix doc string of MixVisionTransformer;

2. Simplify MixFFN;

3. Modify H, W to hw_shape;

* Add more unit tests.

* Add doc string for shape convertion functions.

* Add some unit tests to improve code coverage.

* Fix Segformer backbone pretrain weights match bug.

* Modify configs of segformer.

* resolve the shape convertion functions doc string.

* Add pad_to_patch_size arg.

* Support progressive test with fewer memory cost.

* Modify default value of pad_to_patch_size arg.

* Temp code

* Using processor to refactor evaluation workflow.

* refactor eval hook.

* Fix process bar.

* Fix middle save argument.

* Modify some variable name of dataset evaluate api.

* Modify some viriable name of eval hook.

* Fix some priority bugs of eval hook.

* Fix some bugs about model loading and eval hook.

* Add ade20k 640x640 dataset.

* Fix related segformer configs.

* Depreciated efficient_test.

* Fix training progress blocked by eval hook.

* Depreciated old test api.

* Modify error patch size.

* Fix pretrain of mit_b0

* Fix the test api error.

* Modify dataset base config.

* Fix test api error.

* Modify outer api.

* Build a sampler test api.

* TODO: Refactor format_results.

* Modify variable names.

* Fix num_classes bug.

* Fix sampler index bug.

* Fix grammaly bug.

* Add part of benchmark results.

* Support batch sampler.

* More readable test api.

* Remove some command arg and fix eval hook bug.

* Support format-only arg.

* Modify format_results of datasets.

* Modify tool which use test apis.

* Update readme.

* Update readme of segformer.

* Updata readme of segformer.

* Update segformer readme and fix segformer mit_b4.

* Update readme of segformer.

* Clean AlignedResize related config.

* Clean code from pr #709

* Clean code from pr #709

* Add 512x512 segformer_mit-b5.

* Fix lint.

* Fix some segformer head bugs.

* Add segformer unit tests.

* Replace AlignedResize to ResizeToMultiple.

* Modify readme of segformer.

* Fix bug of ResizeToMultiple.

* Add ResizeToMultiple unit tests.

* Resolve conflict.

* Simplify the implementation of ResizeToMultiple.

* Update test results.

* Fix multi-scale test error when resize_ratio=1.75 and input size=640x640.

* Update segformer results.

* Update Segformer results.

* Fix some url bugs and pipelines bug.

* Move ckpt convertion to tools.

* Add segformer official pretrain weights usage.

* Clean redundant codes.

* Remove redundant codes.

* Unfied format.

* Add description for segformer converter.

* Update workers.
2021-08-13 13:31:19 +08:00
sennnnn 095ed243c0 [Feature] Segformer backbone re-implementation (#594)
* [Feature]Segformer re-implementation

* Using act_cfg and norm_cfg to control activation and normalization

* Split this PR into several little PRs

* Fix lint error

* Remove SegFormerHead

* parameters init refactor

* 1. Refactor segformer backbone parameters init;

2. Remove rebundant functions and unit tests;

* Remove rebundant codes

* 1. Remove rebundant codes;

2. Modify module name;

* Refactor the backbone of segformer using mmcv.cnn.bricks.transformer.py

* Fix some code logic bugs.

* Add mit_convert.py to match pretrain keys of segformer.

* Resolve some comments.

* 1. Add some assert to ensure right params;

2. Support flexible peconv position;

* Add pe_index assert and fix unit test.

* 1. Add doc string for MixVisionTransformer;

2. Add some unit tests for MixVisionTransformer;

* Use hw_shape to pass shape of feature map.

* 1. Fix doc string of MixVisionTransformer;

2. Simplify MixFFN;

3. Modify H, W to hw_shape;

* Add more unit tests.

* Add doc string for shape convertion functions.

* Add some unit tests to improve code coverage.

* Fix Segformer backbone pretrain weights match bug.

* resolve the shape convertion functions doc string.

* Add pad_to_patch_size arg.

* Modify default value of pad_to_patch_size arg.
2021-07-19 09:40:40 -07:00
Ze Liu 214d083cce [WIP] Add Swin Transformer (#511)
* add Swin Transformer

* add Swin Transformer

* fixed import

* Add some swin training settings.

* Fix some filename error.

* Fix attribute name: pretrain -> pretrained

* Upload mmcls implementation of swin transformer.

* Refactor Swin Transformer to follow mmcls style.

* Refactor init_weigths of swin_transformer.py

* Fix lint

* Match inference precision

* Add some comments

* Add swin_convert to load official style ckpt

* Remove arg: auto_pad

* 1. Complete comments for each block;

2. Correct weight convert function;

3. Fix the pad of Patch Merging;

* Clean function args.

* Fix vit unit test.

* 1. Add swin transformer unit tests;

2. Fix some pad bug;

3. Modify config to adapt new swin implementation;

* Modify config arg

* Update readme.md of swin

* Fix config arg error and Add some swin benchmark msg.

* Add MeM and ms test content for readme.md of swin transformer.

* Fix doc string of swin module

* 1. Register swin transformer to model list;

2. Modify pth url which keep meta attribute;

* Update swin.py

* Merge config settings.

* Modify config style.

* Update README.md

Add ViT link

* Modify main readme.md

Co-authored-by: Jiarui XU <xvjiarui0826@gmail.com>
Co-authored-by: sennnnn <201730271412@mail.scut.edu.cn>
Co-authored-by: Junjun2016 <hejunjun@sjtu.edu.cn>
2021-07-01 23:41:55 +08:00
Sixiao Zheng 5876868a48 [Feature] Official implementation of SETR (#531)
* Adjust vision transformer backbone architectures;

* Add DropPath, trunc_normal_ for VisionTransformer implementation;

* Add class token buring intermediate period and remove it during final period;

* Fix some parameters loss bug;

* * Store intermediate token features and impose no processes on them;

* Remove class token and reshape entire token feature from NLC to NCHW;

* Fix some doc error

* Add a arg for VisionTransformer backbone to control if input class token into transformer;

* Add stochastic depth decay rule for DropPath;

* * Fix output bug when input_cls_token=False;

* Add related unit test;

* Re-implement of SETR

* Add two head -- SETRUPHead (Naive, PUP) & SETRMLAHead (MLA);

* * Modify some docs of heads of SETR;

* Add MLA auxiliary head of SETR;

* * Modify some arg of setr heads;

* Add unit test for setr heads;

* * Add 768x768 cityscapes dataset config;

* Add Backbone: SETR -- Backbone: MLA, PUP, Naive;

* Add SETR cityscapes training & testing config;

* * Fix the low code coverage of unit test about heads of setr;

* Remove some rebundant error capture;

* * Add pascal context dataset & ade20k dataset config;

* Modify auxiliary head relative config;

* Modify folder structure.

* add setr

* modify vit

* Fix the test_cfg arg position;

* Fix some learning schedule bug;

* optimize setr code

* Add arg: final_reshape to control if converting output feature information from NLC to NCHW;

* Fix the default value of final_reshape;

* Modify arg: final_reshape to arg: out_shape;

* Fix some unit test bug;

* Add MLA neck;

* Modify setr configs to add MLA neck;

* Modify MLA decode head to remove rebundant structure;

* Remove some rebundant files.

* * Fix the code style bug;

* Remove some rebundant files;

* Modify some unit tests of SETR;

* Ignoring CityscapesCoarseDataset and MapillaryDataset.

* Fix the activation function loss bug;

* Fix the img_size bug of SETR_PUP_ADE20K

* * Fix the lint bug of transformers.py;

* Add mla neck unit test;

* Convert vit of setr out shape from NLC to NCHW.

* * Modify Resize action of data pipeline;

* Fix deit related bug;

* Set find_unused_parameters=False for pascal context dataset;

* Remove arg: find_unused_parameters which is False by default.

* Error auxiliary head of PUP deit

* Remove the minimal restrict of slide inference.

* Modify doc string of Resize

* Seperate this part of code to a new PR #544

* * Remove some rebundant codes;

* Modify unit tests of SETR heads;

* Fix the tuple in_channels of mla_deit.

* Modify code style

* Move detailed definition of auxiliary head into model config dict;

* Add some setr config for default cityscapes.py;

* Fix the doc string of SETR head;

* Modify implementation of SETR Heads

* Remove setr aux head and use fcn head to replace it;

* Remove arg: img_size and remove last interpolate op of heads;

* Rename arg: conv3x3_conv1x1 to kernel_size of SETRUPHead;

* non-square input support for setr heads

* Modify config argument for above commits

* Remove norm_layer argument of SETRMLAHead

* Add mla_align_corners for MLAModule interpolate

* [Refactor]Refactor of SETRMLAHead

* Modify Head implementation;

* Modify Head unit test;

* Modify related config file;

* [Refactor]MLA Neck

* Fix config bug

* [Refactor]SETR Naive Head and SETR PUP Head

* [Fix]Fix the lack of arg: act_cfg and arg: norm_cfg

* Fix config error

* Refactor of SETR MLA, Naive, PUP heads.

* Modify some attribute name of SETR Heads.

* Modify setr configs to adapt new vit code.

* Fix trunc_normal_ bug

* Parameters init adjustment.

* Remove redundant doc string of SETRUPHead

* Fix pretrained bug

* [Fix] Fix vit init bug

* Add some vit unit tests

* Modify module import

* Remove norm from PatchEmbed

* Fix pretrain weights bug

* Modify pretrained judge

* Fix some gradient backward bugs.

* Add some unit tests to improve code cov

* Fix init_weights of setr up head

* Add DropPath in FFN

* Finish benchmark of SETR

1. Add benchmark information into README.MD of SETR;

2. Fix some name bugs of vit;

* Remove DropPath implementation and use DropPath from mmcv.

* Modify out_indices arg

* Fix out_indices bug.

* Remove cityscapes base dataset config.

Co-authored-by: sennnnn <201730271412@mail.scut.edu.cn>
Co-authored-by: CuttlefishXuan <zhaoxinxuan1997@gmail.com>
2021-06-23 09:39:29 -07:00
sennnnn c01abb4f30 [Refactor] Using mmcv transformer bricks to refactor vit. (#571)
* [Refactor] Using mmcv bricks to refactor vit

* Follow the vit code structure from mmclassification

* Add MMCV install into CI system.

* Add  to 'Install MMCV' CI item

* Add 'Install MMCV_CPU' and 'Install MMCV_GPU CI' items

* Fix & Add

1. Fix low code coverage of vit.py;

2. Remove HybirdEmbed;

3. Fix doc string of VisionTransformer;

* Add helpers unit test.

* Add converter to convert vit pretrain weights from timm style to mmcls style.

* Clean some rebundant code and refactor init

1. Use timm style init_weights;

2. Remove to_xtuple and trunc_norm_;

* Add comments for VisionTransformer.init_weights()

* Add arg: pretrain_style to choose timm or mmcls vit pretrain weights.
2021-06-17 10:41:25 -07:00
sennnnn c27ef91942 Adjust vision transformer backbone architectures (#524)
* Adjust vision transformer backbone architectures;

* Add DropPath, trunc_normal_ for VisionTransformer implementation;

* Add class token buring intermediate period and remove it during final period;

* Fix some parameters loss bug;

* * Store intermediate token features and impose no processes on them;

* Remove class token and reshape entire token feature from NLC to NCHW;

* Fix some doc error

* Add a arg for VisionTransformer backbone to control if input class token into transformer;

* Add stochastic depth decay rule for DropPath;

* * Fix output bug when input_cls_token=False;

* Add related unit test;

* * Add arg: out_indices to control model output;

* Add unit test for DropPath;

* Apply suggestions from code review

Co-authored-by: Jerry Jiarui XU <xvjiarui0826@gmail.com>
2021-04-30 10:37:47 -07:00
Jerry Jiarui XU 3150dd0ce4 refactor test organization (#440)
* refactor test organization

* fixed se layer

* update mmcv uper bound
2021-03-30 17:55:09 -07:00
yamengxi 25d8d77fab [New model] Support MobileNetV3 (#268)
* delete markdownlint

* Support MobileNetV3

* fix import

* add mobilenetv3 head and configs

* Modify MobileNetV3 to semantic segmentation version

* modify mobilenetv3 configs

* add std configs

* fix Conv2dAdaptivePadding bug

* add configs

* add unitest and fix bugs

* fix lraspp unitest bugs

* restore

* fix unitest

* add MobileNetV3 docstring

* add mmcv

* add mmcv

* fix syntax bug

* fix unitest bug

* fix unitest bug

* fix unitest bugs

* fix docstring

* add configs

* restore

* delete unnecessary assert

* modify unitest

* delete benchmark
2020-12-26 00:02:50 -08:00
Junjun2016 5956451014 add unet (#161)
* add unet

* add unet

* add unet

* update test_unet

* update test_unet

* update test_unet

* update test_unet

* fix bugs

* add init method for unet

* add test of UNet init_weights method

* add registry

* merge upsample

* fix test

* Update mmseg/models/backbones/unet.py

Co-authored-by: Jerry Jiarui XU <xvjiarui0826@gmail.com>

* Update mmseg/models/backbones/unet.py

Co-authored-by: Jerry Jiarui XU <xvjiarui0826@gmail.com>

* split UpConvBlock from UNet

* use reversed

* rename upsample module

* rename upsample module

* rename upsample module

* rename upsample module

Co-authored-by: Jerry Jiarui XU <xvjiarui0826@gmail.com>
2020-10-21 11:24:38 -07:00
Jerry Jiarui XU 2610a11981 [Enhance] Refactor inverted residual (#164)
* [Enhance] Unifed InvertedResidual in MobileNetV2 and FastSCNN

* [Enhance] Unifed InvertedResidual in MobileNetV2 and FastSCNN
2020-09-28 00:33:51 +08:00
Jerry Jiarui XU 1fbb537958 [Feature] Support MobileNetV2 backbone (#86)
* [Feature] Support MobileNetV2 backbone

* Fixed import

* Fixed test

* Fixed test

* Fixed dilate

* upload model

* update table

* update table

* update bibtex

* update MMCV requirement
2020-09-04 15:35:52 +08:00
Jiarui XU b2724da80b init commit 2020-07-10 02:39:01 +08:00