## Motivation
Support inference and visualization of VPD.
## Modification
1. Add a new VPD model that does not generate black borders in predictions.
2. Update `SegLocalVisualizer` to support depth visualization.
3. Update `MMSegInferencer` to support saving depth estimation predictions in the `postprocess` method.
## Use cases (Optional)
Run inference with VPD using this command:
```sh
python demo/image_demo_with_inferencer.py demo/classroom__rgb_00283.jpg vpd_depth --out-dir vis_results
```
The resulting depth visualization will be saved under `vis_results/vis`.
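The same inference can be run through the Python API. A minimal sketch, assuming the `vpd_depth` alias from the command above; other arguments follow the standard `MMSegInferencer` interface:

```python
from mmseg.apis import MMSegInferencer

# 'vpd_depth' is the model alias used in the demo command above
inferencer = MMSegInferencer(model='vpd_depth')
# visualizations go to <out_dir>/vis and predictions to <out_dir>/pred
inferencer('demo/classroom__rgb_00283.jpg', out_dir='vis_results')
```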

## Motivation
Support the depth estimation algorithm [VPD](https://github.com/wl-zhao/VPD).
## Modification
1. Add the VPD backbone.
2. Add the VPD decoder head for depth estimation.
3. Add a new segmentor `DepthEstimator` based on `EncoderDecoder` for depth estimation.
4. Add an integrated metric that calculates common depth estimation metrics.
5. Add the SiLog loss for depth estimation (see the sketch below).
6. Add configs for VPD.
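For reference, the SiLog (scale-invariant logarithmic) loss from Eigen et al. (2014) computes sqrt(E[d²] − λ·E[d]²) over log-depth differences d. The following is a minimal sketch only; the argument names, masking, and weighting in the actual implementation may differ:

```python
import torch


def silog_loss(pred: torch.Tensor, target: torch.Tensor,
               lambd: float = 0.5, eps: float = 1e-6) -> torch.Tensor:
    """Scale-invariant log loss: sqrt(E[d^2] - lambd * E[d]^2)."""
    valid = target > eps  # ignore invalid (non-positive) depth pixels
    diff = torch.log(pred[valid] + eps) - torch.log(target[valid] + eps)
    return torch.sqrt((diff ** 2).mean() - lambd * diff.mean() ** 2 + eps)
```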
## Motivation
Change the dependency from `mmcls` to `mmpretrain`.
## Modification
- Replace `mmcls` with `mmpretrain` (see the config sketch below).
- Update the CI requirements accordingly.
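A hedged sketch of the corresponding config change for a backbone pulled from the classification library's registry; the `ConvNeXt` backbone here is only an illustrative example:

```python
# cross-library backbone reference: the registry scope changes from
# 'mmcls' to 'mmpretrain' (ConvNeXt is just an example)
model = dict(
    backbone=dict(
        type='mmpretrain.ConvNeXt',  # was: type='mmcls.ConvNeXt'
        arch='base',
    ),
)
```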
## BC-breaking (Optional)
Users who have installed `mmcls` but not `mmpretrain` may encounter import errors until `mmpretrain` is installed.
## Motivation
Support reading multiple remote sensing image formats; see
https://gdal.org/drivers/raster/index.html. Byte, UInt16, Int16, UInt32,
Int32, Float32, Float64, CInt16, CInt32, CFloat32 and CFloat64 are
supported for reading and writing.
Also support input of two images for change detection tasks, and support
the LEVIR-CD dataset.
## Modification
- Add `LoadSingleRSImageFromFile` in `mmseg/datasets/transforms/loading.py`: loads a single remote sensing image for segmentation tasks.
- Add `LoadMultipleRSImageFromFile` in `mmseg/datasets/transforms/loading.py`: loads two remote sensing images for change detection tasks.
- Add `ConcatCDInput` in `mmseg/datasets/transforms/transforms.py`: concatenates the two separately augmented images into a single input (see the pipeline sketch below).
- Add `BaseCDDataset` in `mmseg/datasets/basesegdataset.py`: base class for datasets used in change detection tasks.
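A hedged sketch of how the new transforms could compose into a LEVIR-CD training pipeline; the exact keys and transform order in the shipped config may differ:

```python
train_pipeline = [
    dict(type='LoadMultipleRSImageFromFile'),  # load the two temporal images
    dict(type='LoadAnnotations'),
    dict(type='RandomFlip', prob=0.5),
    dict(type='ConcatCDInput'),  # concatenate the separately augmented pair
    dict(type='PackSegInputs'),
]
```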
---------
Co-authored-by: xiexinch <xiexinch@outlook.com>
## Motivation
Since MMEngine >= 0.2.0, the file backend can be determined directly
from the `data_root` path, so explicit file client settings are no
longer needed by default.
## Modification
Set all default `backend_args` values to `None` (see the config sketch below).
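A minimal config sketch of the convention; the transforms are standard MMSeg pipeline entries:

```python
# with backend_args=None, MMEngine infers the storage backend (local disk,
# petrel, http, ...) from the path itself
backend_args = None

train_pipeline = [
    dict(type='LoadImageFromFile', backend_args=backend_args),
    dict(type='LoadAnnotations', backend_args=backend_args),
]
```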
## Motivation
The DETR-related modules were refactored in
open-mmlab/mmdetection#8763, which broke MaskFormer and Mask2Former in
both MMDetection (fixed in open-mmlab/mmdetection#9515) and
MMSegmentation. This PR fixes the bugs in MMSegmentation.
### TO-DO List
- [x] update configs
- [x] check and modify data flow
- [x] fix unit test
- [x] aligning inference
- [x] write a ckpt converter
- [x] write ckpt update script
- [x] update model zoo
- [x] update model link in readme
- [x] update
[faq.md](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/notes/faq.md#installation)
## Tips on fixing other implementations based on MaskXFormer of mmseg
1. The Transformer modules should be built directly; the original registry-based building has been refactored.
2. The configs need to be modified: delete `type` and modify several keys, according to the modifications in this PR.
3. `batch_first` is uniformly set to `True` in the new implementations, so the data flow needs to be transposed and the `batch_first` config modified (see the sketch below).
4. Checkpoints trained on the old implementation should be converted before use with the new one.
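To illustrate tip 3, a minimal sketch of the required transpose; the tensor shapes here are assumptions for illustration only:

```python
import torch

# old flow used (num_queries, batch, embed_dims); with batch_first=True
# the attention modules expect (batch, num_queries, embed_dims)
queries_old = torch.randn(100, 2, 256)     # (num_queries, batch, embed_dims)
queries_new = queries_old.transpose(0, 1)  # (batch, num_queries, embed_dims)
```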
### Convert script
```Python
import argparse
from collections import OrderedDict
from copy import deepcopy

import torch
from mmengine.config import Config

from mmseg.models import build_segmentor
from mmseg.utils import register_all_modules

register_all_modules(init_default_scope=True)


def parse_args():
    parser = argparse.ArgumentParser(
        description='MMSeg convert MaskXFormer model, by Li-Qingyun')
    parser.add_argument('Mask_what_former', type=int,
                        help='Mask what former, can be a `1` or `2`',
                        choices=[1, 2])
    parser.add_argument('CFG_FILE', help='config file path')
    parser.add_argument('OLD_CKPT_FILEPATH', help='old ckpt file path')
    parser.add_argument('NEW_CKPT_FILEPATH', help='new ckpt file path')
    args = parser.parse_args()
    return args


args = parse_args()


def get_new_name(old_name: str):
    new_name = old_name
    if 'encoder.layers' in new_name:
        new_name = new_name.replace('attentions.0', 'self_attn')
    # both encoder and decoder layers rename ffns.0 -> ffn
    new_name = new_name.replace('ffns.0', 'ffn')
    if 'decoder.layers' in new_name:
        if args.Mask_what_former == 2:
            # for Mask2Former
            new_name = new_name.replace('attentions.0', 'cross_attn')
            new_name = new_name.replace('attentions.1', 'self_attn')
        else:
            # for MaskFormer
            new_name = new_name.replace('attentions.0', 'self_attn')
            new_name = new_name.replace('attentions.1', 'cross_attn')
    return new_name


def cvt_sd(old_sd: OrderedDict):
    new_sd = OrderedDict()
    for name, param in old_sd.items():
        new_name = get_new_name(name)
        assert new_name not in new_sd
        new_sd[new_name] = param
    assert len(new_sd) == len(old_sd)
    return new_sd


if __name__ == '__main__':
    cfg = Config.fromfile(args.CFG_FILE)
    model_cfg = cfg.model
    segmentor = build_segmentor(model_cfg)
    # reference state dict of the new implementation (kept for inspection)
    refer_sd = segmentor.state_dict()
    old_ckpt = torch.load(args.OLD_CKPT_FILEPATH)
    old_sd = old_ckpt['state_dict']
    new_sd = cvt_sd(old_sd)
    # load_state_dict reports missing/unexpected keys as a sanity check
    print(segmentor.load_state_dict(new_sd))
    new_ckpt = deepcopy(old_ckpt)
    new_ckpt['state_dict'] = new_sd
    torch.save(new_ckpt, args.NEW_CKPT_FILEPATH)
    print(f'{args.NEW_CKPT_FILEPATH} has been saved!')
```
Usage:
```bash
# for example
python ckpt4pr2532.py 1 configs/maskformer/maskformer_r50-d32_8xb2-160k_ade20k-512x512.py original_ckpts/maskformer_r50-d32_8xb2-160k_ade20k-512x512_20221030_182724-cbd39cc1.pth cvt_outputs/maskformer_r50-d32_8xb2-160k_ade20k-512x512_20221030_182724.pth
python ckpt4pr2532.py 2 configs/mask2former/mask2former_r50_8xb2-160k_ade20k-512x512.py original_ckpts/mask2former_r50_8xb2-160k_ade20k-512x512_20221204_000055-4c62652d.pth cvt_outputs/mask2former_r50_8xb2-160k_ade20k-512x512_20221204_000055.pth
```
---------
Co-authored-by: MeowZheng <meowzheng@outlook.com>
## Motivation
Use the new fileio API from MMEngine
(https://github.com/open-mmlab/mmengine/pull/533).
## Modification
1. Use `mmengine.fileio` to replace `FileClient` in mmseg/datasets.
2. Use `mmengine.fileio` to replace `FileClient` in mmseg/datasets/transforms.
3. Use `mmengine.fileio` to replace `FileClient` in mmseg/visualization (see the sketch below).
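A hedged sketch of the replacement pattern; the file path and variable names are illustrative:

```python
import mmengine.fileio as fileio

filepath = 'demo/classroom__rgb_00283.jpg'  # any local or remote path

# old pattern (mmcv):
#   file_client = mmcv.FileClient(**file_client_args)
#   img_bytes = file_client.get(filepath)
# new pattern (mmengine): the backend is resolved from the path/backend_args
img_bytes = fileio.get(filepath, backend_args=None)
```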
## BC-breaking (Optional)
We modified all the dataset configurations, so please use the latest config files.
* tta init
* use mmcv transform
* test city
* add multiscale
* fix merge
* add softmax to post process
* add ut
* add tta pipeline to other datasets
* remove softmax
* add encoder_decoder_tta ut
* add encoder_decoder_tta ut
* rename
* rename file
* rename config
* rm aug_test
* move flip to post process
* fix channel
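The commits above add test-time augmentation (TTA) support. A hedged sketch of the resulting config convention in mmseg 1.x; the scale ratios and transforms are illustrative:

```python
# TTA wrapper: merges predictions over the augmented views
tta_model = dict(type='SegTTAModel')

img_ratios = [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]
tta_pipeline = [
    dict(type='LoadImageFromFile', backend_args=None),
    dict(
        type='TestTimeAug',
        transforms=[
            [dict(type='Resize', scale_factor=r, keep_ratio=True)
             for r in img_ratios],
            [dict(type='RandomFlip', prob=0., direction='horizontal'),
             dict(type='RandomFlip', prob=1., direction='horizontal')],
            [dict(type='LoadAnnotations')],
            [dict(type='PackSegInputs')],
        ])
]
```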