ImageNet-S dataset for large-scale semantic segmentation (#2480)

## Motivation

Based on the ImageNet dataset, we propose the ImageNet-S dataset, which contains 1.2 million training images and 50k high-quality semantic segmentation annotations, to support unsupervised and semi-supervised semantic segmentation on ImageNet.

Paper: [Large-scale Unsupervised Semantic Segmentation](https://arxiv.org/abs/2106.03149) (TPAMI 2022)

## Modification

1. Support the ImageNet-S dataset and its configuration
2. Add dataset preparation instructions to the documentation

Shanghua Gao committed on 2023-01-16 16:42:19 +08:00 (via GitHub)
parent ba7608cefe
commit 6cb7fe0c51
6 changed files with 1165 additions and 1 deletions


@ -188,6 +188,7 @@ Supported datasets:
- [x] [Vaihingen](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#isprs-vaihingen)
- [x] [iSAID](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#isaid)
- [x] [High quality synthetic face occlusion](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#delving-into-high-quality-synthetic-face-occlusion-segmentation-datasets)
- [x] [ImageNetS](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#imagenets)
## FAQ


@ -0,0 +1,61 @@
# dataset settings
dataset_type = 'ImageNetSDataset'
subset = 919
data_root = 'data/ImageNetS/ImageNetS919'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (224, 224)
train_pipeline = [
    dict(type='LoadImageNetSImageFromFile', downsample_large_image=True),
    dict(type='LoadImageNetSAnnotations', reduce_zero_label=False),
    dict(type='Resize', img_scale=(1024, 256), ratio_range=(0.5, 2.0)),
    dict(
        type='RandomCrop',
        crop_size=crop_size,
        cat_max_ratio=0.75,
        ignore_index=1000),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=1000),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]
test_pipeline = [
    dict(type='LoadImageNetSImageFromFile', downsample_large_image=True),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1024, 256),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=4,
    workers_per_gpu=4,
    train=dict(
        type=dataset_type,
        subset=subset,
        data_root=data_root,
        img_dir='train-semi',
        ann_dir='train-semi-segmentation',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        subset=subset,
        data_root=data_root,
        img_dir='validation',
        ann_dir='validation-segmentation',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        subset=subset,
        data_root=data_root,
        img_dir='validation',
        ann_dir='validation-segmentation',
        pipeline=test_pipeline))
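
For reference, a downstream training config would normally consume this base dataset config through `_base_`. The sketch below is only an illustration, not part of this change: the config file path, the model/schedule bases, and the head settings (920 classes for ImageNetS919 including background, ignore index 1000) are assumptions.

```python
# Illustrative only: the paths and head settings below are assumptions.
_base_ = [
    '../_base_/models/fcn_r50-d8.py',   # any existing model base
    '../_base_/datasets/imagenets.py',  # hypothetical location of the config above
    '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_80k.py'
]
# ImageNetS919 is assumed to expose 919 classes plus background; 1000 is the
# ignore index used by the RandomCrop/Pad steps in the pipeline above.
model = dict(
    decode_head=dict(num_classes=920, ignore_index=1000),
    auxiliary_head=dict(num_classes=920, ignore_index=1000))
```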


@ -155,6 +155,25 @@ mmsegmentation
│ │ │ ├── img
│ │ │ ├── mask
│ │ │ ├── split
│ ├── ImageNetS
│ │ ├── ImageNetS919
│ │ │ ├── train-semi
│ │ │ ├── train-semi-segmentation
│ │ │ ├── validation
│ │ │ ├── validation-segmentation
│ │ │ ├── test
│ │ ├── ImageNetS300
│ │ │ ├── train-semi
│ │ │ ├── train-semi-segmentation
│ │ │ ├── validation
│ │ │ ├── validation-segmentation
│ │ │ ├── test
│ │ ├── ImageNetS50
│ │ │ ├── train-semi
│ │ │ ├── train-semi-segmentation
│ │ │ ├── validation
│ │ │ ├── validation-segmentation
│ │ │ ├── test
```
### Cityscapes
@ -580,3 +599,31 @@ OCCLUDER_DATASET.IMG_DIR "path/to/jw93/mmsegmentation/data_materials/DTD/images"
```python
```
### ImageNetS
The ImageNet-S dataset is for [Large-scale unsupervised/semi-supervised semantic segmentation](https://arxiv.org/abs/2106.03149).
The images and annotations are available on [ImageNet-S](https://github.com/LUSSeg/ImageNet-S#imagenet-s-dataset-preparation).
```
│ ├── ImageNetS
│ │ ├── ImageNetS919
│ │ │ ├── train-semi
│ │ │ ├── train-semi-segmentation
│ │ │ ├── validation
│ │ │ ├── validation-segmentation
│ │ │ ├── test
│ │ ├── ImageNetS300
│ │ │ ├── train-semi
│ │ │ ├── train-semi-segmentation
│ │ │ ├── validation
│ │ │ ├── validation-segmentation
│ │ │ ├── test
│ │ ├── ImageNetS50
│ │ │ ├── train-semi
│ │ │ ├── train-semi-segmentation
│ │ │ ├── validation
│ │ │ ├── validation-segmentation
│ │ │ ├── test
```
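
After downloading, a short script like the following can confirm that each annotated split is in place. This is a minimal sketch; the `.JPEG`/`.png` extensions are assumptions based on the standard ImageNet/ImageNet-S release, so adjust them if your copy differs.

```python
# Sanity-check sketch; file extensions are assumptions about the release.
from pathlib import Path

root = Path('data/ImageNetS/ImageNetS919')
for img_dir, ann_dir in [('train-semi', 'train-semi-segmentation'),
                         ('validation', 'validation-segmentation')]:
    n_imgs = sum(1 for _ in (root / img_dir).rglob('*.JPEG'))
    n_anns = sum(1 for _ in (root / ann_dir).rglob('*.png'))
    print(f'{img_dir}: {n_imgs} images, {n_anns} masks')
    assert n_imgs == n_anns, f'image/mask count mismatch in {img_dir}'
```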


@ -119,6 +119,25 @@ mmsegmentation
│ │ ├── ann_dir
│ │ │ ├── train
│ │ │ ├── val
│ ├── ImageNetS
│ │ ├── ImageNetS919
│ │ │ ├── train-semi
│ │ │ ├── train-semi-segmentation
│ │ │ ├── validation
│ │ │ ├── validation-segmentation
│ │ │ ├── test
│ │ ├── ImageNetS300
│ │ │ ├── train-semi
│ │ │ ├── train-semi-segmentation
│ │ │ ├── validation
│ │ │ ├── validation-segmentation
│ │ │ ├── test
│ │ ├── ImageNetS50
│ │ │ ├── train-semi
│ │ │ ├── train-semi-segmentation
│ │ │ ├── validation
│ │ │ ├── validation-segmentation
│ │ │ ├── test
```
### Cityscapes
@ -317,3 +336,31 @@ python tools/convert_datasets/isaid.py /path/to/iSAID
```
Using our default configuration (`patch_width`=896, `patch_height`=896, `overlap_area`=384) will generate a training set of 33978 images and a validation set of 11644 images.
### ImageNetS
ImageNet-S is a dataset for the [large-scale unsupervised/semi-supervised semantic segmentation](https://arxiv.org/abs/2106.03149) task.
The ImageNet-S dataset is available at [ImageNet-S](https://github.com/LUSSeg/ImageNet-S#imagenet-s-dataset-preparation).
```
│ ├── ImageNetS
│ │ ├── ImageNetS919
│ │ │ ├── train-semi
│ │ │ ├── train-semi-segmentation
│ │ │ ├── validation
│ │ │ ├── validation-segmentation
│ │ │ ├── test
│ │ ├── ImageNetS300
│ │ │ ├── train-semi
│ │ │ ├── train-semi-segmentation
│ │ │ ├── validation
│ │ │ ├── validation-segmentation
│ │ │ ├── test
│ │ ├── ImageNetS50
│ │ │ ├── train-semi
│ │ │ ├── train-semi-segmentation
│ │ │ ├── validation
│ │ │ ├── validation-segmentation
│ │ │ ├── test
```


@ -11,6 +11,8 @@ from .dataset_wrappers import (ConcatDataset, MultiImageMixDataset,
from .drive import DRIVEDataset
from .face import FaceOccludedDataset
from .hrf import HRFDataset
from .imagenets import (ImageNetSDataset, LoadImageNetSAnnotations,
                        LoadImageNetSImageFromFile)
from .isaid import iSAIDDataset
from .isprs import ISPRSDataset
from .loveda import LoveDADataset
@ -27,5 +29,7 @@ __all__ = [
    'PascalContextDataset59', 'ChaseDB1Dataset', 'DRIVEDataset', 'HRFDataset',
    'STAREDataset', 'DarkZurichDataset', 'NightDrivingDataset',
    'COCOStuffDataset', 'LoveDADataset', 'MultiImageMixDataset',
    'iSAIDDataset', 'ISPRSDataset', 'PotsdamDataset', 'FaceOccludedDataset',
    'ImageNetSDataset', 'LoadImageNetSAnnotations',
    'LoadImageNetSImageFromFile'
]
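
With these exports in place, the new dataset and loading transforms are registered and can be built from plain config dicts. A minimal sketch, assuming mmseg 0.x and the data layout described above:

```python
# Minimal sketch, assuming mmseg 0.x and that the ImageNet-S data is prepared.
from mmcv.utils import build_from_cfg

from mmseg.datasets import build_dataset
from mmseg.datasets.builder import PIPELINES

# The new transforms are reachable by their registered type strings.
loader = build_from_cfg(
    dict(type='LoadImageNetSImageFromFile', downsample_large_image=True),
    PIPELINES)

# Likewise, ImageNetSDataset can be built from a config dict.
val_set = build_dataset(
    dict(
        type='ImageNetSDataset',
        subset=919,
        data_root='data/ImageNetS/ImageNetS919',
        img_dir='validation',
        ann_dir='validation-segmentation',
        pipeline=[
            dict(type='LoadImageNetSImageFromFile',
                 downsample_large_image=True)
        ]))
print(type(loader).__name__, len(val_set))
```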

mmseg/datasets/imagenets.py (new file, 1004 lines)

File diff suppressed because it is too large.