[Doc] Update transforms Doc
parent
eef38883c8
commit
89f6647d1f
|
@@ -1,5 +1,13 @@
|
|||
# Data Transforms
|
||||
|
||||
In this tutorial, we introduce the design of the data transforms pipeline in MMSegmentation.
|
||||
|
||||
The structure of this guide is as follows:
|
||||
|
||||
- [Data Transforms](#data-transforms)
|
||||
- [Design of Data pipelines](#design-of-data-pipelines)
|
||||
- [Customization data transformation](#customization-data-transformation)
|
||||
|
||||
## Design of Data pipelines
|
||||
|
||||
Following typical conventions, we use `Dataset` and `DataLoader` for data loading
|
||||
|
@@ -10,6 +18,24 @@ we introduce a new `DataContainer` type in MMCV to help collect and distribute
|
|||
data of different size.
|
||||
See [here](https://github.com/open-mmlab/mmcv/blob/master/mmcv/parallel/data_container.py) for more details.
|
||||
|
||||
In MMSegmentation 1.x, all data transformations inherit from `BaseTransform`.
|
||||
The input and output of a transformation are both dicts. A simple example is as follows:
|
||||
|
||||
```python
>>> from mmseg.datasets.transforms import LoadAnnotations
>>> transforms = LoadAnnotations()
>>> img_path = './data/cityscapes/leftImg8bit/train/aachen/aachen_000000_000019_leftImg8bit.png'
>>> gt_path = './data/cityscapes/gtFine/train/aachen/aachen_000015_000019_gtFine_instanceTrainIds.png'
>>> results = dict(
...     img_path=img_path,
...     seg_map_path=gt_path,
...     reduce_zero_label=False,
...     seg_fields=[])
>>> data_dict = transforms(results)
>>> print(data_dict.keys())
dict_keys(['img_path', 'seg_map_path', 'reduce_zero_label', 'seg_fields', 'gt_seg_map'])
```
|
||||
|
||||
The data preparation pipeline and the dataset are decoupled. Usually, a dataset
defines how to process the annotations, and a data pipeline defines all the steps to prepare a data dict.
|
||||
A pipeline consists of a sequence of operations. Each operation takes a dict as input and outputs a dict for the next transform.
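
The dict-in, dict-out chaining described above can be sketched in plain Python. This is only an illustration: `load_image`, `crop_left`, and `run_pipeline` are hypothetical stand-ins, not MMSegmentation or MMCV APIs, and nested lists stand in for image arrays.

```python
from typing import Callable, Dict, List

Transform = Callable[[Dict], Dict]


def load_image(results: Dict) -> Dict:
    # Hypothetical loading step: adds 'img' and 'img_shape'.
    results['img'] = [[0, 1, 2], [3, 4, 5]]
    results['img_shape'] = (2, 3)
    return results


def crop_left(results: Dict) -> Dict:
    # Hypothetical pre-processing step: updates 'img' and 'img_shape'.
    results['img'] = [row[:2] for row in results['img']]
    results['img_shape'] = (len(results['img']), 2)
    return results


def run_pipeline(results: Dict, transforms: List[Transform]) -> Dict:
    # Feed the dict returned by each step into the next one.
    for transform in transforms:
        results = transform(results)
    return results


data = run_pipeline({'img_path': 'demo.png'}, [load_image, crop_left])
print(data['img_shape'])  # (2, 2)
```

Each step only reads and writes keys of the shared dict, which is why steps can be reordered or swapped in a config without touching the dataset code.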
|
||||
|
@@ -43,47 +69,104 @@ test_pipeline = [
|
|||
]
|
||||
```
|
||||
|
||||
For each operation, we list the related dict fields that are added/updated/removed.
|
||||
Before pipelines, the information we can directly obtain from the datasets are img_path, seg_map_path.
|
||||
For each operation, we list the related dict fields that are `added`/`updated`/`removed`.
|
||||
Before the pipeline runs, the only information we can obtain directly from the dataset is `img_path` and `seg_map_path`.
|
||||
|
||||
### Data loading
|
||||
|
||||
`LoadImageFromFile`
|
||||
`LoadImageFromFile`: Load an image from file.
|
||||
|
||||
- add: img, img_shape, ori_shape
|
||||
- add: `img`, `img_shape`, `ori_shape`
|
||||
|
||||
`LoadAnnotations`
|
||||
`LoadAnnotations`: Load semantic segmentation maps provided by dataset.
|
||||
|
||||
- add: seg_fields, gt_seg_map
|
||||
- add: `seg_fields`, `gt_seg_map`
|
||||
|
||||
### Pre-processing
|
||||
|
||||
`RandomResize`
|
||||
`RandomResize`: Random resize image & segmentation map.
|
||||
|
||||
- add: scale, scale_factor, keep_ratio
|
||||
- update: img, img_shape, gt_seg_map
|
||||
- add: `scale`, `scale_factor`, `keep_ratio`
|
||||
- update: `img`, `img_shape`, `gt_seg_map`
|
||||
|
||||
`Resize`
|
||||
`Resize`: Resize image & segmentation map.
|
||||
|
||||
- add: scale, scale_factor, keep_ratio
|
||||
- update: img, gt_seg_map, img_shape
|
||||
- add: `scale`, `scale_factor`, `keep_ratio`
|
||||
- update: `img`, `gt_seg_map`, `img_shape`
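
To make the added and updated keys concrete, here is a minimal sketch of a resize step on nested lists. It is not the library's `Resize`: `resize_sketch` is a hypothetical helper, nearest-neighbour sampling is chosen only for brevity, and the `(width, height)` ordering of `scale` and `scale_factor` is an assumption.

```python
def resize_sketch(results: dict, scale: tuple) -> dict:
    # Assumption: scale is given as (width, height).
    new_w, new_h = scale
    old_h, old_w = len(results['img']), len(results['img'][0])

    def nearest(grid):
        # Nearest-neighbour sampling; keeps label values intact for seg maps.
        return [[grid[r * old_h // new_h][c * old_w // new_w]
                 for c in range(new_w)]
                for r in range(new_h)]

    # Updated keys.
    results['img'] = nearest(results['img'])
    results['gt_seg_map'] = nearest(results['gt_seg_map'])
    results['img_shape'] = (new_h, new_w)
    # Added keys.
    results['scale'] = scale
    results['scale_factor'] = (new_w / old_w, new_h / old_h)
    results['keep_ratio'] = False
    return results


results = {
    'img': [[1, 2], [3, 4]],
    'gt_seg_map': [[0, 1], [1, 0]],
    'img_shape': (2, 2),
}
results = resize_sketch(results, scale=(4, 2))
print(results['img'])        # [[1, 1, 2, 2], [3, 3, 4, 4]]
print(results['img_shape'])  # (2, 4)
```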
|
||||
|
||||
`RandomCrop`
|
||||
`RandomCrop`: Random crop image & segmentation map.
|
||||
|
||||
- update: img, pad_shape, gt_seg_map
|
||||
- update: `img`, `gt_seg_map`, `img_shape`
|
||||
|
||||
`RandomFlip`
|
||||
`RandomFlip`: Flip the image & segmentation map.
|
||||
|
||||
- add: flip, flip_direction
|
||||
- update: img, gt_seg_map
|
||||
- add: `flip`, `flip_direction`
|
||||
- update: `img`, `gt_seg_map`
|
||||
|
||||
`PhotoMetricDistortion`
|
||||
`PhotoMetricDistortion`: Apply photometric distortions to the image sequentially.
Each distortion is applied with a probability of 0.5.
Random contrast is applied either second (mode 0) or last (mode 1) in the sequence listed below.
|
||||
|
||||
- update: img
|
||||
```
1. random brightness
2. random contrast (mode 0)
3. convert color from BGR to HSV
4. random saturation
5. random hue
6. convert color from HSV to BGR
7. random contrast (mode 1)
```
|
||||
|
||||
- update: `img`
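
The ordering above can be sketched as follows. This only models which steps run and in what order, not the pixel operations. `distortion_order` and `sample_applied_steps` are hypothetical helpers, and drawing the mode uniformly from {0, 1} is an assumption.

```python
import random


def distortion_order(mode: int) -> list:
    # Contrast is placed second (mode 0) or last (mode 1).
    steps = ['brightness']
    if mode == 0:
        steps.append('contrast')
    steps += ['bgr_to_hsv', 'saturation', 'hue', 'hsv_to_bgr']
    if mode == 1:
        steps.append('contrast')
    return steps


def sample_applied_steps(rng: random.Random) -> list:
    # Assumption: the mode is drawn uniformly from {0, 1} per image.
    mode = rng.randint(0, 1)
    random_steps = {'brightness', 'contrast', 'saturation', 'hue'}
    applied = []
    for step in distortion_order(mode):
        # Color-space conversions always run; each random step runs
        # with probability 0.5.
        if step not in random_steps or rng.random() < 0.5:
            applied.append(step)
    return applied


print(distortion_order(0))
print(distortion_order(1))
print(sample_applied_steps(random.Random(0)))
```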
|
||||
|
||||
### Formatting
|
||||
|
||||
`PackSegInputs`
|
||||
`PackSegInputs`: Pack the input data for semantic segmentation.
|
||||
|
||||
- add: inputs, data_sample
|
||||
- add: `inputs`, `data_sample`
|
||||
- remove: keys specified by `meta_keys` (merged into the metainfo of data_sample), all other keys
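
The packing step can be pictured with the sketch below. It is an assumption-laden illustration rather than the library's `PackSegInputs`: `pack_sketch` is hypothetical, plain dicts stand in for the real data sample structure, and the default `meta_keys` are chosen only for the example.

```python
def pack_sketch(results: dict,
                meta_keys=('img_path', 'ori_shape', 'img_shape')) -> dict:
    # Keys named in meta_keys are merged into the sample's metainfo.
    data_sample = {
        'metainfo': {k: results[k] for k in meta_keys if k in results},
    }
    if 'gt_seg_map' in results:
        data_sample['gt_seg_map'] = results['gt_seg_map']
    # Only the two packed keys are returned; every other key is dropped.
    return {'inputs': results['img'], 'data_sample': data_sample}


results = {
    'img_path': 'demo.png',
    'img': [[1, 2], [3, 4]],
    'img_shape': (2, 2),
    'ori_shape': (2, 2),
    'gt_seg_map': [[0, 1], [1, 0]],
    'flip': False,
}
packed = pack_sketch(results)
print(sorted(packed.keys()))  # ['data_sample', 'inputs']
```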
|
||||
|
||||
## Customization data transformation
|
||||
|
||||
A customized data transformation must inherit from `BaseTransform` and implement the `transform` function.
Here we use a simple flipping transformation as an example:
|
||||
|
||||
```python
import random
import mmcv
from mmcv.transforms import BaseTransform, TRANSFORMS

@TRANSFORMS.register_module()
class MyFlip(BaseTransform):
    def __init__(self, direction: str):
        super().__init__()
        self.direction = direction

    def transform(self, results: dict) -> dict:
        img = results['img']
        results['img'] = mmcv.imflip(img, direction=self.direction)
        return results
```
|
||||
|
||||
Thus, we can instantiate a `MyFlip` object and use it to process the data dict.
|
||||
|
||||
```python
import numpy as np

transform = MyFlip(direction='horizontal')
data_dict = {'img': np.random.rand(224, 224, 3)}
data_dict = transform(data_dict)
processed_img = data_dict['img']
```
|
||||
|
||||
Alternatively, we can use the `MyFlip` transformation in the data pipeline of our config file.
|
||||
|
||||
```python
pipeline = [
    ...
    dict(type='MyFlip', direction='horizontal'),
    ...
]
```
|
||||
|
||||
Note that if you want to use `MyFlip` in config, you must ensure the file containing `MyFlip` is imported during the program run.
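
One way to make sure the module is imported is the `custom_imports` field of an MMEngine-style config, sketched below. The module path `my_project.my_transforms` is hypothetical; replace it with the module that actually defines `MyFlip`, and make sure that module is importable from the working directory or `PYTHONPATH`.

```python
# Config fragment. 'my_project.my_transforms' is a hypothetical module path
# standing in for the file that defines and registers MyFlip.
custom_imports = dict(
    imports=['my_project.my_transforms'],
    allow_failed_imports=False)
```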
|
||||
|
|