[Docs] add advanced_guides/transforms.md (#463)

* modify benchmark docs

* add transforms.md

* modify wrappers.py

* Update transforms.md

* update
pull/567/head
Wangbo Zhao(黑色枷锁) 2022-09-08 14:11:06 +08:00 committed by Yixiao Fang
parent 52051d5c2d
commit 3bcdcbe9f4
2 changed files with 86 additions and 1 deletions


@@ -1 +1,85 @@
# Transforms
- [Transforms](#transforms)
- [Overview of transforms](#overview-of-transforms)
- [Introduction of `MultiView`](#introduction-of-multiview)
- [Introduction of `PackSelfSupInputs`](#introduction-of-packselfsupinputs)
## Overview of transforms
We have introduced how to build a `Pipeline` in [add_transforms](./add_transforms.md). A `Pipeline` contains a series of
`transforms`. There are three main categories of `transforms` in MMSelfSup:
1. Transforms for processing the data. The transforms unique to MMSelfSup are defined in [processing.py](https://github.com/open-mmlab/mmselfsup/blob/1.x/mmselfsup/datasets/transforms/processing.py), e.g. `RandomCrop`, `RandomResizedCrop` and `RandomGaussianBlur`.
We may also use transforms from other repositories, e.g. `LoadImageFromFile` from MMCV.
2. The transform wrapper for multiple views of an image. It is defined in [wrappers.py](https://github.com/open-mmlab/mmselfsup/blob/1.x/mmselfsup/datasets/transforms/wrappers.py).
3. The transform to pack data into a format compatible with the inputs of the algorithm. It is defined in [formatting.py](https://github.com/open-mmlab/mmselfsup/blob/1.x/mmselfsup/datasets/transforms/formatting.py).
In summary, we implement the `transforms` listed below. The last two will be introduced in detail.
| class | function |
| :-------------------------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------: |
| [`BEiTMaskGenerator`](mmselfsup.datasets.BEiTMaskGenerator) | Generate a mask for the image, as in `BEiT` |
| [`SimMIMMaskGenerator`](mmselfsup.datasets.SimMIMMaskGenerator) | Generate a random block mask for each image, as in `SimMIM` |
| [`ColorJitter`](mmselfsup.datasets.ColorJitter) | Randomly change the brightness, contrast, saturation and hue of an image |
| [`RandomCrop`](mmselfsup.datasets.RandomCrop) | Crop the given image at a random location |
| [`RandomGaussianBlur`](mmselfsup.datasets.RandomGaussianBlur) | Gaussian blur augmentation, as in `SimCLR` |
| [`RandomResizedCrop`](mmselfsup.datasets.RandomResizedCrop) | Crop the given image to a random size and aspect ratio |
| [`RandomResizedCropAndInterpolationWithTwoPic`](mmselfsup.datasets.RandomResizedCropAndInterpolationWithTwoPic) | Crop the given PIL Image to a random size and aspect ratio with random interpolation |
| [`RandomSolarize`](mmselfsup.datasets.RandomSolarize) | Solarization augmentation, as in `BYOL` |
| [`RotationWithLabels`](mmselfsup.datasets.RotationWithLabels) | Rotate the image and record the rotation label for rotation prediction |
| [`RandomPatchWithLabels`](mmselfsup.datasets.RandomPatchWithLabels) | Apply random patch augmentation to the given image |
| [`RandomRotation`](mmselfsup.datasets.RandomRotation) | Rotate the image by a random angle |
| [`MultiView`](mmselfsup.datasets.transforms.MultiView) | A wrapper for algorithms with multi-view image inputs |
| [`PackSelfSupInputs`](mmselfsup.datasets.PackSelfSupInputs) | Pack data into a format compatible with the inputs of an algorithm |
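Putting the three categories together, a typical training pipeline first processes the data, then wraps it into multiple views, and finally packs it. Here is a minimal config sketch; the transform parameter values are illustrative assumptions, not settings from any particular config file:

```python
# Category 1: per-view processing transforms (parameter values below are
# illustrative assumptions, not taken from a real MMSelfSup config).
view_pipeline = [
    dict(type='RandomResizedCrop', scale=224),
    dict(type='ColorJitter', brightness=0.4, contrast=0.4,
         saturation=0.4, hue=0.1),
]

train_pipeline = [
    dict(type='LoadImageFromFile'),        # load the raw image
    dict(type='MultiView', num_views=2,    # category 2: multi-view wrapper
         transforms=[view_pipeline]),
    dict(type='PackSelfSupInputs',         # category 3: pack for the algorithm
         meta_keys=['img_path']),
]
```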
## Introduction of `MultiView`
We build a wrapper named [`MultiView`](mmselfsup.datasets.transforms.MultiView) for algorithms with multi-view image inputs, e.g. MoCo, SimCLR and SwAV. In the config file, we can
define it as:
```python
pipeline = [
    dict(type='MultiView',
         num_views=2,
         transforms=[
             [dict(type='Resize', scale=224)],
         ])
]
```
This means that there are two views in the pipeline.
We can also define a pipeline with different numbers of views:
```python
pipeline = [
    dict(type='MultiView',
         num_views=[2, 6],
         transforms=[
             [dict(type='Resize', scale=224)],
             [dict(type='Resize', scale=224),
              dict(type='RandomSolarize')],
         ])
]
```
This means that there are two pipelines, which contain 2 views and 6 views, respectively.
More examples can be found in [imagenet_mocov1.py](https://github.com/open-mmlab/mmselfsup/blob/1.x/configs/selfsup/_base_/datasets/imagenet_mocov1.py), [imagenet_mocov2.py](https://github.com/open-mmlab/mmselfsup/blob/1.x/configs/selfsup/_base_/datasets/imagenet_mocov2.py) and [imagenet_swav_mcrop-2-6.py](https://github.com/open-mmlab/mmselfsup/blob/1.x/configs/selfsup/_base_/datasets/imagenet_swav_mcrop-2-6.py) etc.
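Conceptually, the wrapper applies each sub-pipeline its given number of times and collects all resulting views into a list. The following is a simplified stand-alone sketch of that idea, not the actual `MultiView` implementation:

```python
class SimpleMultiView:
    """Toy sketch of the multi-view wrapper idea (illustrative only).

    ``num_views`` is an int or a list of ints; ``transforms`` is a list of
    sub-pipelines, each a list of callables applied in order.
    """

    def __init__(self, num_views, transforms):
        if isinstance(num_views, int):
            num_views = [num_views]
        assert len(num_views) == len(transforms)
        self.num_views = num_views
        self.pipelines = [self._compose(t) for t in transforms]

    @staticmethod
    def _compose(transforms):
        # Chain the callables of one sub-pipeline into a single function.
        def pipeline(img):
            for t in transforms:
                img = t(img)
            return img
        return pipeline

    def __call__(self, img):
        # Apply the i-th sub-pipeline num_views[i] times; gather all views.
        views = []
        for n, pipeline in zip(self.num_views, self.pipelines):
            views.extend(pipeline(img) for _ in range(n))
        return views
```

With `num_views=[2, 6]`, calling the wrapper on one image returns a list of 8 views, matching the 2-view / 6-view split described above.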
## Introduction of `PackSelfSupInputs`
We build a class named [`PackSelfSupInputs`](mmselfsup.datasets.transforms.PackSelfSupInputs) to pack data into a format compatible with the inputs of an algorithm. This transform
is usually placed at the end of the pipeline:
```python
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='MultiView', num_views=2, transforms=[view_pipeline]),
dict(type='PackSelfSupInputs', meta_keys=['img_path'])
]
```
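Conceptually, the packing step gathers the image view(s) under a single inputs field and copies the requested `meta_keys` into the sample's meta information. The function below is a hypothetical sketch of that behavior, not the actual `PackSelfSupInputs` implementation (which produces MMEngine data samples):

```python
def pack_selfsup_inputs(results, meta_keys=('img_path',)):
    """Toy packing step; ``results`` is the dict built by earlier transforms."""
    imgs = results['img']
    if not isinstance(imgs, list):  # single-view pipelines yield one image
        imgs = [imgs]
    # Keep only the requested meta information alongside the inputs.
    metainfo = {k: results[k] for k in meta_keys if k in results}
    return {'inputs': imgs, 'data_samples': metainfo}
```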


@@ -37,7 +37,8 @@ class MultiView(BaseTransform):
>>> dict(type='Resize', scale=224))],
>>> ])
>>> ]
>>> # Example 2: MultiViews 2 pipeline with 6 views
>>> # Example 2: MultiViews 2 pipelines, the first with 2 views,
>>> # the second with 6 views
>>> pipeline = [
>>> dict(type='MultiView',
>>> num_views=[2, 6],