[Docs] Add custom pipeline docs. (#1124)
* [Docs] Add custom pipeline docs. * Fix link. * Fix according to comments
# Customize Data Pipeline

## Design of Data Pipelines

In the [new dataset tutorial](./datasets.md), we learned that the dataset class uses the `load_data_list` method
to initialize the entire dataset, and that the information of every sample is saved in a dict.

Usually, to save memory, we only load image paths and labels in `load_data_list`, and load the full
image content when the sample is actually used. Moreover, we may want to do some random data augmentation when
picking samples during training. Almost all data loading, pre-processing, and formatting operations can be
configured in MMClassification through the **data pipeline**.

The data pipeline defines how to process the sample dict when indexing a sample from the dataset. It
consists of a sequence of data transforms. Each data transform takes a dict as input, processes it, and outputs a
dict for the next data transform.
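
Conceptually, indexing the dataset just threads the sample dict through these transforms in order. Below is a
minimal sketch of this dict-in, dict-out contract; it is not MMClassification's actual implementation, which
builds the transform objects from the registry and composes them.

```python
def run_pipeline(transforms, results):
    """Sketch: apply a sequence of transforms to a sample dict."""
    for transform in transforms:
        # Each transform reads some keys of `results` and adds/updates others.
        results = transform(results)
        # By convention, a transform may return None to drop the sample.
        if results is None:
            return None
    return results
```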

Here is a data pipeline example for ResNet-50 training on ImageNet.

```python
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='RandomResizedCrop', scale=224),
    dict(type='RandomFlip', prob=0.5, direction='horizontal'),
    dict(type='PackClsInputs'),
]
```
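
In the config file, this list is plugged into the dataset settings. A typical fragment looks like the
following; the dataset fields are abbreviated here and depend on your dataset, so treat this as a sketch.

```python
train_dataloader = dict(
    batch_size=32,
    num_workers=5,
    dataset=dict(
        type='ImageNet',
        data_root='data/imagenet',
        # The data pipeline defined above.
        pipeline=train_pipeline,
    ),
    sampler=dict(type='DefaultSampler', shuffle=True),
)
```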

All available data transforms in MMClassification can be found in the [data transforms docs](mmcls.datasets.transforms).

## Modify the training/test pipeline

The data pipeline in MMClassification is quite flexible: you can control almost every step of the data
preprocessing from the config file. On the other hand, the number of options may be confusing at first.

Here is a common practice and guidance for image classification tasks.

### Loading

At the beginning of a data pipeline, we usually need to load image data from the file path.
[`LoadImageFromFile`](mmcv.transforms.LoadImageFromFile) is commonly used for this task.

```python
train_pipeline = [
    dict(type='LoadImageFromFile'),
    ...
]
```

If you want to load data from files with special formats or from special locations, you can [implement a new
loading transform](#add-new-data-transforms) and add it at the beginning of the data pipeline, as sketched below.
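
For instance, here is a hypothetical loading transform for images stored as NumPy `.npy` files. The class name
and file layout are made up for illustration; only the dict keys follow the usual MMCV loading convention.

```python
import numpy as np
from mmcv.transforms import BaseTransform

from mmcls.datasets import TRANSFORMS


@TRANSFORMS.register_module()
class LoadImageFromNpy(BaseTransform):
    """Hypothetical example: load an image stored as a ``.npy`` array."""

    def transform(self, results):
        # The dataset prepares the file path; we read the raw array from it.
        img = np.load(results['img_path'])
        results['img'] = img
        results['img_shape'] = img.shape[:2]
        results['ori_shape'] = img.shape[:2]
        return results
```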

### Augmentation and other processing

During training, we usually need to do data augmentation to avoid overfitting. During testing, we also need to do
some data processing like resizing and cropping. These data transforms are placed after the loading process.

Here is a simple data augmentation recipe. It randomly resizes and crops the input image to the
specified scale, and randomly flips the image horizontally with a given probability.

```python
train_pipeline = [
    ...
    dict(type='RandomResizedCrop', scale=224),
    dict(type='RandomFlip', prob=0.5, direction='horizontal'),
    ...
]
```

Here is a heavy data augmentation recipe used in [Swin-Transformer](../papers/swin_transformer.md)
training. To align with the official implementation, it specifies `pillow` as the resize backend and `bicubic`
as the resize algorithm. Moreover, it adds [`RandAugment`](mmcls.datasets.transforms.RandAugment) and
[`RandomErasing`](mmcls.datasets.transforms.RandomErasing) as extra data augmentation methods.

This configuration specifies every detail of the data augmentation, so you can simply copy it to your own
config file to apply the data augmentations of the Swin-Transformer.

```python
bgr_mean = [103.53, 116.28, 123.675]
bgr_std = [57.375, 57.12, 58.395]

train_pipeline = [
    ...
    dict(type='RandomResizedCrop', scale=224, backend='pillow', interpolation='bicubic'),
    dict(type='RandomFlip', prob=0.5, direction='horizontal'),
    dict(
        type='RandAugment',
        policies='timm_increasing',
        num_policies=2,
        total_level=10,
        magnitude_level=9,
        magnitude_std=0.5,
        hparams=dict(
            pad_val=[round(x) for x in bgr_mean], interpolation='bicubic')),
    dict(
        type='RandomErasing',
        erase_prob=0.25,
        mode='rand',
        min_area_ratio=0.02,
        max_area_ratio=1 / 3,
        fill_color=bgr_mean,
        fill_std=bgr_std),
    ...
]
```

```{note}
Usually, the data augmentation part in the data pipeline handles only image-wise transforms, not transforms
like image normalization or mixup/cutmix. This is because image normalization and mixup/cutmix can run on batched
data, which is faster. To configure image normalization and mixup/cutmix, please use the
[data preprocessor](mmcls.models.utils.data_preprocessor).
```
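
As a sketch, those batch-level settings usually live in config fields like the following. The field names follow
common MMClassification 1.x configs and may differ between versions, so double-check them against your install.

```python
# Batch-level normalization, done on the device instead of in the pipeline.
data_preprocessor = dict(
    num_classes=1000,
    mean=[123.675, 116.28, 103.53],
    std=[58.395, 57.12, 57.375],
    to_rgb=True,  # convert loaded BGR images to RGB
)

model = dict(
    # ... backbone, neck and head settings ...
    train_cfg=dict(augments=[
        dict(type='Mixup', alpha=0.8),
        dict(type='CutMix', alpha=1.0),
    ]),
)
```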

### Formatting

Formatting collects the training data from the data information dict and converts it to a
model-friendly format.

In most cases, you can simply use [`PackClsInputs`](mmcls.datasets.transforms.PackClsInputs). It
converts the image from a NumPy array to a PyTorch tensor, and packs the ground-truth category and
other meta information into a [`ClsDataSample`](mmcls.structures.ClsDataSample).

```python
train_pipeline = [
    ...
    dict(type='PackClsInputs'),
]
```
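
The packed sample that finally reaches the model looks roughly like the sketch below; the key names are those
used by MMClassification 1.x, but treat the exact shapes and fields as an assumption to verify.

```python
# Sketch of the output of PackClsInputs for one sample:
#
#   {
#       'inputs': <torch.Tensor of shape (C, H, W)>,    # the image data
#       'data_samples': <ClsDataSample>,                # gt_label + meta info
#   }
```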

## Add new data transforms

1. Write a new data transform in any file, e.g., `my_transform.py`, and place it in
   the folder `mmcls/datasets/transforms/`. The data transform class needs to inherit from
   the [`mmcv.transforms.BaseTransform`](mmcv.transforms.BaseTransform) class and override
   the `transform` method, which takes a dict as input and returns a dict.

   ```python
   from mmcv.transforms import BaseTransform

   from mmcls.datasets import TRANSFORMS


   @TRANSFORMS.register_module()
   class MyTransform(BaseTransform):

       def transform(self, results):
           # Modify the data information dict `results`,
           # e.g., read and update results['img'].
           return results
   ```

2. Import the new class in `mmcls/datasets/transforms/__init__.py`.

   ```python
   ...
   from .my_transform import MyTransform

   __all__ = [
       ..., 'MyTransform'
   ]
   ```

3. Use it in config files.

   ```python
   train_pipeline = [
       dict(type='LoadImageFromFile'),
       ...
       dict(type='MyTransform'),
       ...
   ]
   ```
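
If you would rather keep the transform outside the mmcls package, MMEngine-style configs can import your module
at runtime instead of step 2. This assumes `my_transform.py` is importable from your working directory.

```python
custom_imports = dict(imports=['my_transform'], allow_failed_imports=False)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    ...
    dict(type='MyTransform'),
    ...
]
```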

# Customize Data Pipeline (to be updated)

Please refer to the [English documentation](https://mmclassification.readthedocs.io/en/dev-1.x/advanced_guides/pipeline.html). If you are
interested in contributing to the Chinese translation, you are welcome to sign up in the
[discussion](https://github.com/open-mmlab/mmclassification/discussions/1027).