[Docs] Add custom pipeline docs. (#1124)
* [Docs] Add custom pipeline docs.
* Fix link.
* Fix according to comments
pull/1135/merge
parent cccbedf22d
commit 280e916979
|
@ -1,127 +1,155 @@
|
||||||
# Customize Data Pipeline (TODO)
|
# Customize Data Pipeline
|
||||||
|
|
||||||
## Design of Data pipelines
|
## Design of Data pipelines
|
||||||
|
|
||||||
Following typical conventions, we use `Dataset` and `DataLoader` for data loading
|
In the [new dataset tutorial](./datasets.md), we saw that the dataset class uses the `load_data_list` method
|
||||||
with multiple workers. Indexing `Dataset` returns a dict of data items corresponding to
|
to initialize the entire dataset, and we save the information of every sample to a dict.
|
||||||
the arguments of the model's `forward` method.
|
|
||||||
|
|
||||||
The data preparation pipeline and the dataset are decoupled. Usually, a dataset
|
Usually, to save memory, we load only image paths and labels in `load_data_list`, and load the full
|
||||||
defines how to process the annotations and a data pipeline defines all the steps to prepare a data dict.
|
image content only when a sample is used. Moreover, we may want to apply random data augmentation when picking
|
||||||
A pipeline consists of a sequence of operations. Each operation takes a dict as input and outputs a dict for the next transform.
|
samples when training. Almost all data loading, pre-processing, and formatting operations can be configured in
|
||||||
|
MMClassification by the **data pipeline**.
|
||||||
|
|
||||||
The operations are categorized into data loading, pre-processing and formatting.
|
The data pipeline defines how the sample dict is processed when a sample is indexed from the dataset. It
|
||||||
|
consists of a sequence of data transforms. Each data transform takes a dict as input, processes it, and outputs a
|
||||||
|
dict for the next data transform.
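The dict-in, dict-out contract described above can be sketched in plain Python. This is only an illustration under stated assumptions: `load_image`, `random_flip`, and `run_pipeline` are hypothetical stand-ins rather than MMClassification APIs, and a small nested list stands in for real image data.

```python
import random


def load_image(results):
    # Hypothetical loader: a 2x3 nested list stands in for decoded pixels.
    results['img'] = [[1, 2, 3], [4, 5, 6]]
    results['img_shape'] = (2, 3)
    return results


def random_flip(results, prob=0.5, rng=random):
    # Flip every row horizontally with probability `prob`.
    if rng.random() < prob:
        results['img'] = [row[::-1] for row in results['img']]
        results['flip'] = True
    else:
        results['flip'] = False
    return results


def run_pipeline(transforms, results):
    # Each transform receives the dict returned by the previous one.
    for transform in transforms:
        results = transform(results)
    return results


sample = {'img_path': 'demo.jpg', 'gt_label': 3}
# prob=1.0 makes the flip deterministic for this demonstration.
out = run_pipeline([load_image, lambda r: random_flip(r, prob=1.0)], dict(sample))
```

Keys that a transform does not touch, such as `gt_label` here, pass through unchanged, which is what lets later steps rely on information added by earlier ones.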
|
||||||
|
|
||||||
Here is a pipeline example for ResNet-50 training on ImageNet.
|
Here is a data pipeline example for ResNet-50 training on ImageNet.
|
||||||
|
|
||||||
```python
|
```python
|
||||||
img_norm_cfg = dict(
|
|
||||||
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
|
|
||||||
train_pipeline = [
|
train_pipeline = [
|
||||||
dict(type='LoadImageFromFile'),
|
dict(type='LoadImageFromFile'),
|
||||||
dict(type='RandomResizedCrop', size=224),
|
dict(type='RandomResizedCrop', scale=224),
|
||||||
dict(type='RandomFlip', flip_prob=0.5, direction='horizontal'),
|
dict(type='RandomFlip', prob=0.5, direction='horizontal'),
|
||||||
dict(type='Normalize', **img_norm_cfg),
|
dict(type='PackClsInputs'),
|
||||||
dict(type='ImageToTensor', keys=['img']),
|
|
||||||
dict(type='ToTensor', keys=['gt_label']),
|
|
||||||
dict(type='Collect', keys=['img', 'gt_label'])
|
|
||||||
]
|
|
||||||
test_pipeline = [
|
|
||||||
dict(type='LoadImageFromFile'),
|
|
||||||
dict(type='Resize', scale=256),
|
|
||||||
dict(type='CenterCrop', crop_size=224),
|
|
||||||
dict(type='Normalize', **img_norm_cfg),
|
|
||||||
dict(type='ImageToTensor', keys=['img']),
|
|
||||||
dict(type='Collect', keys=['img'])
|
|
||||||
]
|
]
|
||||||
```
|
```
|
||||||
|
|
||||||
For each operation, we list the related dict fields that are added/updated/removed.
|
All available data transforms in MMClassification can be found in the [data transforms docs](mmcls.datasets.transforms).
|
||||||
At the end of the pipeline, we use `Collect` to only retain the necessary items for forward computation.
|
|
||||||
|
|
||||||
### Data loading
|
## Modify the training/test pipeline
|
||||||
|
|
||||||
`LoadImageFromFile`
|
The data pipeline in MMClassification is pretty flexible. You can control almost every step of the data
|
||||||
|
preprocessing from the config file, but on the other hand, the number of options can be confusing.
|
||||||
|
|
||||||
- add: img, img_shape, ori_shape
|
Here are some common practices and guidance for image classification tasks.
|
||||||
|
|
||||||
By default, `LoadImageFromFile` loads images from disk, but this may lead to an IO bottleneck for efficient small models.
|
### Loading
|
||||||
Various backends are supported by mmcv to accelerate this process. For example, if the training machines have set up
|
|
||||||
[memcached](https://memcached.org/), we can revise the config as follows.
|
|
||||||
|
|
||||||
```
|
At the beginning of a data pipeline, we usually need to load image data from the file path.
|
||||||
memcached_root = '/mnt/xxx/memcached_client/'
|
[`LoadImageFromFile`](mmcv.transforms.LoadImageFromFile) is commonly used for this task.
|
||||||
|
|
||||||
|
```python
|
||||||
train_pipeline = [
|
train_pipeline = [
|
||||||
|
dict(type='LoadImageFromFile'),
|
||||||
|
...
|
||||||
|
]
|
||||||
|
```
|
||||||
|
|
||||||
|
If you want to load data from files with special formats or special locations, you can [implement a new loading
|
||||||
|
transform](#add-new-data-transforms) and add it at the beginning of the data pipeline.
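As a rough sketch of what such a loading step might look like, the class below reads raw bytes from a path stored in the sample dict. Everything here is an assumption for illustration: the class name `LoadBytesFromFile`, the `img_path` input key, and the `img_bytes` and `num_bytes` output keys are hypothetical. A real transform would inherit from `BaseTransform` and be registered as described in the "Add new data transforms" section.

```python
import os
import tempfile


class LoadBytesFromFile:
    """Hypothetical loading step: reads raw bytes instead of decoding an image."""

    def transform(self, results):
        # `img_path` is assumed to be the key that holds the file location.
        with open(results['img_path'], 'rb') as f:
            results['img_bytes'] = f.read()
        results['num_bytes'] = len(results['img_bytes'])
        return results

    # Make instances callable like any other pipeline step.
    __call__ = transform


# Usage: write a tiny file, then load it through the transform.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.bin')
    with open(path, 'wb') as f:
        f.write(b'\x00\x01\x02\x03')
    loaded = LoadBytesFromFile()({'img_path': path, 'gt_label': 0})
```

Because the step only adds keys to the dict, it can sit at the beginning of a pipeline without affecting the transforms that follow.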
|
||||||
|
|
||||||
|
### Augmentation and other processing
|
||||||
|
|
||||||
|
During training, we usually need data augmentation to avoid overfitting. During testing, we also need to do
|
||||||
|
some data processing like resizing and cropping. These data transforms will be placed after the loading process.
|
||||||
|
|
||||||
|
Here is a simple data augmentation recipe example. It will randomly resize and crop the input image to the
|
||||||
|
specified scale, and randomly flip the image horizontally with the given probability.
|
||||||
|
|
||||||
|
```python
|
||||||
|
train_pipeline = [
|
||||||
|
...
|
||||||
|
dict(type='RandomResizedCrop', scale=224),
|
||||||
|
dict(type='RandomFlip', prob=0.5, direction='horizontal'),
|
||||||
|
...
|
||||||
|
]
|
||||||
|
```
|
||||||
|
|
||||||
|
Here is a heavy data augmentation recipe example used in [Swin-Transformer](../papers/swin_transformer.md)
|
||||||
|
training. To align with the official implementation, it specifies `pillow` as the resize backend and `bicubic`
|
||||||
|
as the resize algorithm. Moreover, it adds [`RandAugment`](mmcls.datasets.transforms.RandAugment) and
|
||||||
|
[`RandomErasing`](mmcls.datasets.transforms.RandomErasing) as extra data augmentation methods.
|
||||||
|
|
||||||
|
This configuration specifies every detail of the data augmentation, and you can simply copy it to your own
|
||||||
|
config file to apply the data augmentations of the Swin-Transformer.
|
||||||
|
|
||||||
|
```python
|
||||||
|
bgr_mean = [103.53, 116.28, 123.675]
|
||||||
|
bgr_std = [57.375, 57.12, 58.395]
|
||||||
|
|
||||||
|
train_pipeline = [
|
||||||
|
...
|
||||||
|
dict(type='RandomResizedCrop', scale=224, backend='pillow', interpolation='bicubic'),
|
||||||
|
dict(type='RandomFlip', prob=0.5, direction='horizontal'),
|
||||||
dict(
|
dict(
|
||||||
type='LoadImageFromFile',
|
type='RandAugment',
|
||||||
file_client_args=dict(
|
policies='timm_increasing',
|
||||||
backend='memcached',
|
num_policies=2,
|
||||||
server_list_cfg=osp.join(memcached_root, 'server_list.conf'),
|
total_level=10,
|
||||||
client_cfg=osp.join(memcached_root, 'client.conf'))),
|
magnitude_level=9,
|
||||||
|
magnitude_std=0.5,
|
||||||
|
hparams=dict(
|
||||||
|
pad_val=[round(x) for x in bgr_mean], interpolation='bicubic')),
|
||||||
|
dict(
|
||||||
|
type='RandomErasing',
|
||||||
|
erase_prob=0.25,
|
||||||
|
mode='rand',
|
||||||
|
min_area_ratio=0.02,
|
||||||
|
max_area_ratio=1 / 3,
|
||||||
|
fill_color=bgr_mean,
|
||||||
|
fill_std=bgr_std),
|
||||||
|
...
|
||||||
]
|
]
|
||||||
```
|
```
|
||||||
|
|
||||||
More supported backends can be found in [mmcv.fileio.FileClient](https://github.com/open-mmlab/mmcv/blob/master/mmcv/fileio/file_client.py).
|
```{note}
|
||||||
|
Usually, the data augmentation part in the data pipeline handles only image-wise transforms, but not transforms
|
||||||
### Pre-processing
|
like image normalization or mixup/cutmix. This is because image normalization and mixup/cutmix can be done on batch data
|
||||||
|
for acceleration. To configure image normalization and mixup/cutmix, please use the [data preprocessor]
|
||||||
`Resize`
|
(mmcls.models.utils.data_preprocessor).
|
||||||
|
```
|
||||||
- add: scale, scale_idx, pad_shape, scale_factor, keep_ratio
|
|
||||||
- update: img, img_shape
|
|
||||||
|
|
||||||
`RandomFlip`
|
|
||||||
|
|
||||||
- add: flip, flip_direction
|
|
||||||
- update: img
|
|
||||||
|
|
||||||
`RandomCrop`
|
|
||||||
|
|
||||||
- update: img, pad_shape
|
|
||||||
|
|
||||||
`Normalize`
|
|
||||||
|
|
||||||
- add: img_norm_cfg
|
|
||||||
- update: img
|
|
||||||
|
|
||||||
### Formatting
|
### Formatting
|
||||||
|
|
||||||
`ToTensor`
|
Formatting collects the training data from the data information dict and converts it to a
|
||||||
|
model-friendly format.
|
||||||
|
|
||||||
- update: specified by `keys`.
|
In most cases, you can simply use [`PackClsInputs`](mmcls.datasets.transforms.PackClsInputs), and it will
|
||||||
|
convert the image from a NumPy array to a PyTorch tensor, and pack the ground-truth category information and
|
||||||
|
other meta information as a [`ClsDataSample`](mmcls.structures.ClsDataSample).
|
||||||
|
|
||||||
`ImageToTensor`
|
```python
|
||||||
|
train_pipeline = [
|
||||||
|
...
|
||||||
|
dict(type='PackClsInputs'),
|
||||||
|
]
|
||||||
|
```
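To make the formatting step more concrete, here is a minimal sketch of what a packing step does conceptually. It is not the `PackClsInputs` source: the function name `pack_inputs`, the `inputs`, `data_samples`, and `metainfo` keys, and the default `meta_keys` are assumptions for illustration, and nested lists stand in for NumPy arrays and PyTorch tensors.

```python
def pack_inputs(results, meta_keys=('img_path', 'img_shape')):
    # Hypothetical stand-in for a packing step.
    img = results['img']  # nested list in height x width x channel order
    height, width, channels = len(img), len(img[0]), len(img[0][0])
    # Reorder to channel x height x width, the layout PyTorch models usually expect.
    chw = [[[img[y][x][c] for x in range(width)] for y in range(height)]
           for c in range(channels)]
    # Bundle the label and the selected meta information together.
    data_sample = {
        'gt_label': results['gt_label'],
        'metainfo': {k: results[k] for k in meta_keys if k in results},
    }
    return {'inputs': chw, 'data_samples': data_sample}


packed = pack_inputs({
    'img': [[[1, 2, 3], [4, 5, 6]]],  # 1 x 2 x 3 (H x W x C)
    'img_shape': (1, 2),
    'gt_label': 7,
})
```

Separating the model input from the label and meta information keeps everything after the pipeline independent of which intermediate keys the transforms produced.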
|
||||||
|
|
||||||
- update: specified by `keys`.
|
## Add new data transforms
|
||||||
|
|
||||||
`Collect`
|
1. Write a new data transform in any file, e.g., `my_transform.py`, and place it in
|
||||||
|
the folder `mmcls/datasets/transforms/`. The data transform class needs to inherit
|
||||||
- remove: all other keys except for those specified by `keys`
|
the [`mmcv.transforms.BaseTransform`](mmcv.transforms.BaseTransform) class and override
|
||||||
|
the `transform` method, which takes a dict as input and returns a dict.
|
||||||
For more information about other data transformation classes, please refer to [Data Transforms](mmcls.datasets.transforms)
|
|
||||||
|
|
||||||
## Extend and use custom pipelines
|
|
||||||
|
|
||||||
1. Write a new pipeline in any file, e.g., `my_pipeline.py`, and place it in
|
|
||||||
the folder `mmcls/datasets/pipelines/`. The pipeline class needs to override
|
|
||||||
the `__call__` method which takes a dict as input and returns a dict.
|
|
||||||
|
|
||||||
```python
|
```python
|
||||||
from mmcls.datasets import PIPELINES
|
from mmcv.transforms import BaseTransform
|
||||||
|
from mmcls.datasets import TRANSFORMS
|
||||||
|
|
||||||
@PIPELINES.register_module()
|
@TRANSFORMS.register_module()
|
||||||
class MyTransform(object):
|
class MyTransform(BaseTransform):
|
||||||
|
|
||||||
def __call__(self, results):
|
def transform(self, results):
|
||||||
# apply transforms on results['img']
|
# Modify the data information dict `results`.
|
||||||
return results
|
return results
|
||||||
```
|
```
|
||||||
|
|
||||||
2. Import the new class in `mmcls/datasets/pipelines/__init__.py`.
|
2. Import the new class in `mmcls/datasets/transforms/__init__.py`.
|
||||||
|
|
||||||
```python
|
```python
|
||||||
...
|
...
|
||||||
from .my_pipeline import MyTransform
|
from .my_transform import MyTransform
|
||||||
|
|
||||||
__all__ = [
|
__all__ = [
|
||||||
..., 'MyTransform'
|
..., 'MyTransform'
|
||||||
|
@ -131,17 +159,10 @@ For more information about other data transformation classes, please refer to [D
|
||||||
3. Use it in config files.
|
3. Use it in config files.
|
||||||
|
|
||||||
```python
|
```python
|
||||||
img_norm_cfg = dict(
|
|
||||||
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
|
|
||||||
train_pipeline = [
|
train_pipeline = [
|
||||||
dict(type='LoadImageFromFile'),
|
...
|
||||||
dict(type='RandomResizedCrop', size=224),
|
|
||||||
dict(type='RandomFlip', flip_prob=0.5, direction='horizontal'),
|
|
||||||
dict(type='MyTransform'),
|
dict(type='MyTransform'),
|
||||||
dict(type='Normalize', **img_norm_cfg),
|
...
|
||||||
dict(type='ImageToTensor', keys=['img']),
|
|
||||||
dict(type='ToTensor', keys=['gt_label']),
|
|
||||||
dict(type='Collect', keys=['img', 'gt_label'])
|
|
||||||
]
|
]
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|
|
@ -1,148 +1,4 @@
|
||||||
# Customize Data Pipeline (to be updated)
|
# Customize Data Pipeline (to be updated)
|
||||||
|
|
||||||
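## Design of data pipelines
|
Please refer to the [English documentation](https://mmclassification.readthedocs.io/en/dev-1.x/advanced_guides/pipeline.html). If you are
|
||||||
|
interested in helping translate the Chinese documentation, you are welcome to sign up in the [discussion board](https://github.com/open-mmlab/mmclassification/discussions/1027).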
|
||||||
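Following typical conventions, we use `Dataset` and `DataLoader` to load data with multiple
|
|
||||||
workers. Indexing a `Dataset` returns a dict whose items correspond to the arguments of the model's `forward` method.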
|
|
||||||
|
|
||||||
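The data pipeline and the dataset are decoupled here. Usually, the dataset defines how to process the annotation files, while the data
|
|
||||||
pipeline defines all the steps to prepare a data dict. A pipeline consists of a sequence of operations. Each operation takes a dict as
|
|
||||||
input and outputs a dict.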
|
|
||||||
|
|
||||||
The operations are categorized into data loading, pre-processing, and formatting.
|
|
||||||
|
|
||||||
Here we use the data pipeline of ResNet-50 on the ImageNet dataset as an example.
|
|
||||||
|
|
||||||
```python
|
|
||||||
img_norm_cfg = dict(
|
|
||||||
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
|
|
||||||
train_pipeline = [
|
|
||||||
dict(type='LoadImageFromFile'),
|
|
||||||
dict(type='RandomResizedCrop', size=224),
|
|
||||||
dict(type='RandomFlip', flip_prob=0.5, direction='horizontal'),
|
|
||||||
dict(type='Normalize', **img_norm_cfg),
|
|
||||||
dict(type='ImageToTensor', keys=['img']),
|
|
||||||
dict(type='ToTensor', keys=['gt_label']),
|
|
||||||
dict(type='Collect', keys=['img', 'gt_label'])
|
|
||||||
]
|
|
||||||
test_pipeline = [
|
|
||||||
dict(type='LoadImageFromFile'),
|
|
||||||
dict(type='Resize', scale=256),
|
|
||||||
dict(type='CenterCrop', crop_size=224),
|
|
||||||
dict(type='Normalize', **img_norm_cfg),
|
|
||||||
dict(type='ImageToTensor', keys=['img']),
|
|
||||||
dict(type='Collect', keys=['img'])
|
|
||||||
]
|
|
||||||
```
|
|
||||||
|
|
||||||
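For each operation, we list the related dict fields that are added, updated, or removed. At the end of the pipeline, we
|
|
||||||
use `Collect` to retain only the items needed by the model's `forward` method.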
|
|
||||||
|
|
||||||
### Data loading
|
|
||||||
|
|
||||||
`LoadImageFromFile` - load an image from a file
|
|
||||||
|
|
||||||
- add: img, img_shape, ori_shape
|
|
||||||
|
|
||||||
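By default, `LoadImageFromFile` loads images directly from disk, but for efficient,
|
|
||||||
small models this may cause an IO bottleneck. MMCV supports several data loading backends to accelerate this process. For
|
|
||||||
example, if [memcached](https://memcached.org/) is set up on the training machines, we can revise
|
|
||||||
the config file as follows.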
|
|
||||||
|
|
||||||
```
|
|
||||||
memcached_root = '/mnt/xxx/memcached_client/'
|
|
||||||
train_pipeline = [
|
|
||||||
dict(
|
|
||||||
type='LoadImageFromFile',
|
|
||||||
file_client_args=dict(
|
|
||||||
backend='memcached',
|
|
||||||
server_list_cfg=osp.join(memcached_root, 'server_list.conf'),
|
|
||||||
client_cfg=osp.join(memcached_root, 'client.conf'))),
|
|
||||||
]
|
|
||||||
```
|
|
||||||
|
|
||||||
More supported data loading backends can be found in [mmcv.fileio.FileClient](https://github.com/open-mmlab/mmcv/blob/master/mmcv/fileio/file_client.py).
|
|
||||||
|
|
||||||
### Pre-processing
|
|
||||||
|
|
||||||
`Resize` - resize the image
|
|
||||||
|
|
||||||
- add: scale, scale_idx, pad_shape, scale_factor, keep_ratio
|
|
||||||
- update: img, img_shape
|
|
||||||
|
|
||||||
`RandomFlip` - randomly flip the image
|
|
||||||
|
|
||||||
- add: flip, flip_direction
|
|
||||||
- update: img
|
|
||||||
|
|
||||||
`RandomCrop` - randomly crop the image
|
|
||||||
|
|
||||||
- update: img, pad_shape
|
|
||||||
|
|
||||||
`Normalize` - normalize the image data
|
|
||||||
|
|
||||||
- add: img_norm_cfg
|
|
||||||
- update: img
|
|
||||||
|
|
||||||
### Formatting
|
|
||||||
|
|
||||||
`ToTensor` - convert (label) data to `torch.Tensor`
|
|
||||||
|
|
||||||
- update: specified by `keys`.
|
|
||||||
|
|
||||||
`ImageToTensor` - convert image data to `torch.Tensor`
|
|
||||||
|
|
||||||
- update: specified by `keys`.
|
|
||||||
|
|
||||||
`Collect` - keep only the specified keys
|
|
||||||
|
|
||||||
- remove: all key-value pairs except those specified by `keys`
|
|
||||||
|
|
||||||
## Extend and use custom pipelines
|
|
||||||
|
|
||||||
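1. Write a new data processing operation and place it in any file under the `mmcls/datasets/pipelines/`
|
|
||||||
directory, e.g., `my_pipeline.py`. The class needs to override the `__call__` method, which takes a
|
|
||||||
dict as input and returns a dict.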
|
|
||||||
|
|
||||||
```python
|
|
||||||
from mmcls.datasets import PIPELINES
|
|
||||||
|
|
||||||
@PIPELINES.register_module()
|
|
||||||
class MyTransform(object):
|
|
||||||
|
|
||||||
def __call__(self, results):
|
|
||||||
# apply transforms on results['img']
|
|
||||||
return results
|
|
||||||
```
|
|
||||||
|
|
||||||
2. Import the new class in `mmcls/datasets/pipelines/__init__.py`.
|
|
||||||
|
|
||||||
```python
|
|
||||||
...
|
|
||||||
from .my_pipeline import MyTransform
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
..., 'MyTransform'
|
|
||||||
]
|
|
||||||
```
|
|
||||||
|
|
||||||
3. Add this operation to the data pipeline config.
|
|
||||||
|
|
||||||
```python
|
|
||||||
img_norm_cfg = dict(
|
|
||||||
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
|
|
||||||
train_pipeline = [
|
|
||||||
dict(type='LoadImageFromFile'),
|
|
||||||
dict(type='RandomResizedCrop', size=224),
|
|
||||||
dict(type='RandomFlip', flip_prob=0.5, direction='horizontal'),
|
|
||||||
dict(type='MyTransform'),
|
|
||||||
dict(type='Normalize', **img_norm_cfg),
|
|
||||||
dict(type='ImageToTensor', keys=['img']),
|
|
||||||
dict(type='ToTensor', keys=['gt_label']),
|
|
||||||
dict(type='Collect', keys=['img', 'gt_label'])
|
|
||||||
]
|
|
||||||
```
|
|
||||||
|
|
||||||
## Pipeline visualization
|
|
||||||
|
|
||||||
After designing the data pipeline, you can use the [visualization tools](../user_guides/visualization.md) to check the actual effect.
|
|
||||||
|
|