# Tutorial 4: Custom Data Pipelines
## Design of Data Pipelines

Following typical conventions, we use `Dataset` and `DataLoader` for data loading
with multiple workers. Indexing a `Dataset` returns a dict of data items corresponding to
the arguments of the model's forward method.

The data preparation pipeline and the dataset are decoupled. Usually, the dataset
defines how to process the annotations, while the data pipeline defines all the steps to prepare a data dict.
A pipeline consists of a sequence of operations. Each operation takes a dict as input and outputs a dict for the next transform.
The operations are categorized into data loading, pre-processing, and formatting.
Here is a pipeline example for ResNet-50 training on ImageNet.
```python
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='RandomResizedCrop', size=224),
    dict(type='RandomFlip', flip_prob=0.5, direction='horizontal'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='ImageToTensor', keys=['img']),
    dict(type='ToTensor', keys=['gt_label']),
    dict(type='Collect', keys=['img', 'gt_label'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='Resize', size=256),
    dict(type='CenterCrop', crop_size=224),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='ImageToTensor', keys=['img']),
    dict(type='Collect', keys=['img'])
]
```
For each operation, we list the related dict fields that are added/updated/removed.
At the end of the pipeline, we use `Collect` to only retain the necessary items for forward computation.
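The dict-in / dict-out contract above can be illustrated with a minimal, hypothetical `Compose`-style runner. The transforms here are toy stand-ins, not the actual mmcls implementations:

```python
class Compose:
    """Minimal sketch of a pipeline runner: applies transforms in order.

    Each transform takes the results dict and returns an (updated) dict.
    """

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, results):
        for transform in self.transforms:
            results = transform(results)
            if results is None:  # a transform may drop a sample
                return None
        return results


# Toy transforms following the dict-in / dict-out convention.
def load(results):
    results['img'] = 'decoded-pixels'  # placeholder for the image array
    return results

def flip(results):
    results['flip'] = True
    return results


pipeline = Compose([load, flip])
print(pipeline({'filename': 'a.jpg'}))
# {'filename': 'a.jpg', 'img': 'decoded-pixels', 'flip': True}
```

The real pipeline works the same way, only with registered transform classes built from the config dicts.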
### Data loading
`LoadImageFromFile`

- add: img, img_shape, ori_shape

By default, `LoadImageFromFile` loads images from disk, but this can become an IO bottleneck for small, efficient models whose computation is fast relative to data loading.
mmcv supports various backends to accelerate this process. For example, if the training machines have set up
[memcached](https://memcached.org/), we can revise the config as follows.
```python
import os.path as osp

memcached_root = '/mnt/xxx/memcached_client/'
train_pipeline = [
    dict(
        type='LoadImageFromFile',
        file_client_args=dict(
            backend='memcached',
            server_list_cfg=osp.join(memcached_root, 'server_list.conf'),
            client_cfg=osp.join(memcached_root, 'client.conf'))),
]
```
More supported backends can be found in [mmcv.fileio.FileClient](https://github.com/open-mmlab/mmcv/blob/master/mmcv/fileio/file_client.py).
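Conceptually, every backend exposes the same bytes-loading interface, so swapping backends does not change the rest of the pipeline. A self-contained sketch of the idea, using made-up class and function names rather than mmcv's actual API:

```python
import tempfile

class HardDiskBackend:
    """Reads raw bytes from the local filesystem."""

    def get(self, filepath):
        with open(filepath, 'rb') as f:
            return f.read()

# Other backends (memcached, ceph, http, ...) would implement the same
# `get(filepath) -> bytes` interface.
BACKENDS = {'disk': HardDiskBackend}

def build_file_client(backend='disk', **kwargs):
    return BACKENDS[backend](**kwargs)

# Demo: write some bytes, then read them back through the backend.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b'\x89PNG fake image bytes')
    path = f.name

client = build_file_client('disk')
assert client.get(path) == b'\x89PNG fake image bytes'
```

The loading transform only ever calls `get`, so a config switch is enough to change where the bytes come from.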
### Pre-processing

`Resize`

- add: scale, scale_idx, pad_shape, scale_factor, keep_ratio
- update: img, img_shape

`RandomFlip`

- add: flip, flip_direction
- update: img

`RandomCrop`

- update: img, pad_shape

`Normalize`

- add: img_norm_cfg
- update: img
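The arithmetic behind `Normalize` is a per-channel standardization using the values in `img_norm_cfg` (which are on the 0-255 scale); a minimal NumPy sketch:

```python
import numpy as np

mean = np.array([123.675, 116.28, 103.53])
std = np.array([58.395, 57.12, 57.375])

img = np.full((224, 224, 3), 128.0)  # toy RGB image in HWC layout
out = (img - mean) / std             # broadcasts over the channel axis

print(out.shape, out[0, 0])
```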
### Formatting

`ToTensor`

- update: specified by `keys`.

`ImageToTensor`

- update: specified by `keys`.

`Collect`

- remove: all keys except those specified by `keys`
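The key-filtering behavior of `Collect` boils down to a dict comprehension (the real transform additionally gathers meta information, which this sketch omits):

```python
def collect(results, keys):
    # Keep only the entries the model's forward method needs.
    return {k: results[k] for k in keys}

data = {'img': 'img-tensor', 'gt_label': 3, 'filename': 'a.jpg', 'flip': True}
print(collect(data, keys=['img', 'gt_label']))
# {'img': 'img-tensor', 'gt_label': 3}
```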
## Extend and use custom pipelines

1. Write a new pipeline in any file, e.g., `my_pipeline.py`, and place it in
   the folder `mmcls/datasets/pipelines/`. The pipeline class needs to implement
   the `__call__` method, which takes a dict as input and returns a dict.
```python
from mmcls.datasets import PIPELINES


@PIPELINES.register_module()
class MyTransform(object):

    def __call__(self, results):
        # apply transforms on results['img']
        return results
```
2. Import the new class in `mmcls/datasets/pipelines/__init__.py`.

```python
...
from .my_pipeline import MyTransform

__all__ = [
    ..., 'MyTransform'
]
```
3. Use it in config files.

```python
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='RandomResizedCrop', size=224),
    dict(type='RandomFlip', flip_prob=0.5, direction='horizontal'),
    dict(type='MyTransform'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='ImageToTensor', keys=['img']),
    dict(type='ToTensor', keys=['gt_label']),
    dict(type='Collect', keys=['img', 'gt_label'])
]
```
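To see how such a config list turns into callable transforms, here is a self-contained sketch in which a plain dict stands in for the real `PIPELINES` registry (all names here are illustrative only):

```python
# Toy registry standing in for mmcls' PIPELINES.
REGISTRY = {}

def register_module(cls):
    REGISTRY[cls.__name__] = cls
    return cls

@register_module
class MyTransform:

    def __call__(self, results):
        results['my_transform_applied'] = True
        return results

def build_pipeline(cfgs):
    """Instantiate each dict(type=..., **kwargs) config into a transform."""
    transforms = []
    for cfg in cfgs:
        cfg = dict(cfg)  # copy so we can pop safely
        cls = REGISTRY[cfg.pop('type')]
        transforms.append(cls(**cfg))
    return transforms

pipeline = build_pipeline([dict(type='MyTransform')])
results = {'img': 'raw'}
for transform in pipeline:
    results = transform(results)
print(results)
# {'img': 'raw', 'my_transform_applied': True}
```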
## Pipeline visualization

After designing data pipelines, you can use the [visualization tools](../tools/visualization.md) to inspect the transformed results.