# Data Flow

- [Data Flow](#data-flow)
  - [Data flow between dataloader and model](#data-flow-between-dataloader-and-model)
    - [Data from dataset](#data-from-dataset)
    - [Data from dataloader](#data-from-dataloader)
    - [Data from data preprocessor](#data-from-data-preprocessor)

Data flow defines how data should be passed between two isolated modules, e.g. dataloader and model, as shown below.

<div align="left">
<img src="https://user-images.githubusercontent.com/30762564/185855134-89f5be9e-39ca-4da4-bd87-7cf26e80ab2f.png" width="70%"/>
</div>

In MMSelfSup, we mainly focus on the data flow between dataloader and model, and between model and visualizer. For the
data flow between model and metric, please refer to the docs in other repos, e.g. [MMClassification](https://github.com/open-mmlab/mmclassification).
For the data flow between model and visualizer, please refer to [visualization](../user_guides/visualization.md).

## Data flow between dataloader and model

The data flow between dataloader and model can generally be split into three parts: i) `PackSelfSupInputs` packs the
data from previous transformations into a dictionary; ii) `collate_fn` stacks a list of tensors into a batched tensor;
iii) the data preprocessor moves all the data to the target device, e.g. GPUs, and unpacks the dictionary from the dataloader
into a tuple containing the input images and meta info (`SelfSupDataSample`).

### Data from dataset

In MMSelfSup, before being fed into the model, data goes through a series of transformations, called the `pipeline`, e.g. `RandomResizedCrop` and `ColorJitter`. No matter how many transformations the pipeline contains, the last one is always `PackSelfSupInputs`, which
packs the data from the previous transformations into a dictionary. The dictionary contains two parts, namely `inputs` and `data_samples`.

```python
# We omit some unimportant code here

class PackSelfSupInputs(BaseTransform):

    def transform(self,
                  results: Dict) -> Dict[torch.Tensor, SelfSupDataSample]:

        packed_results = dict()
        if self.key in results:
            ...
            packed_results['inputs'] = img

        ...
        packed_results['data_samples'] = data_sample

        return packed_results
```

Note: `inputs` contains a list of images, e.g. the multi-views in contrastive learning. Even if there is only a single view,
`PackSelfSupInputs` will still put it into a list.

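The packing behaviour described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the MMSelfSup implementation: `ToyDataSample`, `pack_inputs`, and the `meta_keys` default are hypothetical names, and strings stand in for image tensors.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Sequence


@dataclass
class ToyDataSample:
    """Hypothetical stand-in for SelfSupDataSample; it only holds meta info."""
    metainfo: Dict[str, Any] = field(default_factory=dict)


def pack_inputs(results: Dict[str, Any],
                key: str = 'img',
                meta_keys: Sequence[str] = ('img_path',)) -> Dict[str, Any]:
    """Pack pipeline results into a dict with `inputs` and `data_samples`."""
    packed: Dict[str, Any] = {}
    if key in results:
        img = results[key]
        # A single view is wrapped in a list, so downstream code can always
        # iterate over a list of views.
        if not isinstance(img, list):
            img = [img]
        packed['inputs'] = img
    # Keep only the requested meta info in the data sample.
    meta = {k: results[k] for k in meta_keys if k in results}
    packed['data_samples'] = ToyDataSample(metainfo=meta)
    return packed


# Strings are placeholders for image tensors (an assumption of this sketch).
single = pack_inputs({'img': 'view_0', 'img_path': 'a.jpg'})
multi = pack_inputs({'img': ['view_0', 'view_1'], 'img_path': 'b.jpg'})
print(single['inputs'])      # ['view_0']
print(len(multi['inputs']))  # 2
```

In both cases `inputs` is a list, so later stages do not need to special-case single-view algorithms.
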
### Data from dataloader

After receiving a list of dictionaries from the dataset, `collate_fn` in the dataloader gathers the `inputs` in each dict
and stacks them into a batched tensor. In addition, the `data_samples` in each dict are collected into a list.
It then outputs a dict with the same keys as the dicts in the received list. Finally, the dataloader
outputs the dict returned by `collate_fn`.

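As a rough illustration of this collation step, the sketch below groups view `i` of every sample together and gathers the data samples into a list. `toy_collate` is a hypothetical helper, not the collate function used by MMSelfSup, and nested lists of strings stand in for the tensors that a real collate function would stack along a new batch dimension.

```python
from typing import Any, Dict, List


def toy_collate(batch: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Turn a list of per-sample dicts into one batched dict."""
    num_views = len(batch[0]['inputs'])
    # "Stack" view i of every sample. With real tensors this would be a
    # stacking operation; here a plain list plays the role of the batch.
    inputs = [[sample['inputs'][i] for sample in batch]
              for i in range(num_views)]
    # Data samples are not stacked, only collected into a list.
    data_samples = [sample['data_samples'] for sample in batch]
    return {'inputs': inputs, 'data_samples': data_samples}


# Two samples with two views each; strings are placeholders for tensors.
batch = [
    {'inputs': ['a_v0', 'a_v1'], 'data_samples': 'meta_a'},
    {'inputs': ['b_v0', 'b_v1'], 'data_samples': 'meta_b'},
]
out = toy_collate(batch)
print(out['inputs'])        # [['a_v0', 'b_v0'], ['a_v1', 'b_v1']]
print(out['data_samples'])  # ['meta_a', 'meta_b']
```

The output keeps the keys of the per-sample dicts, and `inputs` holds one batched entry per view.
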
### Data from data preprocessor

The data preprocessor is the last step to process the data before it is fed into the model. It applies image normalization, converts BGR to RGB,
and moves all data to the target device, e.g. GPUs. After these steps, it outputs a tuple containing a list of batched images and a list
of data samples.

```python
class SelfSupDataPreprocessor(ImgDataPreprocessor):

    def forward(
            self,
            data: dict,
            training: bool = False
    ) -> Tuple[List[torch.Tensor], Optional[list]]:

        assert isinstance(data,
                          dict), 'Please use default_collate in dataloader, \
            instead of pseudo_collate.'

        data = [val for _, val in data.items()]
        batch_inputs, batch_data_samples = self.cast_data(data)
        # channel transform
        if self._channel_conversion:
            batch_inputs = [
                _input[:, [2, 1, 0], ...] for _input in batch_inputs
            ]

        # Convert to float after channel conversion to ensure
        # efficiency
        batch_inputs = [input_.float() for input_ in batch_inputs]

        # Normalization. Here is what is different from
        # :class:`mmengine.ImgDataPreprocessor`. Since there are multiple views
        # for an image for some algorithms, e.g. SimCLR, each item in inputs
        # is a list, containing multi-views for an image.
        if self._enable_normalize:
            batch_inputs = [(_input - self.mean) / self.std
                            for _input in batch_inputs]

        return batch_inputs, batch_data_samples
```

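
To make the channel conversion and normalization steps concrete, here is a minimal pure-Python sketch. It is not the library code: `toy_preprocess` is a hypothetical function, an image is modelled as a nested list indexed `[channel][pixel]` instead of a tensor, the mean and std values are made up, and the move to the target device is left out.

```python
from typing import List, Sequence, Tuple

Image = List[List[float]]  # [channel][pixel], toy stand-in for a CxHxW tensor


def toy_preprocess(data: dict,
                   mean: Sequence[float],
                   std: Sequence[float],
                   bgr_to_rgb: bool = True) -> Tuple[List[List[Image]], list]:
    """Flip channels, normalize per channel, and return (inputs, samples)."""
    assert isinstance(data, dict), 'expected the dict produced by collation'
    batch_inputs, batch_data_samples = data['inputs'], data['data_samples']

    processed = []
    for view in batch_inputs:      # one entry per view
        new_view = []
        for img in view:           # one entry per sample in the batch
            if bgr_to_rgb:
                img = img[::-1]    # reverse the channel order: BGR -> RGB
            # Per-channel normalization: (x - mean) / std.
            img = [[(float(p) - m) / s for p in channel]
                   for channel, m, s in zip(img, mean, std)]
            new_view.append(img)
        processed.append(new_view)
    return processed, batch_data_samples


# One view, one sample, one pixel per channel, stored in BGR order.
data = {'inputs': [[[[10.0], [20.0], [30.0]]]], 'data_samples': ['meta_a']}
inputs, samples = toy_preprocess(
    data, mean=[25.0, 15.0, 5.0], std=[5.0, 5.0, 5.0])
print(inputs)   # [[[[1.0], [1.0], [1.0]]]]
print(samples)  # ['meta_a']
```

As in the excerpt above, the result is a tuple of a list of per-view batches and the untouched list of data samples.
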