# Migrate Data Transform to OpenMMLab 2.0

## Introduction

Following the data transform interface convention of TorchVision, all data transform classes need to implement the `__call__` method, and in the convention of OpenMMLab 1.0 we additionally required both the input and the output of `__call__` to be a dictionary.

In OpenMMLab 2.0, to make data transform classes more extensible, data transformation is implemented in a `transform` method instead of the `__call__` method, and all data transform classes should inherit the [`mmcv.transforms.BaseTransform`](mmcv.transforms.BaseTransform) class. You can still use these data transform classes by calling them, since the base class forwards `__call__` to `transform`. A tutorial on implementing a data transform class can be found in the [Data Transform](../advanced_tutorials/data_element.md) tutorial.

In addition, we moved some common data transform classes from the individual repositories to MMCV. In this document, we compare the functionalities, usages and implementations of the original data transform classes (in [MMClassification v0.23.2](https://github.com/open-mmlab/mmclassification/tree/v0.23.2) and [MMDetection v2.25.1](https://github.com/open-mmlab/mmdetection/tree/v2.25.1)) with the new data transform classes (in [MMCV v2.0.0rc1](https://github.com/open-mmlab/mmcv/tree/2.x)).
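As a minimal sketch of the new convention (the `MyFlip` transform below is hypothetical; only `BaseTransform` and the `TRANSFORMS` registry come from MMCV), a 2.0-style data transform looks roughly like this:

```python
import numpy as np
from mmcv.transforms import TRANSFORMS, BaseTransform


@TRANSFORMS.register_module()
class MyFlip(BaseTransform):
    """Hypothetical transform that flips the image horizontally."""

    def transform(self, results: dict) -> dict:
        # OpenMMLab 2.0: implement `transform` instead of `__call__`.
        results['img'] = np.flip(results['img'], axis=1)
        return results


# The class is still callable: `BaseTransform.__call__` delegates to `transform`.
flip = MyFlip()
results = flip(dict(img=np.random.rand(224, 224, 3)))
```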
## Functionality Differences

| Transform | MMClassification (original) | MMDetection (original) | MMCV (new) |
| --- | --- | --- | --- |
| `LoadImageFromFile` | Joins the 'img_prefix' and 'img_info.filename' fields to find the image path and loads the image. | Joins the 'img_prefix' and 'img_info.filename' fields to find the image path and loads the image. Supports specifying the channel order. | Loads the image from 'img_path'. Supports ignoring failed loading and specifying the decode backend. |
| `LoadAnnotations` | Not available. | Loads bbox, label, mask (including polygon masks) and semantic segmentation. Supports converting the bbox coordinate system. | Loads bbox, label, mask (excluding polygon masks) and semantic segmentation. |
| `Pad` | Pads all images in the "img_fields" field. | Pads all images in the "img_fields" field. Supports padding to a size that is an integer multiple of a given divisor. | Pads the image in the "img" field. Supports padding to a size that is an integer multiple of a given divisor. |
| `CenterCrop` | Crops all images in the "img_fields" field. Supports EfficientNet-style cropping. | Not available. | Crops the image in the "img" field, the bboxes in the "gt_bboxes" field, the semantic segmentation in the "gt_seg_map" field and the keypoints in the "gt_keypoints" field. Supports padding the margin of the cropped image. |
| `Normalize` | Normalizes the image. | No differences. | No differences, but we recommend using the data preprocessor to normalize the image. |
| `Resize` | Resizes all images in the "img_fields" field. Supports resizing proportionally according to the specified edge. | Use `Resize` with `ratio_range=None`, a single scale in `img_scale` and `multiscale_mode="value"`. | Resizes the image in the "img" field, the bboxes in the "gt_bboxes" field, the semantic segmentation in the "gt_seg_map" field and the keypoints in the "gt_keypoints" field. Supports specifying the ratio of the new scale to the original scale and supports resizing proportionally. |
| `RandomResize` | Not available. | Use `Resize` with `ratio_range=None`, two scales in `img_scale` and `multiscale_mode="range"`, or with `ratio_range` not `None`:<br>`Resize(img_scale=[(640, 480), (960, 720)], multiscale_mode="range")` | Has the same resize function as `Resize`. Supports sampling the scale from a scale range or a scale ratio range (see the pipeline sketch after this table):<br>`RandomResize(scale=[(640, 480), (960, 720)])` |
| `RandomChoiceResize` | Not available. | Use `Resize` with `ratio_range=None`, multiple scales in `img_scale` and `multiscale_mode="value"`:<br>`Resize(img_scale=[(640, 480), (960, 720)], multiscale_mode="value")` | Has the same resize function as `Resize`. Supports randomly choosing the scale from multiple scales or multiple scale ratios:<br>`RandomChoiceResize(scales=[(640, 480), (960, 720)])` |
| `RandomGrayscale` | Randomly grayscales all images in the "img_fields" field. Supports keeping the number of channels after grayscaling. | Not available. | Randomly grayscales the image in the "img" field. Supports specifying the weight of each channel, and supports keeping the number of channels after grayscaling. |
| `RandomFlip` | Randomly flips all images in the "img_fields" field. Supports horizontal and vertical flipping. | Randomly flips all values in the "img_fields", "bbox_fields", "mask_fields" and "seg_fields" fields. Supports horizontal, vertical and diagonal flipping, and supports specifying the probability of each kind of flipping. | Randomly flips the values in the "img", "gt_bboxes", "gt_seg_map" and "gt_keypoints" fields. Supports horizontal, vertical and diagonal flipping, and supports specifying the probability of each kind of flipping. |
| `MultiScaleFlipAug` | Not available. | Used for test-time augmentation. | Use `TestTimeAug` instead. |
| `ToTensor` | Converts the values in the specified fields to `torch.Tensor`. | No differences. | No differences. |
| `ImageToTensor` | Converts the values in the specified fields to `torch.Tensor` and transposes the channels to CHW. | No differences. | No differences. |
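To make the resize-related rows concrete, here is a hedged sketch of how a multi-scale training pipeline written for MMDetection v2.25.1 maps onto the new MMCV transforms. The surrounding pipeline entries are illustrative rather than a complete detection pipeline, and the parameter values are arbitrary examples:

```python
# OpenMMLab 1.0 style (MMDetection v2.25.1): multi-scale resize expressed via `Resize`.
old_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='Resize',
        img_scale=[(640, 480), (960, 720)],
        multiscale_mode='range',  # sample a scale between the two endpoints
        keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
]

# OpenMMLab 2.0 style (MMCV 2.x): the same behaviour expressed via `RandomResize`.
new_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='RandomResize',
        scale=[(640, 480), (960, 720)],  # sample a scale between the two endpoints
        keep_ratio=True),
    dict(type='RandomFlip', prob=0.5),
]
```

Likewise, an old `Resize` with several scales and `multiscale_mode="value"` maps to `RandomChoiceResize(scales=...)`.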