# Migrate Data Transform to OpenMMLab 2.0

## Introduction

Following the data transform interface convention of TorchVision, all data transform classes need to implement the `__call__` method, and in the convention of OpenMMLab 1.0 we additionally required both the input and the output of `__call__` to be a dictionary.

In OpenMMLab 2.0, to make data transform classes more extensible, data transformation is implemented in a `transform` method instead of the `__call__` method, and all data transform classes should inherit the [`mmcv.transforms.BaseTransform`](mmcv.transforms.BaseTransform) class. You can still use these data transform classes by calling them, since the base class forwards `__call__` to `transform`. A tutorial on implementing a data transform class can be found in the [Data Transform](../advanced_tutorials/data_element.md) tutorial.

In addition, we moved some common data transform classes from the individual repositories to MMCV. In this document, we compare the functionalities, usages and implementations of the original data transform classes (in [MMClassification v0.23.2](https://github.com/open-mmlab/mmclassification/tree/v0.23.2) and [MMDetection v2.25.1](https://github.com/open-mmlab/mmdetection/tree/v2.25.1)) with the new data transform classes (in [MMCV v2.0.0rc1](https://github.com/open-mmlab/mmcv/tree/2.x)).
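As a minimal sketch of the new convention (the `MyFlip` transform below is hypothetical; only `BaseTransform` and the `TRANSFORMS` registry come from MMCV), a 2.0-style data transform looks roughly like this:

```python
import numpy as np
from mmcv.transforms import TRANSFORMS, BaseTransform


@TRANSFORMS.register_module()
class MyFlip(BaseTransform):
    """Hypothetical transform that flips the image horizontally."""

    def transform(self, results: dict) -> dict:
        # OpenMMLab 2.0: implement `transform` instead of `__call__`.
        results['img'] = np.flip(results['img'], axis=1)
        return results


# The class is still callable: `BaseTransform.__call__` delegates to `transform`.
flip = MyFlip()
results = flip(dict(img=np.random.rand(224, 224, 3)))
```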
## Functionality Differences

| Transform | MMClassification (original) | MMDetection (original) | MMCV (new) |
| --- | --- | --- | --- |
| `LoadImageFromFile` | Joins the 'img_prefix' and 'img_info.filename' fields to find the image path and loads the image. | Joins the 'img_prefix' and 'img_info.filename' fields to find the image path and loads the image. Supports specifying the channel order. | Loads the image from 'img_path'. Supports ignoring failed loading and specifying the decode backend. |
| `LoadAnnotations` | Not available. | Loads bbox, label, mask (including polygon masks) and semantic segmentation. Supports converting the bbox coordinate system. | Loads bbox, label, mask (excluding polygon masks) and semantic segmentation. |
| `Pad` | Pads all images in the "img_fields" field. | Pads all images in the "img_fields" field. Supports padding to a size that is an integer multiple of a given divisor. | Pads the image in the "img" field. Supports padding to a size that is an integer multiple of a given divisor. |
| `CenterCrop` | Crops all images in the "img_fields" field. Supports EfficientNet-style cropping. | Not available. | Crops the image in the "img" field, the bboxes in the "gt_bboxes" field, the semantic segmentation in the "gt_seg_map" field and the keypoints in the "gt_keypoints" field. Supports padding the margin of the cropped image. |
| `Normalize` | Normalizes the image. | No differences. | No differences, but we recommend using the data preprocessor to normalize the image. |
| `Resize` | Resizes all images in the "img_fields" field. Supports resizing proportionally according to the specified edge. | Use `Resize` with `ratio_range=None`, a single scale in `img_scale` and `multiscale_mode="value"`. | Resizes the image in the "img" field, the bboxes in the "gt_bboxes" field, the semantic segmentation in the "gt_seg_map" field and the keypoints in the "gt_keypoints" field. Supports specifying the ratio of the new scale to the original scale and supports resizing proportionally. |
| `RandomResize` | Not available. | Use `Resize` with `ratio_range=None`, two scales in `img_scale` and `multiscale_mode="range"`, or with `ratio_range` not `None`:<br>`Resize(img_scale=[(640, 480), (960, 720)], multiscale_mode="range")` | Has the same resize function as `Resize`. Supports sampling the scale from a scale range or a scale ratio range (see the pipeline sketch after this table):<br>`RandomResize(scale=[(640, 480), (960, 720)])` |
| `RandomChoiceResize` | Not available. | Use `Resize` with `ratio_range=None`, multiple scales in `img_scale` and `multiscale_mode="value"`:<br>`Resize(img_scale=[(640, 480), (960, 720)], multiscale_mode="value")` | Has the same resize function as `Resize`. Supports randomly choosing the scale from multiple scales or multiple scale ratios:<br>`RandomChoiceResize(scales=[(640, 480), (960, 720)])` |
| `RandomGrayscale` | Randomly grayscales all images in the "img_fields" field. Supports keeping the number of channels after grayscaling. | Not available. | Randomly grayscales the image in the "img" field. Supports specifying the weight of each channel, and supports keeping the number of channels after grayscaling. |
| `RandomFlip` | Randomly flips all images in the "img_fields" field. Supports horizontal and vertical flipping. | Randomly flips all values in the "img_fields", "bbox_fields", "mask_fields" and "seg_fields" fields. Supports horizontal, vertical and diagonal flipping, and supports specifying the probability of each kind of flipping. | Randomly flips the values in the "img", "gt_bboxes", "gt_seg_map" and "gt_keypoints" fields. Supports horizontal, vertical and diagonal flipping, and supports specifying the probability of each kind of flipping. |
| `MultiScaleFlipAug` | Not available. | Used for test-time augmentation. | Use `TestTimeAug` instead. |
| `ToTensor` | Converts the values in the specified fields to `torch.Tensor`. | No differences. | No differences. |
| `ImageToTensor` | Converts the values in the specified fields to `torch.Tensor` and transposes the channels to CHW. | No differences. | No differences. |
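To make the resize-related rows concrete, here is a hedged sketch of how a multi-scale training pipeline written for MMDetection v2.25.1 maps onto the new MMCV transforms. The surrounding pipeline entries are illustrative rather than a complete detection pipeline, and the parameter values are arbitrary examples:

```python
# OpenMMLab 1.0 style (MMDetection v2.25.1): multi-scale resize expressed via `Resize`.
old_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='Resize',
        img_scale=[(640, 480), (960, 720)],
        multiscale_mode='range',  # sample a scale between the two endpoints
        keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
]

# OpenMMLab 2.0 style (MMCV 2.x): the same behaviour expressed via `RandomResize`.
new_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='RandomResize',
        scale=[(640, 480), (960, 720)],  # sample a scale between the two endpoints
        keep_ratio=True),
    dict(type='RandomFlip', prob=0.5),
]
```

Likewise, an old `Resize` with several scales and `multiscale_mode="value"` maps to `RandomChoiceResize(scales=...)`.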