MMSelfSup supports multiple datasets. Please follow the corresponding guidelines for data preparation. It is recommended to symlink your dataset root to `$MMSELFSUP/data`. If your folder structure is different, you may need to change the corresponding paths in config files.
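The symlink layout described above can be sketched in a few lines of Python. This is a minimal illustration, not a script shipped with MMSelfSup: the helper name `link_dataset` and its arguments are hypothetical, and it assumes your platform allows creating symlinks.

```python
from pathlib import Path


def link_dataset(dataset_dir, repo_root, name):
    """Create `<repo_root>/data/<name>` as a symlink to `dataset_dir`.

    Hypothetical helper for illustration; it is not part of MMSelfSup.
    """
    data_dir = Path(repo_root) / "data"
    data_dir.mkdir(parents=True, exist_ok=True)  # create data/ if it is missing
    link = data_dir / name
    if link.is_symlink() or link.exists():
        # Leave any existing entry untouched instead of overwriting it.
        return link
    link.symlink_to(Path(dataset_dir).resolve(), target_is_directory=True)
    return link
```

For example, `link_dataset("/path/to/your/imagenet", "/path/to/mmselfsup", "imagenet")` (both paths are placeholders) would expose the dataset as `data/imagenet` inside the repository without copying any files.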
ImageNet has multiple versions, but the most commonly used one is [ILSVRC 2012](http://www.image-net.org/challenges/LSVRC/2012/). It can be accessed with the following steps:
1. Register an account and log in to the [download page](http://www.image-net.org/download-images)
2. Find download links for ILSVRC2012 and download the following two files
Assume that you usually store datasets in `$YOUR_DATA_ROOT`. The following command will automatically download PASCAL VOC 2007 into `$YOUR_DATA_ROOT`, prepare the required files, create a folder `data` under `$MMSELFSUP`, and make a symlink `VOCdevkit`.
`MMSelfSup` uses [`CIFAR10`](https://github.com/open-mmlab/mmclassification/blob/1.x/mmcls/datasets/cifar.py) implemented by `MMClassification`. `MMClassification` supports automatic download of the `CIFAR10` dataset: you only need to specify the download folder in the `data_root` field, and set `test_mode=False` or `test_mode=True` to use the training or test dataset, respectively. For more details, please refer to the [docs](https://github.com/open-mmlab/mmclassification/blob/1.x/docs/en/user_guides/dataset_prepare.md#cifar) in `MMClassification`.
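As a rough sketch, the two fields mentioned above might appear in a config like this. Only `data_root` and `test_mode` come from the text; the `mmcls.CIFAR10` type string, the `data/cifar10` folder, and the empty `pipeline` are assumptions for illustration, and the installed `MMClassification` version may require or rename other fields, so check the linked dataset class before relying on it.

```python
# Hypothetical download folder; point this wherever the data should live.
data_root = "data/cifar10"

# Assumption: the dataset is referenced with the `mmcls.` scope prefix.
dataset_type = "mmcls.CIFAR10"

train_dataset = dict(
    type=dataset_type,
    data_root=data_root,
    test_mode=False,  # use the training dataset
    pipeline=[],  # placeholder: fill in the transforms you need
)

test_dataset = dict(
    type=dataset_type,
    data_root=data_root,
    test_mode=True,  # use the test dataset
    pipeline=[],  # placeholder: fill in the transforms you need
)
```

Both entries point at the same folder; only `test_mode` decides which part of the dataset is loaded.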
To prepare COCO, VOC2007 and VOC2012 for detection, you can refer to [mmdetection](https://github.com/open-mmlab/mmdetection/blob/dev-3.x/docs/en/1_exist_data_model.md).
To prepare VOC2012AUG and Cityscapes for segmentation, you can refer to [mmsegmentation](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/user_guides/2_dataset_prepare.md#prepare-datasets).