# Prepare Datasets

- [Prepare Cifar](#prepare-cifar)
- [Prepare Imagenet](#prepare-imagenet)
- [Prepare Imagenet-TFrecords](#prepare-imagenet-tfrecords)
- [Prepare COCO](#prepare-coco)
- [Prepare PAI-Itag detection](#prepare-pai-itag-detection)

## Prepare Cifar

Download the [cifar10](http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/cifar10/cifar-10-python.tar.gz) dataset and uncompress it to `data/cifar`. The directory structure is as follows:

```text
data/cifar
└── cifar-10-batches-py
    ├── batches.meta
    ├── data_batch_1
    ├── data_batch_2
    ├── data_batch_3
    ├── data_batch_4
    ├── data_batch_5
    ├── readme.html
    ├── read.py
    └── test_batch
```

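For reference, a minimal shell sketch of the download and extraction steps above (the URL and target directory are the ones from this section; adjust paths to your setup):

```shell
# Fetch the archive linked above and unpack it under data/cifar
mkdir -p data/cifar
wget http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/cifar10/cifar-10-python.tar.gz
tar -xzvf cifar-10-python.tar.gz -C data/cifar
```
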
## Prepare Imagenet

1. Go to the [download page](http://www.image-net.org/download-images), register an account, and log in.
2. Download the following files:

   - Training images (Task 1 & 2). 138GB.
   - Validation images (all tasks). 6.3GB.

3. Unzip the downloaded files.
4. Use this [script](https://github.com/BVLC/caffe/blob/master/data/ilsvrc12/get_ilsvrc_aux.sh) to get the data meta files (a shell sketch of steps 3 and 4 follows below).

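A hedged sketch of steps 3 and 4. The archive names and the extraction layout are assumptions based on the standard ILSVRC2012 release, not something this document specifies; adjust them to what you actually downloaded:

```shell
# Assumed ILSVRC2012 archive names; replace with your actual downloads
mkdir -p data/imagenet/train data/imagenet/val
tar -xvf ILSVRC2012_img_train.tar -C data/imagenet/train   # the training tar contains one tar per class
tar -xvf ILSVRC2012_img_val.tar -C data/imagenet/val
# Fetch the meta files (synset/label lists) with the Caffe helper script linked above;
# check where the script places the files after it runs
wget https://raw.githubusercontent.com/BVLC/caffe/master/data/ilsvrc12/get_ilsvrc_aux.sh
bash get_ilsvrc_aux.sh
```
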
## Prepare Imagenet-TFrecords

1. Go to the [download page](https://www.kaggle.com/hmendonca/imagenet-1k-tfrecords-ilsvrc2012-part-0), register an account, and log in.
2. The dataset is divided into two parts, [part0](https://www.kaggle.com/hmendonca/imagenet-1k-tfrecords-ilsvrc2012-part-0) (79GB) and [part1](https://www.kaggle.com/hmendonca/imagenet-1k-tfrecords-ilsvrc2012-part-1) (75GB); you need to download both of them (a CLI sketch follows below).

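If you prefer the command line, the same two parts can be fetched with the Kaggle CLI. This is an optional convenience, not part of the original instructions; it assumes the `kaggle` tool is installed and an API token is configured:

```shell
# Download both parts of the TFRecords dataset via the Kaggle CLI
kaggle datasets download -d hmendonca/imagenet-1k-tfrecords-ilsvrc2012-part-0
kaggle datasets download -d hmendonca/imagenet-1k-tfrecords-ilsvrc2012-part-1
```
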
## Prepare COCO

Download the [COCO2017](https://cocodataset.org/#download) dataset to `data/coco`. The directory structure is as follows:

```text
data/coco
├── annotations
├── train2017
└── val2017
```

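For reference, a shell sketch using the standard COCO2017 archive URLs (the archive names come from the official download page, not this document; verify them before use):

```shell
# Fetch and unpack the COCO2017 images and annotations under data/coco
mkdir -p data/coco
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
unzip -q train2017.zip -d data/coco
unzip -q val2017.zip -d data/coco
unzip -q annotations_trainval2017.zip -d data/coco
```
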
## Prepare PAI-Itag detection

Download the [SmallCOCO](http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/unittest/data/detection/small_coco_itag/small_coco_itag.tar.gz) dataset to `data/coco`. The directory structure is as follows:

```text
data/coco/
├── train2017
├── train2017_20_local.manifest
├── val2017
└── val2017_20_local.manifest
```

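A minimal download sketch (the URL is the one above; whether the archive unpacks directly into this layout or into a subdirectory is an assumption you should verify):

```shell
# Fetch the SmallCOCO (PAI-Itag) sample and unpack it under data/coco
mkdir -p data/coco
wget http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/unittest/data/detection/small_coco_itag/small_coco_itag.tar.gz
tar -xzvf small_coco_itag.tar.gz -C data/coco
```
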
Then replace the train_data and val_data manifest paths in the config file:

```shell
sed -i 's#train2017.manifest#train2017_20_local.manifest#g' configs/detection/yolox_coco_pai.py
sed -i 's#val2017.manifest#val2017_20_local.manifest#g' configs/detection/yolox_coco_pai.py
```