mirror of https://github.com/alibaba/EasyCV.git
72 lines
2.3 KiB
Markdown
72 lines
2.3 KiB
Markdown
# Prepare Datasets
|
||
|
||
- [Prepare Cifar](#Prepare Cifar)
|
||
- [Prepare Imagenet](#Prepare Imagenet)
|
||
- [Prepare Imagenet-TFrecords](#Prepare Imagenet-TFrecords)
|
||
- [Prepare COCO](#Prepare COCO)
|
||
- [Prepare PAI-Itag detection](#Prepare PAI-Itag detection)
|
||
|
||
## Prepare Cifar
|
||
|
||
Download dataset [cifar10](http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/cifar10/cifar-10-python.tar.gz) and uncompress files to `data/cifar`, directory structure is as follows:
|
||
|
||
```text
|
||
data/cifar
|
||
└── cifar-10-batches-py
|
||
├── batches.meta
|
||
├── data_batch_1
|
||
├── data_batch_2
|
||
├── data_batch_3
|
||
├── data_batch_4
|
||
├── data_batch_5
|
||
├── readme.html
|
||
├── read.py
|
||
└── test_batch
|
||
```
|
||
|
||
## Prepare Imagenet
|
||
|
||
1. Go to the [download-url](http://www.image-net.org/download-images), Register an account and log in .
|
||
2. Download the following files:
|
||
|
||
- Training images (Task 1 & 2). 138GB.
|
||
|
||
- Validation images (all tasks). 6.3GB.
|
||
3. Unzip the downloaded file.
|
||
4. Using this [scrip](https://github.com/BVLC/caffe/blob/master/data/ilsvrc12/get_ilsvrc_aux.sh) to get data meta.
|
||
|
||
## Prepare Imagenet-TFrecords
|
||
|
||
1. Go to the [download-url](https://www.kaggle.com/hmendonca/imagenet-1k-tfrecords-ilsvrc2012-part-0), Register an account and log in .
|
||
2. The dataset is divided into two parts, [part0](https://www.kaggle.com/hmendonca/imagenet-1k-tfrecords-ilsvrc2012-part-0) (79GB) and [part1](https://www.kaggle.com/hmendonca/imagenet-1k-tfrecords-ilsvrc2012-part-1) (75GB), you need download all of them.
|
||
|
||
## Prepare COCO
|
||
|
||
Download [COCO2017](https://cocodataset.org/#download) dataset to `data/coco`, directory structure is as follows
|
||
|
||
```text
|
||
data/coco
|
||
├── annotations
|
||
├── train2017
|
||
└── val2017
|
||
```
|
||
|
||
## Prepare PAI-Itag detection
|
||
|
||
Download [SmallCOCO](http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/unittest/data/detection/small_coco_itag/small_coco_itag.tar.gz) dataset to `data/coco`,
|
||
directory structure is as follows:
|
||
|
||
```text
|
||
data/coco/
|
||
├── train2017
|
||
├── train2017_20_local.manifest
|
||
├── val2017
|
||
└── val2017_20_local.manifest
|
||
```
|
||
|
||
replace train_data and val_data path in config file
|
||
```shell
|
||
sed -i 's#train2017.manifest#train2017_20_local.manifest#g' configs/detection/yolox_coco_pai.py
|
||
sed -i 's#val2017.manifest#val2017_20_local.manifest#g' configs/detection/yolox_coco_pai.py
|
||
```
|