Before using a dataset, please read the [LICENSE](docs/source/LICENSE) file to learn the different data licenses.

## Self-Supervised Learning
| Name | Field | Description | Download | Dataset API support | Mode of use | License |
| ---- | ----- | ----------- | -------- | ------------------- | ----------- | ------- |
| **ImageNet 1k**<br/>[url](https://image-net.org/download.php) | Common | ImageNet is an image database organized according to the [WordNet](http://wordnet.princeton.edu/) hierarchy (currently only the nouns). It is used in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) and is a benchmark for image classification. | [Baidu Netdisk (提取码:0zas)](https://pan.baidu.com/s/13pKw0bJbr-jbymQMd_YXzA)<br/>refer to [prepare_data.md](https://github.com/alibaba/EasyCV/blob/master/docs/source/prepare_data.md) | <font color=green size=5>✓</font> | ```data_source=dict(type='ClsSourceImageNet1k', root='{root path}', split='train')``` | [LICENSE](https://github.com/alibaba/EasyCV/blob/master/docs/source/LICENSE#L1) |
| **Imagenet-1k TFrecords**<br/>[url](https://www.kaggle.com/hmendonca/imagenet-1k-tfrecords-ilsvrc2012-part-0) | Common | The original ImageNet raw images packed in TFRecord format. | [Baidu Netdisk (提取码:5zdc)](https://pan.baidu.com/s/153SY2dp02vEY9K6-O5U1UA)<br/>refer to [prepare_data.md](https://github.com/alibaba/EasyCV/blob/master/docs/source/prepare_data.md) | <font color=green size=5>✓</font> | ```data_source=dict(type='ClsSourceImageNetTFRecord', root='{root path}', list_file='{annotation file path}')``` | [LICENSE](https://github.com/alibaba/EasyCV/blob/master/docs/source/LICENSE#L1) |
| **ImageNet 21k**<br/>[url](https://image-net.org/download.php) | Common | The ImageNet-21K dataset is bigger and more diverse than ImageNet-1K, but it is used less frequently for pretraining, mainly due to its complexity, low accessibility, and the underestimation of its added value. | [Baidu Netdisk (提取码:kaeg)](https://pan.baidu.com/s/1eJVPCfS814cDCt3-lVHgmA)<br/>refer to [Alibaba-MIIL/ImageNet21K](https://github.com/Alibaba-MIIL/ImageNet21K/blob/main/dataset_preprocessing/processing_instructions.md) | | | [LICENSE](https://github.com/alibaba/EasyCV/blob/master/docs/source/LICENSE#L1) |
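
Each "Mode of use" cell above is a `data_source` config fragment. For orientation, below is a minimal sketch of how such a fragment slots into a larger EasyCV-style dataset config; the dataset wrapper type and the pipeline transforms are illustrative assumptions, and only the `data_source` dict is taken from the table.

```python
# Sketch of a training-data config built around a data_source from the
# table. 'ClsDataset' and the pipeline entries are assumed names used for
# illustration; check the EasyCV configs for the exact components.
data = dict(
    imgs_per_gpu=32,    # batch size per GPU
    workers_per_gpu=4,  # dataloader workers per GPU
    train=dict(
        type='ClsDataset',
        data_source=dict(
            type='ClsSourceImageNet1k',
            root='data/imagenet/',  # your '{root path}'
            split='train'),
        pipeline=[
            dict(type='RandomResizedCrop', size=224),
            dict(type='RandomHorizontalFlip'),
            dict(type='ToTensor'),
        ]))
```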

## Classification data

| Name | Field | Description | Download | Dataset API support | Mode of use | License |
| ---- | ----- | ----------- | -------- | ------------------- | ----------- | ------- |
| **Cifar10**<br/>[url](https://www.cs.toronto.edu/~kriz/cifar.html) | Common | CIFAR-10 is a labeled subset of the [80 million tiny images](http://people.csail.mit.edu/torralba/tinyimages/) dataset. It consists of 60,000 32x32 colour images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images. | [cifar-10-python.tar.gz](https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz) (163MB) | <font color=green size=5>✓</font> | <code>data_source=dict(<br/>type='ClsSourceCifar10', <br/>root='{root path}', <br/>download=True, <br/>split='train')</code> | |
| **Cifar100**<br/>[url](https://www.cs.toronto.edu/~kriz/cifar.html) | Common | CIFAR-100 is a labeled subset of the [80 million tiny images](http://people.csail.mit.edu/torralba/tinyimages/) dataset. It is just like CIFAR-10, except it has 100 classes containing 600 images each. There are 500 training images and 100 testing images per class. | [cifar-100-python.tar.gz](https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz) (161MB) | <font color=green size=5>✓</font> | <code>data_source=dict(<br/>type='ClsSourceCifar100', <br/>root='{root path}', <br/>download=True, <br/>split='train')</code> | |
| **ImageNet 1k**<br/>[url](https://image-net.org/download.php) | Common | ImageNet is an image database organized according to the [WordNet](http://wordnet.princeton.edu/) hierarchy (currently only the nouns). It is used in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) and is a benchmark for image classification. | [Baidu Netdisk (提取码:0zas)](https://pan.baidu.com/s/13pKw0bJbr-jbymQMd_YXzA)<br/>refer to [prepare_data.md](https://github.com/alibaba/EasyCV/blob/master/docs/source/prepare_data.md) | <font color=green size=5>✓</font> | <code>data_source=dict(<br/>type='ClsSourceImageNet1k', <br/>root='{root path}', <br/>split='train')</code> | [LICENSE](https://github.com/alibaba/EasyCV/blob/master/docs/source/LICENSE#L1) |
| **Imagenet-1k TFrecords**<br/>[url](https://www.kaggle.com/hmendonca/imagenet-1k-tfrecords-ilsvrc2012-part-0) | Common | The original ImageNet raw images packed in TFRecord format. | [Baidu Netdisk (提取码:5zdc)](https://pan.baidu.com/s/153SY2dp02vEY9K6-O5U1UA)<br/>refer to [prepare_data.md](https://github.com/alibaba/EasyCV/blob/master/docs/source/prepare_data.md) | <font color=green size=5>✓</font> | <code>data_source=dict(<br/>type='ClsSourceImageNetTFRecord', <br/>root='{root path}', <br/>list_file='{annotation file path}')</code> | [LICENSE](https://github.com/alibaba/EasyCV/blob/master/docs/source/LICENSE#L1) |
| **ImageNet 21k**<br/>[url](https://image-net.org/download.php) | Common | The ImageNet-21K dataset is bigger and more diverse than ImageNet-1K, but it is used less frequently for pretraining, mainly due to its complexity, low accessibility, and the underestimation of its added value. | [Baidu Netdisk (提取码:kaeg)](https://pan.baidu.com/s/1eJVPCfS814cDCt3-lVHgmA)<br/>refer to [Alibaba-MIIL/ImageNet21K](https://github.com/Alibaba-MIIL/ImageNet21K/blob/main/dataset_preprocessing/processing_instructions.md) | | | [LICENSE](https://github.com/alibaba/EasyCV/blob/master/docs/source/LICENSE#L1) |
| **MNIST**<br/>[url](http://yann.lecun.com/exdb/mnist/) | Handwritten numbers | The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image. | [train-images-idx3-ubyte.gz](http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz) (9.5MB)<br/>[train-labels-idx1-ubyte.gz](http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz)<br/>[t10k-images-idx3-ubyte.gz](http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz) (1.5MB)<br/>[t10k-labels-idx1-ubyte.gz](http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz) | <font color=green size=5>✓</font> | <code>data_source=dict(<br/>type='ClsSourceMnist', <br/>root='{root path}', <br/>split='train', <br/>download=True)</code> | |
| **Fashion-MNIST**<br/>[url](https://github.com/zalandoresearch/fashion-mnist) | Clothing | Fashion-MNIST is a **clothing dataset** of [Zalando](https://jobs.zalando.com/tech/)'s article images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. | [train-images-idx3-ubyte.gz](http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz) (26MB)<br/>[train-labels-idx1-ubyte.gz](http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz) (29KB)<br/>[t10k-images-idx3-ubyte.gz](http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz) (4.3MB)<br/>[t10k-labels-idx1-ubyte.gz](http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz) (5.1KB) | <font color=green size=5>✓</font> | <code>data_source=dict(<br/>type='ClsSourceFashionMnist', <br/>root='{root path}', <br/>download=True, <br/>split='train')</code> | |
| **Flower102**<br/>[url](https://www.robots.ox.ac.uk/~vgg/data/flowers/102/) | Flowers | Flower102 consists of 102 flower categories. The flowers chosen are commonly occurring in the United Kingdom. Each class consists of between 40 and 258 images. | [102flowers.tgz](https://www.robots.ox.ac.uk/~vgg/data/flowers/102/102flowers.tgz) (329MB)<br/>[imagelabels.mat](https://www.robots.ox.ac.uk/~vgg/data/flowers/102/imagelabels.mat)<br/>[setid.mat](https://www.robots.ox.ac.uk/~vgg/data/flowers/102/setid.mat) | <font color=green size=5>✓</font> | <code>data_source=dict(<br/>type='ClsSourceFlowers102', <br/>root='{root path}', <br/>download=True, <br/>split='train')</code> | |
| **Caltech 101**<br/>[url](https://data.caltech.edu/records/20086) | Common | Pictures of objects belonging to 101 categories, with about 40 to 800 images per category. Most categories have about 50 images. The size of each image is roughly 300 x 200 pixels. | [caltech-101.zip](https://data.caltech.edu/tindfiles/serve/e41f5188-0b32-41fa-801b-d1e840915e80/) (137.4MB) | <font color=green size=5>✓</font> | <code>data_source=dict(<br/>type='ClsSourceCaltech101', <br/>root='{root path}', <br/>download=True)</code> | |
| **Caltech 256**<br/>[url](https://data.caltech.edu/records/20087) | Common | Caltech-256 is a challenging set of 256 object categories containing a total of 30,607 images. Compared to Caltech-101, Caltech-256 has the following improvements: a) the number of categories is more than doubled, b) the minimum number of images in any category is increased from 31 to 80, c) artifacts due to image rotation are avoided, and d) a new and larger clutter category is introduced for testing background rejection. | [256_ObjectCategories.tar](https://data.caltech.edu/tindfiles/serve/813641b9-cb42-4e21-9da5-9d24a20bb4a4/) (1.2GB) | <font color=green size=5>✓</font> | <code>data_source=dict(<br/>type='ClsSourceCaltech256', <br/>root='{root path}', <br/>download=True)</code> | |
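
To make the placeholders in the table concrete, here is a filled-in example of two "Mode of use" cells; the paths and the annotation file name are hypothetical values for illustration.

```python
# CIFAR-10 with automatic download: '{root path}' becomes the directory
# where cifar-10-python.tar.gz is stored (or downloaded to).
data_source = dict(
    type='ClsSourceCifar10',
    root='data/cifar10/',  # hypothetical path
    download=True,
    split='train')  # or 'test'

# TFRecord-based sources take an annotation list file instead of
# download=True (see the Imagenet-1k TFrecords row above).
tfrecord_source = dict(
    type='ClsSourceImageNetTFRecord',
    root='data/imagenet_tfrecord/',                      # hypothetical path
    list_file='data/imagenet_tfrecord/train_list.txt')   # hypothetical file
```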

## Object Detection

| Name | Field | Description | Download | Dataset API support | Mode of use | License |
| ---- | ----- | ----------- | -------- | ------------------- | ----------- | ------- |
| **VOC2007**<br/>[url](http://host.robots.ox.ac.uk/pascal/VOC/voc2007/index.html) | Common | PASCAL VOC 2007 is a dataset for image recognition consisting of 20 object categories. Each image in this dataset has pixel-level segmentation annotations, bounding box annotations, and object class annotations. | [VOCtrainval_06-Nov-2007.tar](http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar) (439MB) | <font color=green size=5>✓</font> | <code>data_source=dict(<br/>type='DetSourceVOC2007', <br/>path='{root path}', <br/>download=True, <br/>split='train')</code> | |
| **VOC2012**<br/>[url](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html) | Common | From 2009 to 2011, the dataset grew on the basis of the previous year's data; from 2011 to 2012, the amount of data used for the classification, detection and person-layout tasks did not change, while the data subsets and annotations for segmentation and action recognition were improved. | [Baidu Netdisk (提取码:ro9f)](https://pan.baidu.com/s/1B4tF8cEPIe0xGL1FG0qbkg)<br/>[VOCtrainval_11-May-2012.tar](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar) (2GB) | <font color=green size=5>✓</font> | <code>data_source=dict(<br/>type='DetSourceVOC2012', <br/>path='{root path}', <br/>download=True, <br/>split='train')</code> | [LICENSE](https://github.com/alibaba/EasyCV/blob/master/docs/source/LICENSE#L70) |
| **LVIS**<br/>[url](https://www.lvisdataset.org/dataset) | Common | LVIS uses the COCO 2017 train, validation, and test image sets. If you have already downloaded the COCO images, you only need to download the LVIS annotations. The LVIS val set contains images from the COCO 2017 train split in addition to the COCO 2017 val split. | [Baidu Netdisk (提取码:8ief)](https://pan.baidu.com/s/1UntujlgDMuVBIjhoAc_lSA)<br/>refer to [coco](https://cocodataset.org/#overview) | <font color=green size=5>✓</font> | <code>data_source=dict(<br/>type='DetSourceLvis', <br/>path='{root path}', <br/>download=True, <br/>split='train')</code> | [LICENSE](https://github.com/alibaba/EasyCV/blob/master/docs/source/LICENSE#L57) |
| **Object365**<br/>[url](https://www.objects365.org/overview.html) | Common | Objects365 is a brand-new dataset designed to spur object detection research with a focus on diverse objects in the wild: 365 categories, 2 million images, 30 million bounding boxes. | refer to [data-set-detail](https://open.baai.ac.cn/data-set-detail/MTI2NDc=/MTA=/true) | <font color=green size=5>✓</font> | <code>data_source=dict(<br/>type='DetSourceObjects365', <br/>ann_file='{annotation file path}', <br/>img_prefix='{images file root path}', <br/>pipeline=[{pipeline parameter}])</code> | |
| **CrowdHuman**<br/>[url](https://www.crowdhuman.org/) | Common | CrowdHuman is a benchmark dataset for better evaluating detectors in crowd scenarios. It is large, richly annotated and highly diverse, containing 15,000, 4,370 and 5,000 images for training, validation and testing, respectively. There are a total of 470K human instances in the train and validation subsets, with an average of 23 persons per image and various kinds of occlusions. Each human instance is annotated with a head bounding box, a human visible-region bounding box and a human full-body bounding box. | refer to [crowdhuman](https://www.crowdhuman.org/) | <font color=green size=5>✓</font> | <code>data_source=dict(<br/>type='DetSourceCrowdHuman', <br/>ann_file='{annotation file path}', <br/>img_prefix='{images file root path}', <br/>gt_op='vbox')</code> | |
| **Openimages**<br/>[url](https://storage.googleapis.com/openimages/web/index.html) | Common | Open Images is a dataset of ~9 million URLs to images that have been annotated with image-level labels and bounding boxes spanning thousands of classes. | refer to [cvdfoundation/open-images-dataset](https://github.com/cvdfoundation/open-images-dataset#download-images-with-bounding-boxes-annotations) | | | |
| **WIDER FACE**<br/>[url](http://shuoyang1213.me/WIDERFACE/) | Face | The WIDER FACE dataset contains 32,203 images and labels 393,703 faces with a high degree of variability in scale, pose and occlusion. The database is split into training (40%), validation (10%) and testing (50%) sets. Besides, the images are divided into three levels (Easy ⊆ Medium ⊆ Hard) according to the difficulty of detection. | WIDER Face Training Images [Google Drive](https://drive.google.com/file/d/15hGDLhsx8bLgLcIRD5DhYt5iBxnjNF1M/view?usp=sharing) / [Tencent Drive](https://share.weiyun.com/5WjCBWV) (1.36GB)<br/>WIDER Face Validation Images [Google Drive](https://drive.google.com/file/d/1GUCogbp16PMGa39thoMMeWxp7Rp5oM8Q/view?usp=sharing) / [Tencent Drive](https://share.weiyun.com/5ot9Qv1) (345.95MB)<br/>WIDER Face Testing Images [Google Drive](https://drive.google.com/file/d/1HIfDbVEWKmsYKJZm4lchTBDLW5N7dY5T/view?usp=sharing) / [Tencent Drive](https://share.weiyun.com/5vSUomP) (1.72GB)<br/>[Face annotations](http://shuoyang1213.me/WIDERFACE/support/bbx_annotation/wider_face_split.zip) (3.6MB) | <font color=green size=5>✓</font> | <code>data_source=dict(<br/>type='DetSourceWiderFace', <br/>ann_file='{annotation file path}', <br/>img_prefix='{images file root path}')</code> | |
| **DeepFashion**<br/>[url](https://mmlab.ie.cuhk.edu.hk/projects/DeepFashion.html) | Clothing | DeepFashion is a large-scale clothes database. It contains over 800,000 diverse fashion images ranging from well-posed shop images to unconstrained consumer photos. It is annotated with rich information about clothing items: each image is labeled with 50 categories, 1,000 descriptive attributes, bounding boxes and clothing landmarks. It also contains over 300,000 cross-pose/cross-domain image pairs. | Category and Attribute Prediction Benchmark: [Download Page](https://drive.google.com/drive/folders/0B7EVK8r0v71pQ2FuZ0k0QnhBQnc?resourcekey=0-NWldFxSChFuCpK4nzAIGsg&usp=sharing)<br/>In-shop Clothes Retrieval Benchmark: [Download Page](https://drive.google.com/drive/folders/0B7EVK8r0v71pQ2FuZ0k0QnhBQnc?resourcekey=0-NWldFxSChFuCpK4nzAIGsg&usp=sharing)<br/>Consumer-to-shop Clothes Retrieval Benchmark: [Download Page](https://drive.google.com/drive/folders/0B7EVK8r0v71pQ2FuZ0k0QnhBQnc?resourcekey=0-NWldFxSChFuCpK4nzAIGsg&usp=sharing)<br/>Fashion Landmark Detection Benchmark: [Download Page](https://drive.google.com/drive/folders/0B7EVK8r0v71pQ2FuZ0k0QnhBQnc?resourcekey=0-NWldFxSChFuCpK4nzAIGsg&usp=sharing) | | | |
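
As with classification, a detection `data_source` is wrapped in a dataset config together with a pipeline. The sketch below follows the VOC2007 row; the dataset wrapper type and the `MM*` pipeline transforms are assumed names used for illustration, not taken from this table.

```python
# Sketch of a detection training dataset built from the VOC2007 row.
train_dataset = dict(
    type='DetDataset',  # assumed wrapper type
    data_source=dict(
        type='DetSourceVOC2007',
        path='data/VOCdevkit/',  # your '{root path}'
        download=True,
        split='train'),
    pipeline=[
        dict(type='MMResize', img_scale=(1333, 800), keep_ratio=True),  # assumed
        dict(type='MMRandomFlip', flip_ratio=0.5),                      # assumed
        dict(type='MMPad', size_divisor=32),                            # assumed
    ])
```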