# DeepCluster

## Deep Clustering for Unsupervised Learning of Visual Features

<!-- [ABSTRACT] -->

Clustering is a class of unsupervised learning methods that has been extensively applied and studied in computer vision. Little work has been done to adapt it to the end-to-end training of visual features on large scale datasets. In this work, we present DeepCluster, a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features. DeepCluster iteratively groups the features with a standard clustering algorithm, k-means, and uses the subsequent assignments as supervision to update the weights of the network.

<!-- [IMAGE] -->

<div align="center">
<img />
</div>

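The alternation described in the abstract (cluster the features with k-means, then train on the resulting assignments) can be illustrated with a minimal, dependency-free sketch. This is not the implementation used by the configs in this directory: `kmeans_assign`, `deepcluster_epoch`, `extract_features`, and `train_step` are hypothetical names introduced only for illustration, and a real setup would use a CNN backbone and an optimized k-means.

```python
import random


def kmeans_assign(features, k, iters=10, seed=0):
    """Plain k-means (Lloyd's algorithm); returns one cluster index per feature."""
    rng = random.Random(seed)
    centroids = [list(c) for c in rng.sample(features, k)]
    labels = [0] * len(features)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, f in enumerate(features):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(f, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [f for f, l in zip(features, labels) if l == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def deepcluster_epoch(images, extract_features, train_step, k):
    """One round: cluster current features, then train on the pseudo-labels."""
    features = [extract_features(x) for x in images]
    pseudo_labels = kmeans_assign(features, k)
    for x, y in zip(images, pseudo_labels):
        # Hypothetical supervised update, e.g. a cross-entropy step on label y.
        train_step(x, y)
    return pseudo_labels
```

Calling `deepcluster_epoch` repeatedly reproduces the iterative scheme: because `train_step` changes the network, the features and therefore the cluster assignments are recomputed at the start of every round.
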
## Citation

<!-- [ALGORITHM] -->

```bibtex
@inproceedings{caron2018deep,
  title={Deep clustering for unsupervised learning of visual features},
  author={Caron, Mathilde and Bojanowski, Piotr and Joulin, Armand and Douze, Matthijs},
  booktitle={ECCV},
  year={2018}
}
```

## Models and Benchmarks

[Back to model_zoo.md](../../../docs/model_zoo.md)

On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. Unless otherwise noted, all models were trained on the ImageNet1k dataset.

### VOC SVM / Low-shot SVM

**Best Layer** indicates which layer's feature map gives the best result. For example, if **Best Layer** is **feature3**, the best result is obtained from the second stage of ResNet (1 is the stem layer, and 2-5 are the four stage layers).

In addition, k=1 to k=96 is the hyper-parameter of Low-shot SVM, i.e. the number of labeled training samples used per class.

| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | ---------------------------------------------------------------------------- | ---------- | --- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | | | | | | | | | |

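As a reading aid for the **Best Layer** column, the numbering convention described above (1 for the stem layer, 2-5 for the four ResNet stages) can be written as a small lookup. The function name `describe_feature` and the returned strings are hypothetical and not part of any library.

```python
def describe_feature(name):
    """Map a 'featureN' name (N in 1..5) to the ResNet block it refers to."""
    prefix = "feature"
    if not name.startswith(prefix):
        raise ValueError("expected a name like 'feature3', got %r" % name)
    index = int(name[len(prefix):])
    if index == 1:
        return "stem"
    if 2 <= index <= 5:
        # feature2..feature5 correspond to ResNet stages 1..4.
        return "stage%d" % (index - 1)
    raise ValueError("feature index must be between 1 and 5, got %d" % index)
```

With this convention, `describe_feature("feature3")` refers to the second ResNet stage, matching the example in the text.
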
### ImageNet Linear Evaluation

**Feature1 - Feature5** do not use GlobalAveragePooling; each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-90e.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.

The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.

| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | | | | | |

### iNaturalist2018 Linear Evaluation

Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.

| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | | | | | |

### Places205 Linear Evaluation

Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.

| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | | | | | |

### Semi-Supervised Classification

- In this benchmark, the necks and heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% of ImageNet, we find that the hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`. We choose the best-performing setting for each method. The parameter settings are indicated in the file name: the learning rate is written as `1e-1`, `1e-2`, or `1e-3`, and the learning rate multiplier is written as `head1`, `head10`, or `head100`.
- Please use `--deterministic` in this benchmark.

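The search space described in the list above is small enough to enumerate directly. The following sketch lists the nine combinations together with the `1e-*` and `head*` tags used in file names. The function names, dictionary keys, and the assumption that the effective head learning rate is the base learning rate times the multiplier are illustrative only; the sketch does not generate actual config file names.

```python
from itertools import product

BASE_LRS = [0.001, 0.01, 0.1]
HEAD_LR_MULTIPLIERS = [1, 10, 100]

# Tags as they appear in file names, per the description above.
LR_TAGS = {0.001: "1e-3", 0.01: "1e-2", 0.1: "1e-1"}


def semi_supervised_grid():
    """Enumerate every (base lr, head multiplier) pair with its file-name tags."""
    settings = []
    for base_lr, mult in product(BASE_LRS, HEAD_LR_MULTIPLIERS):
        settings.append({
            "base_lr": base_lr,
            "head_lr_mult": mult,
            # Assumption: the head trains with base_lr scaled by the multiplier.
            "head_lr": base_lr * mult,
            "lr_tag": LR_TAGS[base_lr],
            "head_tag": "head%d" % mult,
        })
    return settings


def best_setting(settings, top1_scores):
    """Pick the setting with the highest Top-1; scores are supplied by the caller."""
    best_index = max(range(len(settings)), key=lambda i: top1_scores[i])
    return settings[best_index]
```

`best_setting` mirrors the "choose the best-performing setting" step: it takes one measured Top-1 value per grid entry and returns the corresponding setting.
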
Please refer to the directories `configs/benchmarks/classification/imagenet/imagenet_1percent/` for 1% data and `configs/benchmarks/classification/imagenet/imagenet_10percent/` for 10% data for details.

| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | ---------------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | | |

### Detection

The detection benchmark includes two downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.

#### Pascal VOC 2007 + 2012

Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.

| Model | Config | mAP | AP50 |
| --------- | ---------------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | |

#### COCO2017

Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.

| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | ---------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | | | | | |

### Segmentation

The segmentation benchmark includes two downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. It follows the evaluation protocols set up by MMSegmentation.

#### Pascal VOC 2012 + Aug

Please refer to [file]() for details of config.

| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | |

#### Cityscapes

Please refer to [file]() for details of config.

| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | |