Bump version to v0.9.1 (#322)

* [Fix]: Set qkv bias to False for cae and True for mae (#303)

* [Fix]: Add mmcls transformer layer choice

* [Fix]: Fix transformer encoder layer bug

* [Fix]: Change UT of cae

* [Feature]: Change the file name of cosine annealing hook (#304)

* [Feature]: Change cosine annealing hook file name

* [Feature]: Add UT for cosine annealing hook

* [Fix]: Fix lint

* read tutorials and fix typo (#308)

* [Fix] fix config errors in MAE (#307)

* update readthedocs algorithm readme (#310)

* [Docs] Replace markdownlint with mdformat (#311)

* Replace markdownlint with mdformat to avoid installing ruby

* fix typo

* add 'ba' to codespell ignore-words-list

* Configure Myst-parser to parse anchor tag (#309)

* [Docs] rewrite install.md (#317)

* rewrite the install.md

* add faq.md

* fix lint

* add FAQ to README

* add Chinese version

* fix typo

* fix format

* remove modification

* fix format

* [Docs] refine README.md file (#318)

* refine README.md file

* fix lint

* format language button

* rename getting_started.md

* revise index.rst

* add model_zoo.md to index.rst

* fix lint

* refine readme

Co-authored-by: Jiahao Xie <52497952+Jiahao000@users.noreply.github.com>

* [Enhance] update byol models and results (#319)

* Update version information (#321)

Co-authored-by: Yuan Liu <30762564+YuanLiuuuuuu@users.noreply.github.com>
Co-authored-by: Yi Lu <21515006@zju.edu.cn>
Co-authored-by: RenQin <45731309+soonera@users.noreply.github.com>
Co-authored-by: Jiahao Xie <52497952+Jiahao000@users.noreply.github.com>
pull/326/head v0.9.1
Yixiao Fang 2022-06-01 09:59:05 +08:00 committed by GitHub
parent 399b5a0d6e
commit bbdf670d00
86 changed files with 2909 additions and 3366 deletions

.github/CONTRIBUTING.md (new file)

@@ -0,0 +1 @@
We appreciate all contributions to improve MMSelfSup. Please refer to [CONTRIBUTING.md](https://github.com/open-mmlab/mmcv/blob/master/CONTRIBUTING.md) in MMCV for more details about the contributing guideline.

@@ -4,15 +4,14 @@ about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''
---
**Describe the feature**
**Motivation**
A clear and concise description of the motivation of the feature.
Ex1. It is inconvenient when [....].
Ex2. There is a recent paper [....], which is very helpful for [....].
Ex1. It is inconvenient when \[....\].
Ex2. There is a recent paper \[....\], which is very helpful for \[....\].
**Related resources**
If there is an official code release or third-party implementations, please also provide the information here, which would be very helpful.

@@ -4,7 +4,6 @@ about: Ask general questions to get help
title: ''
labels: ''
assignees: ''
---
**Checklist**

@@ -4,7 +4,6 @@ about: Create a report to help us improve
title: ''
labels: ''
assignees: ''
---
Thanks for reporting the unexpected results and we appreciate it a lot.
@@ -31,8 +30,8 @@ A placeholder for the command.
1. Please run `python -c "from mmselfsup.utils import collect_env; print(collect_env())"` to collect necessary environment information and paste it here.
2. You may add additional information that may be helpful for locating the problem, such as
- How you installed PyTorch [e.g., pip, conda, source]
- Other environment variables that may be related (such as `$PATH`, `$LD_LIBRARY_PATH`, `$PYTHONPATH`, etc.)
- How you installed PyTorch \[e.g., pip, conda, source\]
- Other environment variables that may be related (such as `$PATH`, `$LD_LIBRARY_PATH`, `$PYTHONPATH`, etc.)
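As an illustrative aside (not part of the template), a few lines of Python are a quick way to collect those variables for a report; the variable list below is just an example:

```python
import os

# print the environment variables most often relevant to install issues
for var in ("PATH", "LD_LIBRARY_PATH", "PYTHONPATH"):
    print(f"{var}={os.environ.get(var, '')}")
```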
**Error traceback**
If applicable, paste the error traceback here.

.pre-commit-config.yaml

@@ -1,7 +1,7 @@
exclude: ^tests/data/
repos:
- repo: https://gitlab.com/pycqa/flake8.git
rev: 3.8.3
- repo: https://github.com/PyCQA/flake8
rev: 4.0.1
hooks:
- id: flake8
- repo: https://github.com/timothycrosley/isort
@@ -26,11 +26,6 @@ repos:
args: ["--remove"]
- id: mixed-line-ending
args: ["--fix=lf"]
- repo: https://github.com/markdownlint/markdownlint
rev: v0.11.0
hooks:
- id: markdownlint
args: ["-r", "~MD002,~MD013,~MD024,~MD029,~MD033,~MD034,~MD036", "-t", "allow_different_nesting"]
- repo: https://github.com/codespell-project/codespell
rev: v2.1.0
hooks:
@@ -40,6 +35,15 @@ repos:
hooks:
- id: docformatter
args: ["--in-place", "--wrap-descriptions", "79"]
- repo: https://github.com/executablebooks/mdformat
rev: 0.7.14
hooks:
- id: mdformat
args: ["--number"]
additional_dependencies:
- mdformat-gfm
- mdformat_frontmatter
- linkify-it-py
- repo: https://github.com/open-mmlab/pre-commit-hooks
rev: v0.2.0
hooks:
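For a sense of what the new hook does, mdformat also exposes a Python API; the snippet below is a rough sketch, assuming `mdformat` and the `mdformat-gfm` plugin listed above are installed:

```python
import mdformat

# format a Markdown snippet the way the pre-commit hook would;
# options={"number": True} mirrors the --number flag in the hook args above
formatted = mdformat.text(
    "1. first\n1. second\n",
    options={"number": True},
    extensions={"gfm"},
)
print(formatted)  # the ordered list is renumbered: 1. first / 2. second
```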

README.md

@@ -30,17 +30,21 @@
[👀Model Zoo](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) |
[🆕Update News](https://mmselfsup.readthedocs.io/en/latest/changelog.html) |
[🤔Reporting Issues](https://github.com/open-mmlab/mmselfsup/issues/new/choose)
</div>
<div align="center">
English | [简体中文](README_zh-CN.md)
</div>
## Introduction
English | [简体中文](README_zh-CN.md)
MMSelfSup is an open source self-supervised representation learning toolbox based on PyTorch. It is a part of the [OpenMMLab](https://openmmlab.com/) project.
The master branch works with **PyTorch 1.5** or higher.
### Major features
- **Methods All in One**
@@ -59,34 +63,52 @@ The master branch works with **PyTorch 1.5** or higher.
Since MMSelfSup adopts a similar design of modules and interfaces to other OpenMMLab projects, it supports smooth evaluation on downstream tasks with other OpenMMLab projects like object detection and segmentation.
## What's New
## License
This project is released under the [Apache 2.0 license](LICENSE).
## ChangeLog
MMSelfSup **v0.9.0** was released on 29/04/2022.
MMSelfSup **v0.9.1** was released on 31/05/2022.
Highlights of the new version:
* Support **CAE**
* Support **Barlow Twins**
- Update **BYOL** model and results
- Refine some documentation
Please refer to [changelog.md](docs/en/changelog.md) for details and release history.
Differences between MMSelfSup and OpenSelfSup codebases can be found in [compatibility.md](docs/en/compatibility.md).
## Model Zoo and Benchmark
## Installation
MMSelfSup depends on [PyTorch](https://pytorch.org/), [MMCV](https://github.com/open-mmlab/mmcv) and [MMClassification](https://github.com/open-mmlab/mmclassification).
Please refer to [install.md](docs/en/install.md) for more detailed instructions.
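As a quick post-install sanity check (an illustrative sketch, not taken from the docs), the whole stack can be imported and its versions printed:

```python
# verify that the core dependencies import cleanly and report their versions
import torch
import mmcv
import mmcls
import mmselfsup

print(torch.__version__, mmcv.__version__, mmcls.__version__, mmselfsup.__version__)
```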
## Get Started
Please refer to [prepare_data.md](docs/en/prepare_data.md) for dataset preparation and [get_started.md](docs/en/get_started.md) for the basic usage of MMSelfSup.
We also provide tutorials for more details:
- [config](docs/en/tutorials/0_config.md)
- [add new dataset](docs/en/tutorials/1_new_dataset.md)
- [data pipeline](docs/en/tutorials/2_data_pipeline.md)
- [add new module](docs/en/tutorials/3_new_module.md)
- [customize schedules](docs/en/tutorials/4_schedule.md)
- [customize runtime](docs/en/tutorials/5_runtime.md)
- [benchmarks](docs/en/tutorials/6_benchmarks.md)
Besides, we provide [colab tutorial](https://github.com/open-mmlab/mmselfsup/blob/master/demo/mmselfsup_colab_tutorial.ipynb) for basic usage.
Please refer to [FAQ](docs/en/faq.md) for frequently asked questions.
## Model Zoo
### Model Zoo
Please refer to [model_zoo.md](docs/en/model_zoo.md) for a comprehensive set of pre-trained models and benchmarks.
Supported algorithms:
- [x] [Relative Location (ICCV'2015)](https://arxiv.org/abs/1505.05192)
- [x] [Rotation Prediction (ICLR'2018)](https://arxiv.org/abs/1803.07728)
- [x] [DeepCLuster (ECCV'2018)](https://arxiv.org/abs/1807.05520)
- [x] [DeepCluster (ECCV'2018)](https://arxiv.org/abs/1807.05520)
- [x] [NPID (CVPR'2018)](https://arxiv.org/abs/1805.01978)
- [x] [ODC (CVPR'2020)](https://arxiv.org/abs/2006.10645)
- [x] [MoCo v1 (CVPR'2020)](https://arxiv.org/abs/1911.05722)
@@ -104,43 +126,32 @@ Supported algorithms:
More algorithms are in our plan.
### Benchmark
## Benchmark
| Benchmarks | Setting |
| -------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| ImageNet Linear Classification (Multi-head) | [Goyal2019](http://openaccess.thecvf.com/content_ICCV_2019/papers/Goyal_Scaling_and_Benchmarking_Self-Supervised_Visual_Representation_Learning_ICCV_2019_paper.pdf) |
| ImageNet Linear Classification (Last) | |
| ImageNet Semi-Sup Classification | |
| Places205 Linear Classification (Multi-head) | [Goyal2019](http://openaccess.thecvf.com/content_ICCV_2019/papers/Goyal_Scaling_and_Benchmarking_Self-Supervised_Visual_Representation_Learning_ICCV_2019_paper.pdf) |
| iNaturalist2018 Linear Classification (Multi-head) | [Goyal2019](http://openaccess.thecvf.com/content_ICCV_2019/papers/Goyal_Scaling_and_Benchmarking_Self-Supervised_Visual_Representation_Learning_ICCV_2019_paper.pdf) |
| PASCAL VOC07 SVM | [Goyal2019](http://openaccess.thecvf.com/content_ICCV_2019/papers/Goyal_Scaling_and_Benchmarking_Self-Supervised_Visual_Representation_Learning_ICCV_2019_paper.pdf) |
| PASCAL VOC07 Low-shot SVM | [Goyal2019](http://openaccess.thecvf.com/content_ICCV_2019/papers/Goyal_Scaling_and_Benchmarking_Self-Supervised_Visual_Representation_Learning_ICCV_2019_paper.pdf) |
| PASCAL VOC07+12 Object Detection | [MoCo](http://openaccess.thecvf.com/content_CVPR_2020/papers/He_Momentum_Contrast_for_Unsupervised_Visual_Representation_Learning_CVPR_2020_paper.pdf) |
| COCO17 Object Detection | [MoCo](http://openaccess.thecvf.com/content_CVPR_2020/papers/He_Momentum_Contrast_for_Unsupervised_Visual_Representation_Learning_CVPR_2020_paper.pdf) |
| Cityscapes Segmentation | [MMSeg](configs/benchmarks/mmsegmentation/cityscapes/fcn_r50-d8_769x769_40k_cityscapes.py) |
| PASCAL VOC12 Aug Segmentation | [MMSeg](configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) |
| Benchmarks | Setting |
| -------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| ImageNet Linear Classification (Multi-head) | [Goyal2019](http://openaccess.thecvf.com/content_ICCV_2019/papers/Goyal_Scaling_and_Benchmarking_Self-Supervised_Visual_Representation_Learning_ICCV_2019_paper.pdf) |
| ImageNet Linear Classification (Last) | |
| ImageNet Semi-Sup Classification | |
| Places205 Linear Classification (Multi-head) | [Goyal2019](http://openaccess.thecvf.com/content_ICCV_2019/papers/Goyal_Scaling_and_Benchmarking_Self-Supervised_Visual_Representation_Learning_ICCV_2019_paper.pdf) |
| iNaturalist2018 Linear Classification (Multi-head) | [Goyal2019](http://openaccess.thecvf.com/content_ICCV_2019/papers/Goyal_Scaling_and_Benchmarking_Self-Supervised_Visual_Representation_Learning_ICCV_2019_paper.pdf) |
| PASCAL VOC07 SVM | [Goyal2019](http://openaccess.thecvf.com/content_ICCV_2019/papers/Goyal_Scaling_and_Benchmarking_Self-Supervised_Visual_Representation_Learning_ICCV_2019_paper.pdf) |
| PASCAL VOC07 Low-shot SVM | [Goyal2019](http://openaccess.thecvf.com/content_ICCV_2019/papers/Goyal_Scaling_and_Benchmarking_Self-Supervised_Visual_Representation_Learning_ICCV_2019_paper.pdf) |
| PASCAL VOC07+12 Object Detection | [MoCo](http://openaccess.thecvf.com/content_CVPR_2020/papers/He_Momentum_Contrast_for_Unsupervised_Visual_Representation_Learning_CVPR_2020_paper.pdf) |
| COCO17 Object Detection | [MoCo](http://openaccess.thecvf.com/content_CVPR_2020/papers/He_Momentum_Contrast_for_Unsupervised_Visual_Representation_Learning_CVPR_2020_paper.pdf) |
| Cityscapes Segmentation | [MMSeg](configs/benchmarks/mmsegmentation/cityscapes/fcn_r50-d8_769x769_40k_cityscapes.py) |
| PASCAL VOC12 Aug Segmentation | [MMSeg](configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) |
## Installation
## Contributing
MMSelfSup depends on [PyTorch](https://pytorch.org/), [MMCV](https://github.com/open-mmlab/mmcv) and [MMClassification](https://github.com/open-mmlab/mmclassification).
We appreciate all contributions improving MMSelfSup. Please refer to [CONTRIBUTING.md](.github/CONTRIBUTING.md) for more details about the contributing guideline.
Please refer to [install.md](docs/en/install.md) for more detailed instructions.
## Acknowledgement
## Get Started
MMSelfSup is an open source project that is contributed by researchers and engineers from various colleges and companies. We appreciate all the contributors who implement their methods or add new features, as well as users who give valuable feedback.
We wish that the toolbox and benchmark could serve the growing research community by providing a flexible toolkit to reimplement existing methods and develop new algorithms.
Please refer to [prepare_data.md](docs/en/prepare_data.md) for dataset preparation and [getting_started.md](docs/en/getting_started.md) for the basic usage of MMSelfSup.
We also provide tutorials for more details:
- [config](docs/en/tutorials/0_config.md)
- [add new dataset](docs/en/tutorials/1_new_dataset.md)
- [data pipeline](docs/en/tutorials/2_data_pipeline.md)
- [add new module](docs/en/tutorials/3_new_module.md)
- [customize schedules](docs/en/tutorials/4_schedule.md)
- [customize runtime](docs/en/tutorials/5_runtime.md)
- [benchmarks](docs/en/tutorials/6_benchmarks.md)
Besides, we provide [colab tutorial](https://github.com/open-mmlab/mmselfsup/blob/master/demo/mmselfsup_colab_tutorial.ipynb) for basic usage.
MMSelfSup originates from OpenSelfSup, and we appreciate all early contributions made to OpenSelfSup. A few contributors are listed here: Xiaohang Zhan ([@XiaohangZhan](http://github.com/XiaohangZhan)), Jiahao Xie ([@Jiahao000](https://github.com/Jiahao000)), Enze Xie ([@xieenze](https://github.com/xieenze)), Xiangxiang Chu ([@cxxgtxy](https://github.com/cxxgtxy)), Zijian He ([@scnuhealthy](https://github.com/scnuhealthy)).
## Citation
@@ -155,19 +166,9 @@ If you use this toolbox or benchmark in your research, please cite this project.
}
```
## Contributing
## License
We appreciate all contributions improving MMSelfSup. Please refer to [CONTRIBUTING.md](docs/en/community/CONTRIBUTING.md) for more details about the contributing guideline.
## Acknowledgement
Remarks:
- MMSelfSup originates from OpenSelfSup, and we appreciate all early contributions made to OpenSelfSup. A few contributors are listed here: Xiaohang Zhan, Jiahao Xie, Enze Xie, Xiangxiang Chu, Zijian He.
- The implementation of MoCo and the detection benchmark borrow the code from [MoCo](https://github.com/facebookresearch/moco).
- The implementation of SwAV borrows the code from [SwAV](https://github.com/facebookresearch/swav).
- The SVM benchmark borrows the code from [fair_self_supervision_benchmark](https://github.com/facebookresearch/fair_self_supervision_benchmark).
- `mmselfsup/utils/clustering.py` is borrowed from [deepcluster](https://github.com/facebookresearch/deepcluster/blob/master/clustering.py).
This project is released under the [Apache 2.0 license](LICENSE).
## Projects in OpenMMLab

README_zh-CN.md

@@ -30,12 +30,17 @@
[👀Model Zoo](https://github.com/open-mmlab/mmselfsup/blob/master/docs/zh_cn/model_zoo.md) |
[🆕Update News](https://mmselfsup.readthedocs.io/zh_CN/latest/changelog.html) |
[🤔Reporting Issues](https://github.com/open-mmlab/mmselfsup/issues/new/choose)
</div>
<div align="center">
[English](README.md) | 简体中文
</div>
## Introduction
[English](README.md) | 简体中文
MMSelfSup is an open-source self-supervised representation learning toolbox based on PyTorch. It is a part of the [OpenMMLab](https://openmmlab.com/) project.
The master branch works with **PyTorch 1.5** or higher.
@@ -58,26 +63,44 @@ MMSelfSup is an open-source self-supervised representation learning toolbox
Compatible with all major OpenMMLab algorithm libraries, it offers a rich set of downstream evaluation tasks and applications of pre-trained models.
## License
## What's New
This project is released under the [Apache 2.0 license](LICENSE).
## ChangeLog
The latest **v0.9.0** version was released on 2022.04.29.
The latest **v0.9.1** version was released on 2022.05.31.
Highlights of the new version:
* Support **CAE**
* Support **Barlow Twins**
- Update **BYOL** model and results
- Refine some documentation
Please refer to [changelog.md](docs/zh_cn/changelog.md) for details and release history.
Differences between MMSelfSup and OpenSelfSup can be found in [compatibility.md](docs/en/compatibility.md).
## Model Zoo and Benchmark
## Installation
### Model Zoo
MMSelfSup depends on [PyTorch](https://pytorch.org/), [MMCV](https://github.com/open-mmlab/mmcv) and [MMClassification](https://github.com/open-mmlab/mmclassification).
Please refer to [install.md](docs/zh_cn/install.md) for more detailed installation instructions.
## Get Started
Please refer to [prepare_data.md](docs/zh_cn/prepare_data.md) for dataset preparation and [get_started.md](docs/zh_cn/get_started.md) for the basic usage of MMSelfSup.
We also provide more comprehensive tutorials, including:
- [config](docs/zh_cn/tutorials/0_config.md)
- [add new dataset](docs/zh_cn/tutorials/1_new_dataset.md)
- [data pipeline](docs/zh_cn/tutorials/2_data_pipeline.md)
- [add new module](docs/zh_cn/tutorials/3_new_module.md)
- [customize schedules](docs/zh_cn/tutorials/4_schedule.md)
- [customize runtime](docs/zh_cn/tutorials/5_runtime.md)
- [benchmarks](docs/zh_cn/tutorials/6_benchmarks.md)
Besides, we provide a [colab tutorial](https://github.com/open-mmlab/mmselfsup/blob/master/demo/mmselfsup_colab_tutorial.ipynb).
If you run into problems, please refer to the [FAQ](docs/zh_cn/faq.md).
## Model Zoo
Please refer to [model_zoo.md](docs/zh_cn/model_zoo.md) for more comprehensive model benchmark results.
@@ -103,9 +126,9 @@ Differences between MMSelfSup and OpenSelfSup can be found in [compatibility.md]
More algorithms are in our plan.
### Benchmark
## Benchmark
| Benchmarks | Setting |
| Benchmarks | Setting |
| -------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| ImageNet Linear Classification (Multi-head) | [Goyal2019](http://openaccess.thecvf.com/content_ICCV_2019/papers/Goyal_Scaling_and_Benchmarking_Self-Supervised_Visual_Representation_Learning_ICCV_2019_paper.pdf) |
| ImageNet Linear Classification (Last) | |
@@ -119,31 +142,9 @@ Differences between MMSelfSup and OpenSelfSup can be found in [compatibility.md]
| Cityscapes Segmentation | [MMSeg](configs/benchmarks/mmsegmentation/cityscapes/fcn_r50-d8_769x769_40k_cityscapes.py) |
| PASCAL VOC12 Aug Segmentation | [MMSeg](configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) |
## Installation
MMSelfSup depends on [PyTorch](https://pytorch.org/), [MMCV](https://github.com/open-mmlab/mmcv) and [MMClassification](https://github.com/open-mmlab/mmclassification).
Please refer to [install.md](docs/zh_cn/install.md) for more detailed installation instructions.
## Get Started
Please refer to [prepare_data.md](docs/zh_cn/prepare_data.md) for dataset preparation and [getting_started.md](docs/zh_cn/getting_started.md) for the basic usage of MMSelfSup.
We also provide more comprehensive tutorials, including:
- [config](docs/zh_cn/tutorials/0_config.md)
- [add new dataset](docs/zh_cn/tutorials/1_new_dataset.md)
- [data pipeline](docs/zh_cn/tutorials/2_data_pipeline.md)
- [add new module](docs/zh_cn/tutorials/3_new_module.md)
- [customize schedules](docs/zh_cn/tutorials/4_schedule.md)
- [customize runtime](docs/zh_cn/tutorials/5_runtime.md)
- [benchmarks](docs/zh_cn/tutorials/6_benchmarks.md)
Besides, we provide a [colab tutorial](https://github.com/open-mmlab/mmselfsup/blob/master/demo/mmselfsup_colab_tutorial.ipynb).
## Contributing
We appreciate all contributions to improve MMSelfSup. Please refer to [CONTRIBUTING.md](docs/zh_cn/community/CONTRIBUTING.md) for the contributing guideline.
We appreciate all contributions to improve MMSelfSup. Please refer to [CONTRIBUTING.md](.github/CONTRIBUTING.md) for the contributing guideline.
## Acknowledgement
@@ -152,6 +153,7 @@ MMSelfSup is an open-source project contributed by researchers and engineers from various colleges and companies
We hope the toolbox and benchmark can provide the community with flexible code tools to reproduce existing algorithms and develop new models, thereby continuously contributing to the open-source community.
## Citation
If you find this project useful in your research, please consider citing:
```bibtex
@@ -163,6 +165,10 @@ MMSelfSup is an open-source project contributed by researchers and engineers from various colleges and companies
}
```
## License
This project is released under the [Apache 2.0 license](LICENSE).
## Other Projects in OpenMMLab
- [MMCV](https://github.com/open-mmlab/mmcv): OpenMMLab foundational library for computer vision

@@ -1,7 +1,7 @@
_base_ = 'vit-base-p16_ft-8xb128-coslr-100e_in1k.py'
# model
model = dict(backbone=dict(use_window=True, init_values=0.1))
model = dict(backbone=dict(use_window=True, init_values=0.1, qkv_bias=False))
# optimizer
optimizer = dict(lr=8e-3)

@@ -1,7 +1,12 @@
# model settings
model = dict(
type='CAE',
backbone=dict(type='CAEViT', arch='b', patch_size=16, init_values=0.1),
backbone=dict(
type='CAEViT',
arch='b',
patch_size=16,
init_values=0.1,
qkv_bias=False),
neck=dict(
type='CAENeck',
patch_size=16,
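To illustrate what the `qkv_bias` flag controls, here is a hedged stub (not the actual mmcls/mmselfsup implementation): the flag toggles the bias of the fused query/key/value projection inside each transformer attention layer, which this fix sets to False for CAE and keeps True for MAE.

```python
import torch.nn as nn


class Attention(nn.Module):
    """Illustrative attention stub showing where qkv_bias lands."""

    def __init__(self, dim: int, num_heads: int = 12, qkv_bias: bool = False):
        super().__init__()
        self.num_heads = num_heads
        # qkv_bias=False (CAE) drops the bias term of the fused qkv projection
        self.qkv = nn.Linear(dim, dim * 3, bias=qkv_bias)
```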

@@ -36,11 +36,12 @@ Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM.
**Feature1 - Feature5** are not passed through GlobalAveragePooling; each feature map is pooled to specific dimensions and then fed to a linear layer for classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of the config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb32-steplr-100e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb32-steplr-100e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb512-coslr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb512-coslr-90e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb32-accum16-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | 15.59 | 35.16 | 47.37 | 62.86 | 71.62 | 67.55 |
| [resnet50_8xb32-accum16-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | 15.16 | 35.26 | 47.77 | 63.10 | 71.21 | 71.72 |
| [resnet50_16xb256-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_16xb256-coslr-200e_in1k.py) | 15.41 | 35.15 | 47.77 | 62.59 | 71.85 | 71.88 |
#### Places205 Linear Evaluation

configs/selfsup/byol/byol_resnet50_16xb256-coslr-200e_in1k.py (new file)

@@ -0,0 +1,35 @@
_base_ = [
'../_base_/models/byol.py',
'../_base_/datasets/imagenet_byol.py',
'../_base_/schedules/lars_coslr-200e_in1k.py',
'../_base_/default_runtime.py',
]
# dataset summary
data = dict(samples_per_gpu=256, workers_per_gpu=8)
# additional hooks
# interval for gradient accumulation; effective batch size = 16*256*1(interval) = 4096
update_interval = 1
custom_hooks = [
dict(type='BYOLHook', end_momentum=1., update_interval=update_interval)
]
# optimizer
optimizer = dict(
type='LARS',
lr=4.8,
momentum=0.9,
weight_decay=1e-6,
paramwise_options={
'(bn|gn)(\\d+)?.(weight|bias)':
dict(weight_decay=0., lars_exclude=True),
'bias': dict(weight_decay=0., lars_exclude=True)
})
optimizer_config = dict(update_interval=update_interval)
# runtime settings
# max_keep_ckpts controls the max number of ckpt files in your work_dirs
# if it is 3, when CheckpointHook (in mmcv) saves the 4th ckpt,
# it will remove the oldest one to keep the total number of ckpts at 3
checkpoint_config = dict(interval=10, max_keep_ckpts=3)
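To see how this new config resolves once its `_base_` files are merged, it can be loaded with mmcv; a minimal sketch, assuming mmcv is installed and the path follows the repository layout:

```python
from mmcv import Config

# _base_ entries are merged recursively into the final config
cfg = Config.fromfile(
    'configs/selfsup/byol/byol_resnet50_16xb256-coslr-200e_in1k.py')
print(cfg.optimizer['type'])        # LARS
print(cfg.data['samples_per_gpu'])  # 256 per GPU; 16 GPUs -> 4096 effective
```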

@@ -4,7 +4,7 @@ Collections:
Training Data: ImageNet-1k
Training Techniques:
- LARS
Training Resources: 8x V100 GPUs
Training Resources: 8x V100 GPUs (b256), 16x A100-80G GPUs (b4096)
Architecture:
- ResNet
- BYOL
@@ -23,9 +23,21 @@ Models:
- Task: Self-Supervised Image Classification
Dataset: ImageNet-1k
Metrics:
Top 1 Accuracy: 67.55
Top 1 Accuracy: 71.72
Config: configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py
Weights: https://download.openmmlab.com/mmselfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k_20220225-5c8b2c2e.pth
- Name: byol_resnet50_16xb256-coslr-200e_in1k
In Collection: BYOL
Metadata:
Epochs: 200
Batch Size: 4096
Results:
- Task: Self-Supervised Image Classification
Dataset: ImageNet-1k
Metrics:
Top 1 Accuracy: 71.88
Config: configs/selfsup/byol/byol_resnet50_16xb256-coslr-200e_in1k.py
Weights: https://download.openmmlab.com/mmselfsup/byol/byol_resnet50_16xb256-coslr-200e_in1k_20220527-b6f8eedd.pth
- Name: byol_resnet50_8xb32-accum16-coslr-300e_in1k
In Collection: BYOL
Metadata:
@@ -35,6 +47,6 @@ Models:
- Task: Self-Supervised Image Classification
Dataset: ImageNet-1k
Metrics:
Top 1 Accuracy: 68.55
Config: configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py
Top 1 Accuracy: 72.93
Config: configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-300e_in1k.py
Weights: https://download.openmmlab.com/mmselfsup/byol/byol_resnet50_8xb32-accum16-coslr-300e_in1k_20220225-a0daa54a.pth

@@ -12,22 +12,19 @@ We present a novel masked image modeling (MIM) approach, context autoencoder (CA
<img src="https://user-images.githubusercontent.com/30762564/165459947-6c6ef13c-0593-4765-b44e-6da0a079802a.png" width="40%"/>
</div>
## Prerequisite
Create a new folder ``cae_ckpt`` under the root directory and download the
[weights](https://download.openmmlab.com/mmselfsup/cae/dalle_encoder.pth) for ``dalle`` encoder to that folder
Create a new folder `cae_ckpt` under the root directory and download the
[weights](https://download.openmmlab.com/mmselfsup/cae/dalle_encoder.pth) for `dalle` encoder to that folder
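A minimal sketch of this prerequisite step in Python (the URL and folder name come from the text above; a plain `wget` into `cae_ckpt/` works just as well):

```python
import os
import urllib.request

# create cae_ckpt/ and fetch the dalle encoder weights into it
os.makedirs('cae_ckpt', exist_ok=True)
urllib.request.urlretrieve(
    'https://download.openmmlab.com/mmselfsup/cae/dalle_encoder.pth',
    'cae_ckpt/dalle_encoder.pth')
```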
## Models and Benchmarks
Here, we report the results of the model, which is pre-trained on ImageNet-1k
for 300 epochs; the details are below:
| Backbone | Pre-train epoch | Fine-tuning Top-1 | Pre-train Config | Fine-tuning Config | Download |
| :------: | :-------------: | :---------------: | :-------------------------------------------------: | :---------------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| ViT-B/16 | 300 | 83.2 | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/cae/cae_vit-base-p16_8xb256-fp16-coslr-300e_in1k.py) | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/vit-base-p16_ft-8xb128-coslr-100e-rpe_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/cae/cae_vit-base-p16_16xb256-coslr-300e_in1k-224_20220427-4c786349.pth) &#124; [log](https://download.openmmlab.com/mmselfsup/cae/cae_vit-base-p16_16xb256-coslr-300e_in1k-224_20220427-4c786349.log.json) |
| Backbone | Pre-train epoch | Fine-tuning Top-1 | Pre-train Config | Fine-tuning Config | Download |
| :------: | :-------------: | :---------------: | :-------------------------------------------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| ViT-B/16 | 300 | 83.2 | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/cae/cae_vit-base-p16_8xb256-fp16-coslr-300e_in1k.py) | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/vit-base-p16_ft-8xb128-coslr-100e-rpe_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/cae/cae_vit-base-p16_16xb256-coslr-300e_in1k-224_20220427-4c786349.pth) \| [log](https://download.openmmlab.com/mmselfsup/cae/cae_vit-base-p16_16xb256-coslr-300e_in1k-224_20220427-4c786349.log.json) |
## Citation

@@ -27,18 +27,14 @@ methods that use only ImageNet-1K data. Transfer performance in downstream tasks
<img src="https://user-images.githubusercontent.com/30762564/150733959-2959852a-c7bd-4d3f-911f-3e8d8839fe67.png" width="40%"/>
</div>
## Models and Benchmarks
Here, we report the results of the model, which is pre-trained on ImageNet-1k
for 400 epochs; the details are below:
| Backbone | Pre-train epoch | Fine-tuning Top-1 | Pre-train Config | Fine-tuning Config | Download |
| :------: | :-------------: | :---------------: | :-----------------------------------------------------------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| ViT-B/16 | 400 | 83.1 | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mae/mae_vit-b-p16_8xb512-coslr-400e_in1k.py) | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/vit-base-p16_ft-8xb128-coslr-100e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/mae/mae_vit-base-p16_8xb512-coslr-400e_in1k-224_20220223-85be947b.pth) &#124; [log](https://download.openmmlab.com/mmselfsup/mae/mae_vit-base-p16_8xb512-coslr-300e_in1k-224_20220210_140925.log.json) |
| Backbone | Pre-train epoch | Fine-tuning Top-1 | Pre-train Config | Fine-tuning Config | Download |
| :------: | :-------------: | :---------------: | :-----------------------------------------------------------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| ViT-B/16 | 400 | 83.1 | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mae/mae_vit-b-p16_8xb512-coslr-400e_in1k.py) | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/vit-base-p16_ft-8xb128-coslr-100e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/mae/mae_vit-base-p16_8xb512-coslr-400e_in1k-224_20220223-85be947b.pth) \| [log](https://download.openmmlab.com/mmselfsup/mae/mae_vit-base-p16_8xb512-coslr-300e_in1k-224_20220210_140925.log.json) |
## Citation

@@ -1,4 +1,4 @@
_base_ = 'mae_vit-base-16_8xb512-coslr-400e_in1k.py'
_base_ = 'mae_vit-base-p16_8xb512-coslr-400e_in1k.py'
# schedule
runner = dict(max_epochs=1600)
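The renaming matters because `_base_` inheritance pulls every setting from the parent config and overrides only what is redefined here; a hedged check of the merged result (the child file path is assumed from the naming pattern in this diff):

```python
from mmcv import Config

# after the fix, the 1600-epoch config inherits from the correctly named
# 400-epoch base file and overrides only runner.max_epochs
cfg = Config.fromfile(
    'configs/selfsup/mae/mae_vit-base-p16_8xb512-coslr-1600e_in1k.py')
assert cfg.runner['max_epochs'] == 1600
```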

@@ -1,4 +1,4 @@
_base_ = 'mae_vit-base-16_8xb512-coslr-400e_in1k.py'
_base_ = 'mae_vit-base-p16_8xb512-coslr-400e_in1k.py'
# schedule
runner = dict(max_epochs=800)

@@ -74,9 +74,9 @@ Please refer to [faster_rcnn_r50_c4_mstrain_24k_voc0712.py](https://github.com/o
Please refer to [mask_rcnn_r50_fpn_mstrain_1x_coco.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py) for details of config.
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb64-steplr-70e]([relative-loc_resnet50_8xb64-steplr-70e_in1k.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py)) | 37.5 | 56.2 | 41.3 | 33.7 | 53.3 | 36.1 |
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb64-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | 37.5 | 56.2 | 41.3 | 33.7 | 53.3 | 36.1 |
### Segmentation

@@ -66,17 +66,17 @@ The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007
Please refer to [faster_rcnn_r50_c4_mstrain_24k_voc0712.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712.py) for details of config.
| Self-Supervised Config | AP50 |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb16-steplr-70e]([rotation-pred_resnet50_8xb16-steplr-70e_in1k.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py)) | 79.67 |
| Self-Supervised Config | AP50 |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb16-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | 79.67 |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x_coco.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py) for details of config.
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb16-steplr-70e]([rotation-pred_resnet50_8xb16-steplr-70e_in1k.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py)) | 37.9 | 56.5 | 41.5 | 34.2 | 53.9 | 36.7 |
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb16-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | 37.9 | 56.5 | 41.5 | 34.2 | 53.9 | 36.7 |
### Segmentation
@@ -86,9 +86,9 @@ The segmentation benchmarks include 2 downstream task datasets, **Cityscapes**
Please refer to [fcn_r50-d8_512x512_20k_voc12aug.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) for details of config.
| Self-Supervised Config | mIOU |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb16-steplr-70e]([rotation-pred_resnet50_8xb16-steplr-70e_in1k.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py)) | 64.31 |
| Self-Supervised Config | mIOU |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb16-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | 64.31 |
## Citation

@@ -12,18 +12,14 @@ This paper presents SimMIM, a simple framework for masked image modeling. We sim
<img src="https://user-images.githubusercontent.com/30762564/159404597-ac6d3a44-ee59-4cdc-8f6f-506a7d1b18b6.png" width="40%"/>
</div>
## Models and Benchmarks
Here, we report the results of the model, and more results will be coming soon.
| Backbone | Pre-train epoch | Pre-train resolution|Fine-tuning resolution|Fine-tuning Top-1 | Pre-train Config | Fine-tuning Config | Download |
| :------: | :-------------: | :---------------: | :---------------: |:---------------: |:-------------------------------------------------: | :---------------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| Swin-Base | 100 | 192| 192| 82.9 | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192.py) | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/swin-base_ft-8xb256-coslr-100e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192_20220316-1d090125.pth) &#124; [log](https://download.openmmlab.com/mmselfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192_20220316-1d090125.log.json) |
| Swin-Base | 100 | 192| 224| 83.5 | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192.py) | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/swin-base_ft-8xb256-coslr-100e_in1k-224.py) | [model](https://download.openmmlab.com/mmselfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192_20220316-1d090125.pth) &#124; [log](https://download.openmmlab.com/mmselfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192_20220316-1d090125.log.json) |
| Backbone | Pre-train epoch | Pre-train resolution | Fine-tuning resolution | Fine-tuning Top-1 | Pre-train Config | Fine-tuning Config | Download |
| :-------: | :-------------: | :------------------: | :--------------------: | :---------------: | :----------------------------------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| Swin-Base | 100 | 192 | 192 | 82.9 | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192.py) | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/swin-base_ft-8xb256-coslr-100e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192_20220316-1d090125.pth) \| [log](https://download.openmmlab.com/mmselfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192_20220316-1d090125.log.json) |
| Swin-Base | 100 | 192 | 224 | 83.5 | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192.py) | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/swin-base_ft-8xb256-coslr-100e_in1k-224.py) | [model](https://download.openmmlab.com/mmselfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192_20220316-1d090125.pth) \| [log](https://download.openmmlab.com/mmselfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192_20220316-1d090125.log.json) |
## Citation

@@ -288,7 +288,7 @@
"id": "86e6d589"
},
"source": [
"Besides, you can also use pip to install the packages, but you are supposed to check the pytorch and cuda version mannually. The example commmand is provided below, but you need to modify it according to your PyTorch and CUDA version."
"Besides, you can also use pip to install the packages, but you are supposed to check the pytorch and cuda version manually. The example command is provided below, but you need to modify it according to your PyTorch and CUDA version."
]
},
{

configs/selfsup/byol/README.md

@@ -1,19 +1,99 @@
# BYOL
## Bootstrap your own latent: A new approach to self-supervised Learning
> [Bootstrap your own latent: A new approach to self-supervised Learning](https://arxiv.org/abs/2006.07733)
<!-- [ABSTRACT] -->
<!-- [ALGORITHM] -->
## Abstract
**B**ootstrap **Y**our **O**wn **L**atent (BYOL) is a new approach to self-supervised image representation learning. BYOL relies on two neural networks, referred to as online and target networks, that interact and learn from each other. From an augmented view of an image, we train the online network to predict the target network representation of the same image under a different augmented view. At the same time, we update the target network with a slow-moving average of the online network.
<!-- [IMAGE] -->
<div align="center">
<img />
<img src="https://user-images.githubusercontent.com/36138628/149720208-5ffbee78-1437-44c7-9ddb-b8caab60d2c3.png" width="800" />
</div>
## Citation
## Results and Models
<!-- [ALGORITHM] -->
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates which layer's feature map yields the best result. For example, if the **Best Layer** is **feature3**, the best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of the low-shot SVM.
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [resnet50_8xb32-accum16-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | feature5 | 86.31 | 45.37 | 56.83 | 68.47 | 74.12 | 78.30 | 81.53 | 83.56 | 84.73 |
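The feature-to-stage naming can be spelled out as a simple mapping (illustrative, derived from the description above):

```python
# featureN -> ResNet block, per the convention described above
best_layer_to_stage = {
    'feature1': 'stem layer',
    'feature2': 'stage 1',
    'feature3': 'stage 2',
    'feature4': 'stage 3',
    'feature5': 'stage 4',
}
```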
#### ImageNet Linear Evaluation
**Feature1 - Feature5** are not passed through GlobalAveragePooling; each feature map is pooled to specific dimensions and then fed to a linear layer for classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of the config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb512-coslr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb512-coslr-90e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb32-accum16-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | 15.16 | 35.26 | 47.77 | 63.10 | 71.21 | 71.72 |
| [resnet50_16xb256-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_16xb256-coslr-200e_in1k.py) | 15.41 | 35.15 | 47.77 | 62.59 | 71.85 | 71.88 |
#### Places205 Linear Evaluation
**Feature1 - Feature5** are not passed through GlobalAveragePooling; each feature map is pooled to specific dimensions and then fed to a linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of the config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -------- | -------- | -------- | -------- |
| [resnet50_8xb32-accum16-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | 21.25 | 36.55 | 43.66 | 50.74 | 53.82 |
| [resnet50_8xb32-accum16-coslr-300e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-300e_in1k.py) | 21.18 | 36.68 | 43.42 | 51.04 | 54.06 |
#### ImageNet Nearest-Neighbor Classification
The results are obtained from the features after GlobalAveragePooling. Here, k=10 to 200 indicates different numbers of nearest neighbors.
| Self-Supervised Config | k=10 | k=20 | k=100 | k=200 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---- | ---- | ----- | ----- |
| [resnet50_8xb32-accum16-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | 63.9 | 64.2 | 62.9 | 61.9 |
| [resnet50_8xb32-accum16-coslr-300e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-300e_in1k.py) | 66.1 | 66.3 | 65.2 | 64.4 |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k_voc0712.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712.py) for details of config.
| Self-Supervised Config | AP50 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----- |
| [resnet50_8xb32-accum16-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | 80.35 |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x_coco.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py) for details of config.
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb32-accum16-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | 40.9 | 61.0 | 44.6 | 36.8 | 58.1 | 39.5 |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. It follows the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [fcn_r50-d8_512x512_20k_voc12aug.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) for details of config.
| Self-Supervised Config | mIOU |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----- |
| [resnet50_8xb32-accum16-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | 67.16 |
## Citation
```bibtex
@inproceedings{grill2020bootstrap,
@@ -23,103 +103,3 @@
year={2020}
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates which layer's feature map yields the best result. For example, if the **Best Layer** is **feature3**, the best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of the low-shot SVM.
| Model | Config setting | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | ----------------------------------------------------------------------------------- | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [model]() | [resnet50_8xb32-accum16-coslr-200e](byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | feature3 | 86.31 | 45.37 | 56.83 | 68.47 | 74.12 | 78.30 | 81.53 | 83.56 | 84.73 |
### Classification
The classification benchmarks include 3 downstream task datasets, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### ImageNet Linear Evaluation
**Feature1 - Feature5** are not passed through GlobalAveragePooling; each feature map is pooled to specific dimensions and then fed to a linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-90e.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of the config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [{file name}]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ----------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-accum16-coslr-200e](byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | | | | | | 67.68 |
#### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [{file name}]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ----------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-accum16-coslr-200e](byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | | | | | | |
#### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-28e_places205.py) and [{file name}]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ----------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-accum16-coslr-200e](byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | | | | | | |
#### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`. We choose the best performing setting for each method. The parameter settings are indicated in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier is indicated like `head1`, `head10`, `head100`.
- Please use --deterministic in this benchmark.
Please refer to the directories `configs/benchmarks/classification/imagenet/imagenet_1percent/` for 1% data and `configs/benchmarks/classification/imagenet/imagenet_10percent/` for 10% data for details.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | ----------------------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb32-accum16-coslr-200e](byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | | | |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | ----------------------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb32-accum16-coslr-200e](byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | ----------------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb32-accum16-coslr-200e](byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. It follows the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [{file name}]() for details of config.
| Model | Config | mIOU |
| --------- | ----------------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-accum16-coslr-200e](byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | |
#### Cityscapes
Please refer to [{file name}]() for details of config.
| Model | Config | mIOU |
| --------- | ----------------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-accum16-coslr-200e](byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | |

@@ -12,22 +12,19 @@ We present a novel masked image modeling (MIM) approach, context autoencoder (CA
<img src="https://user-images.githubusercontent.com/30762564/165459947-6c6ef13c-0593-4765-b44e-6da0a079802a.png" width="40%"/>
</div>
## Prerequisite
Create a new folder ``cae_ckpt`` under the root directory and download the
[weights](https://download.openmmlab.com/mmselfsup/cae/dalle_encoder.pth) for ``dalle`` encoder to that folder
Create a new folder `cae_ckpt` under the root directory and download the
[weights](https://download.openmmlab.com/mmselfsup/cae/dalle_encoder.pth) for `dalle` encoder to that folder
## Models and Benchmarks
Here, we report the results of the model, which is pre-trained on ImageNet-1k
for 300 epochs; the details are below:
| Backbone | Pre-train epoch | Fine-tuning Top-1 | Pre-train Config | Fine-tuning Config | Download |
| :------: | :-------------: | :---------------: | :-------------------------------------------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| ViT-B/16 | 300 | 83.2 | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/cae/cae_vit-base-p16_8xb256-fp16-coslr-300e_in1k.py) | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/vit-base-p16_ft-8xb128-coslr-100e-rpe_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/cae/cae_vit-base-p16_16xb256-coslr-300e_in1k-224_20220427-4c786349.pth) \| [log](https://download.openmmlab.com/mmselfsup/cae/cae_vit-base-p16_16xb256-coslr-300e_in1k-224_20220427-4c786349.log.json) |
## Citation
@ -1,19 +1,56 @@
# DeepCluster
> [Deep Clustering for Unsupervised Learning of Visual Features](https://arxiv.org/abs/1807.05520)
<!-- [ALGORITHM] -->
## Abstract
Clustering is a class of unsupervised learning methods that has been extensively applied and studied in computer vision. Little work has been done to adapt it to the end-to-end training of visual features on large scale datasets. In this work, we present DeepCluster, a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features. DeepCluster iteratively groups the features with a standard clustering algorithm, k-means, and uses the subsequent assignments as supervision to update the weights of the network.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149720586-5bfd213e-0638-47fc-b48a-a16689190e17.png" width="700" />
</div>
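As a rough sketch of the alternation described in the abstract (not the repository's implementation), one round clusters the current features with k-means and then trains against the resulting pseudo-labels. Here `backbone`, `classifier`, `loader`, and `optimizer` are assumed to be set up elsewhere, and the loader is assumed to be un-shuffled and to yield plain image tensors, so feature order matches label order.

```python
# Rough sketch of one DeepCluster round: k-means on frozen features,
# then supervised updates against the cluster assignments.
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans

def deepcluster_round(backbone, classifier, loader, optimizer, n_clusters=100):
    backbone.eval()
    with torch.no_grad():  # step 1: extract features for the whole dataset
        feats = torch.cat([backbone(x) for x in loader])
    labels = torch.as_tensor(  # step 2: cluster features into pseudo-labels
        KMeans(n_clusters=n_clusters, n_init=10).fit_predict(feats.numpy()))
    backbone.train()
    for i, x in enumerate(loader):  # step 3: train on the pseudo-labels
        target = labels[i * x.size(0):(i + 1) * x.size(0)]
        loss = F.cross_entropy(classifier(backbone(x)), target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```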
## Results and Models
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates from which layer's feature map the best results are obtained. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of the Low-shot SVM.
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [sobel_resnet50_8xb64-steplr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/deepcluster/deepcluster-sobel_resnet50_8xb64-steplr-200e_in1k.py) | feature5 | 74.26 | 29.37 | 37.99 | 45.85 | 55.57 | 62.48 | 66.15 | 70.00 | 71.37 |
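As a rough illustration of this protocol (the actual benchmark trains one-vs-rest SVMs on VOC and reports mAP), the low-shot setting amounts to fitting a linear SVM on k labeled samples per class over frozen backbone features. The arrays below are random stand-ins.

```python
# Rough sketch of low-shot SVM evaluation on frozen features.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
features = rng.normal(size=(2000, 2048))  # stand-in for extracted features
labels = rng.integers(0, 20, size=2000)   # stand-in for class labels

k = 8  # low-shot setting: k labeled samples per class
idx = np.concatenate([np.flatnonzero(labels == c)[:k] for c in np.unique(labels)])
clf = LinearSVC(C=1.0).fit(features[idx], labels[idx])
print('accuracy: %.2f' % clf.score(features, labels))
```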
#### ImageNet Linear Evaluation
**Feature1 - Feature5** don't have GlobalAveragePooling; the feature map is pooled to specific dimensions and then followed by a linear layer to do the classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb32-steplr-100e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb32-steplr-100e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -------- | -------- | -------- | -------- | ------- |
| [sobel_resnet50_8xb64-steplr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/deepcluster/deepcluster-sobel_resnet50_8xb64-steplr-200e_in1k.py) | 12.78 | 30.81 | 43.88 | 57.71 | 51.68 | 46.92 |
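Below is a minimal PyTorch sketch of the protocol described above; shapes are illustrative, and the linked configs define the exact pooled dimensions per stage.

```python
# Minimal sketch: pool a frozen intermediate feature map to a fixed
# spatial size, flatten it, and train only a linear classifier on top.
import torch
import torch.nn as nn

feature4 = torch.randn(32, 1024, 14, 14)  # frozen stage-4 feature map
pool = nn.AdaptiveAvgPool2d((2, 2))       # pool to a small fixed size
head = nn.Linear(1024 * 2 * 2, 1000)      # linear head over 1000 classes

logits = head(pool(feature4).flatten(1))
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 1000, (32,)))
loss.backward()  # gradients reach only the head; the feature map is frozen
```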
#### Places205 Linear Evaluation
**Feature1 - Feature5** don't have GlobalAveragePooling; the feature map is pooled to specific dimensions and then followed by a linear layer to do the classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -------- | -------- | -------- | -------- |
| [sobel_resnet50_8xb64-steplr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/deepcluster/deepcluster-sobel_resnet50_8xb64-steplr-200e_in1k.py) | 18.80 | 33.93 | 41.44 | 47.22 | 42.61 |
## Citation
```bibtex
@inproceedings{caron2018deep,
@ -23,99 +60,3 @@ Clustering is a class of unsupervised learning methods that has been extensively
year={2018}
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet-1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates from which layer's feature map the best results are obtained. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of the Low-shot SVM.
| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | ---------------------------------------------------------------------------- | ---------- | --- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | | | | | | | | | |
### ImageNet Linear Evaluation
**Feature1 - Feature5** don't have GlobalAveragePooling; the feature map is pooled to specific dimensions and then followed by a linear layer to do the classification. Please refer to [resnet50_mhead_8xb32-steplr-90e.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | | | | | |
### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | | | | | |
### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | | | | | |
#### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`. We choose the best-performing setting for each method. The parameter settings are indicated in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier is indicated like `head1`, `head10`, `head100` (see the illustrative fragment after the table below).
- Please use `--deterministic` in this benchmark.
Please refer to the directories `configs/benchmarks/classification/imagenet/imagenet_1percent/` for the 1% data and `configs/benchmarks/classification/imagenet/imagenet_10percent/` for the 10% data for details.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | ---------------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | | |
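The fragment below is a hypothetical config, only to illustrate how a file named like `resnet50_head100_..._steplr1e-2...` would encode its setting; the exact keys are defined by the benchmark configs in the directories above.

```python
# Hypothetical fragment: base lr 1e-2 ('steplr1e-2') with a 100x
# learning-rate multiplier on the linear head ('head100').
optimizer = dict(
    type='SGD',
    lr=0.01,
    momentum=0.9,
    weight_decay=1e-4,
    paramwise_options={'head.': dict(lr_mult=100)})
```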
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | ---------------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | ---------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. It follows the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | |
#### Cityscapes
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | |
@ -1,19 +1,96 @@
# DenseCL
> [Dense Contrastive Learning for Self-Supervised Visual Pre-Training](https://arxiv.org/abs/2011.09157)
<!-- [ALGORITHM] -->
## Abstract
To date, most existing self-supervised learning methods are designed and optimized for image classification. These pre-trained models can be sub-optimal for dense prediction tasks due to the discrepancy between image-level prediction and pixel-level prediction. To fill this gap, we aim to design an effective, dense self-supervised learning method that directly works at the level of pixels (or local features) by taking into account the correspondence between local features. We present dense contrastive learning (DenseCL), which implements self-supervised learning by optimizing a pairwise contrastive (dis)similarity loss at the pixel level between two views of input images.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149721111-bab03a6d-a30d-418e-b338-43c3689cfc65.png" width="900" />
</div>
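Below is a toy sketch of the pixel-level idea in the abstract: each local feature of one view is paired with its most similar local feature in the other view, and a contrastive loss is applied per pixel. This is a simplification; the actual DenseCL also uses a momentum encoder and a queue of negatives.

```python
# Toy sketch of a dense (pixel-level) contrastive matching between views.
import torch
import torch.nn.functional as F

q = F.normalize(torch.randn(2, 128, 7, 7).flatten(2), dim=1)  # B x C x 49
k = F.normalize(torch.randn(2, 128, 7, 7).flatten(2), dim=1)  # B x C x 49

sim = torch.einsum('bci,bcj->bij', q, k)  # similarity of every local pair
pos = sim.argmax(dim=2)                   # best-matching k-pixel per q-pixel
loss = F.cross_entropy(sim.flatten(0, 1) / 0.2, pos.flatten())
print(loss.item())
```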
## Results and Models
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates from which layer's feature map the best results are obtained. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of the Low-shot SVM.
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | ---- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k.py) | feature5 | 82.5 | 42.68 | 50.64 | 61.74 | 68.17 | 72.99 | 76.07 | 79.19 | 80.55 |
#### ImageNet Linear Evaluation
**Feature1 - Feature5** don't have GlobalAveragePooling; the feature map is pooled to specific dimensions and then followed by a linear layer to do the classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb32-steplr-100e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb32-steplr-100e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k.py) | 15.86 | 35.47 | 49.46 | 64.06 | 62.95 | 63.34 |
#### Places205 Linear Evaluation
**Feature1 - Feature5** don't have GlobalAveragePooling; the feature map is pooled to specific dimensions and then followed by a linear layer to do the classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k.py) | 21.32 | 36.20 | 43.97 | 51.04 | 50.45 |
#### ImageNet Nearest-Neighbor Classification
The results are obtained from the features after GlobalAveragePooling. Here, k=10 to 200 indicates different numbers of nearest neighbors.
| Self-Supervised Config | k=10 | k=20 | k=100 | k=200 |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | ---- | ---- | ----- | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k.py) | 48.2 | 48.5 | 46.8 | 45.6 |
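A simplified sketch of this evaluation follows: cosine similarity over L2-normalized pooled features, with a plain majority vote among the k nearest training samples (the actual benchmark may additionally weight the votes by similarity).

```python
# Simplified k-NN classification on L2-normalized pooled features.
import torch
import torch.nn.functional as F

train_feats = F.normalize(torch.randn(1000, 2048), dim=1)
train_labels = torch.randint(0, 10, (1000,))
test_feats = F.normalize(torch.randn(8, 2048), dim=1)

k = 20
sim = test_feats @ train_feats.t()       # cosine similarity to the train set
nn_idx = sim.topk(k, dim=1).indices      # indices of the k nearest samples
pred = train_labels[nn_idx].mode(dim=1).values  # majority vote
print(pred)
```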
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k_voc0712.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712.py) for details of config.
| Self-Supervised Config | AP50 |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k.py) | 82.14 |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x_coco.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py) for details of config.
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. It follows the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [fcn_r50-d8_512x512_20k_voc12aug.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) for details of config.
| Self-Supervised Config | mIOU |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k.py) | 69.47 |
## Citation
```bibtex
@inproceedings{wang2021dense,
@ -23,99 +100,3 @@ To date, most existing self-supervised learning methods are designed and optimiz
year={2021}
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet-1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates from which layer's feature map the best results are obtained. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of the Low-shot SVM.
| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | ---------------------------------------------------------------------- | ---------- | --- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [resnet50_8xb32-coslr-200e](densecl_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | | | | | |
### ImageNet Linear Evaluation
**Feature1 - Feature5** don't have GlobalAveragePooling; the feature map is pooled to specific dimensions and then followed by a linear layer to do the classification. Please refer to [resnet50_mhead_8xb32-steplr-90e.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-coslr-200e](densecl_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-coslr-200e](densecl_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-coslr-200e](densecl_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
#### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`. We choose the best-performing setting for each method. The parameter settings are indicated in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier is indicated like `head1`, `head10`, `head100`.
- Please use `--deterministic` in this benchmark.
Please refer to the directories `configs/benchmarks/classification/imagenet/imagenet_1percent/` for the 1% data and `configs/benchmarks/classification/imagenet/imagenet_10percent/` for the 10% data for details.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | ---------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb32-coslr-200e](densecl_resnet50_8xb32-coslr-200e_in1k.py) | | | |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | ---------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb32-coslr-200e](densecl_resnet50_8xb32-coslr-200e_in1k.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | ---------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb32-coslr-200e](densecl_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. It follows the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-coslr-200e](densecl_resnet50_8xb32-coslr-200e_in1k.py) | |
#### Cityscapes
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-coslr-200e](densecl_resnet50_8xb32-coslr-200e_in1k.py) | |
@ -27,18 +27,14 @@ methods that use only ImageNet-1K data. Transfer performance in downstream tasks
<img src="https://user-images.githubusercontent.com/30762564/150733959-2959852a-c7bd-4d3f-911f-3e8d8839fe67.png" width="40%"/>
</div>
## Models and Benchmarks
Here, we report the results of the model, which is pre-trained on ImageNet-1k for 400 epochs. The details are below:
| Backbone | Pre-train epoch | Fine-tuning Top-1 | Pre-train Config | Fine-tuning Config | Download |
| :------: | :-------------: | :---------------: | :-----------------------------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| ViT-B/16 | 400 | 83.1 | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mae/mae_vit-b-p16_8xb512-coslr-400e_in1k.py) | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/vit-b-p16_ft-8xb128-coslr-100e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/mae/mae_vit-base-p16_8xb512-coslr-400e_in1k-224_20220223-85be947b.pth) \| [log](https://download.openmmlab.com/mmselfsup/mae/mae_vit-base-p16_8xb512-coslr-300e_in1k-224_20220210_140925.log.json) |
## Citation
@ -1,43 +1,96 @@
# MoCo v2
> [Improved Baselines with Momentum Contrastive Learning](https://arxiv.org/abs/2003.04297)
<!-- [ALGORITHM] -->
## Momentum Contrast for Unsupervised Visual Representation Learning (MoCo v1)
We present Momentum Contrast (MoCo) for unsupervised visual representation learning. From a perspective on contrastive learning as dictionary look-up, we build a dynamic dictionary with a queue and a moving-averaged encoder. This enables building a large and consistent dictionary on-the-fly that facilitates contrastive unsupervised learning. MoCo provides competitive results under the common linear protocol on ImageNet classification. More importantly, the representations learned by MoCo transfer well to downstream tasks.
## Citation
```bibtex
@inproceedings{he2020momentum,
title={Momentum contrast for unsupervised visual representation learning},
author={He, Kaiming and Fan, Haoqi and Wu, Yuxin and Xie, Saining and Girshick, Ross},
booktitle={CVPR},
year={2020}
}
```
## Improved Baselines with Momentum Contrastive Learning (MoCo v2)
## Abstract
Contrastive unsupervised learning has recently shown encouraging progress, e.g., in Momentum Contrast (MoCo) and SimCLR. In this note, we verify the effectiveness of two of SimCLR's design improvements by implementing them in the MoCo framework. With simple modifications to MoCo—namely, using an MLP projection head and more data augmentation—we establish stronger baselines that outperform SimCLR and do not require large training batches. We hope this will make state-of-the-art unsupervised learning research more accessible.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149720067-b65e5736-d425-45b3-93ed-6f2427fc6217.png" width="500" />
</div>
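Below is a minimal sketch of the two MoCo ingredients that v2 keeps: the momentum-updated key encoder and the feature queue. `encoder_q` and `encoder_k` are assumed to be two identically-shaped networks.

```python
# Minimal sketch of MoCo's momentum update and key-queue upkeep.
import torch

@torch.no_grad()
def momentum_update(encoder_q, encoder_k, m=0.999):
    """Key encoder follows the query encoder with momentum m."""
    for p_q, p_k in zip(encoder_q.parameters(), encoder_k.parameters()):
        p_k.data.mul_(m).add_(p_q.data, alpha=1 - m)

@torch.no_grad()
def update_queue(queue, keys):
    """Prepend the new keys (B x C) and drop the oldest; queue is C x K."""
    return torch.cat([keys.t(), queue], dim=1)[:, :queue.shape[1]]
```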
## Results and Models
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates from which layer's feature map the best results are obtained. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of the Low-shot SVM.
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py) | feature5 | 84.04 | 43.14 | 53.29 | 65.34 | 71.03 | 75.42 | 78.48 | 80.88 | 82.23 |
#### ImageNet Linear Evaluation
**Feature1 - Feature5** don't have GlobalAveragePooling; the feature map is pooled to specific dimensions and then followed by a linear layer to do the classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb32-steplr-100e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb32-steplr-100e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py) | 15.96 | 34.22 | 45.78 | 61.11 | 66.24 | 67.58 |
#### Places205 Linear Evaluation
**Feature1 - Feature5** don't have GlobalAveragePooling; the feature map is pooled to specific dimensions and then followed by a linear layer to do the classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -------- | -------- | -------- | -------- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py) | 20.92 | 35.72 | 42.62 | 49.79 | 52.25 |
#### ImageNet Nearest-Neighbor Classification
The results are obtained from the features after GlobalAveragePooling. Here, k=10 to 200 indicates different numbers of nearest neighbors.
| Self-Supervised Config | k=10 | k=20 | k=100 | k=200 |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | ---- | ---- | ----- | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py) | 55.6 | 55.7 | 53.7 | 52.5 |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k_voc0712.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712.py) for details of config.
| Self-Supervised Config | AP50 |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py) | 81.06 |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x_coco.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py) for details of config.
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py) | 40.2 | 59.7 | 44.2 | 36.1 | 56.7 | 38.8 |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. It follows the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [fcn_r50-d8_512x512_20k_voc12aug.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) for details of config.
| Self-Supervised Config | mIOU |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py) | 67.55 |
## Citation
```bibtex
@article{chen2020improved,
@ -47,99 +100,3 @@ Contrastive unsupervised learning has recently shown encouraging progress, e.g.,
year={2020}
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet-1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates from which layer's feature map the best results are obtained. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of the Low-shot SVM.
| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | ---------------------------------------------------------------------------- | ---------- | ----- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [mocov2_resnet50_8xb32-coslr-200e](mocov2_resnet50_8xb32-coslr-200e_in1k.py) | feature5 | 84.04 | | | | | | | | |
### ImageNet Linear Evaluation
**Feature1 - Feature5** don't have GlobalAveragePooling; the feature map is pooled to specific dimensions and then followed by a linear layer to do the classification. Please refer to [resnet50_mhead_8xb32-steplr-90e.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [mocov2_resnet50_8xb32-coslr-200e](mocov2_resnet50_8xb32-coslr-200e_in1k.py) | 15.96 | 34.22 | 45.78 | 61.11 | 66.24 | 67.56 |
### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [mocov2_resnet50_8xb32-coslr-200e](mocov2_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [mocov2_resnet50_8xb32-coslr-200e](mocov2_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
#### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`. We choose the best-performing setting for each method. The parameter settings are indicated in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier is indicated like `head1`, `head10`, `head100`.
- Please use `--deterministic` in this benchmark.
Please refer to the directories `configs/benchmarks/classification/imagenet/imagenet_1percent/` for the 1% data and `configs/benchmarks/classification/imagenet/imagenet_10percent/` for the 10% data for details.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | ---------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------- | --------- |
| [model]() | [mocov2_resnet50_8xb32-coslr-200e](mocov2_resnet50_8xb32-coslr-200e_in1k.py) | [resnet50_head100_4xb64-steplr1e-2-20e_in1k-1pct.py](../../benchmarks/classification/imagenet/imagenet_1percent/resnet50_head100_4xb64-steplr1e-2-20e_in1k-1pct.py) | 38.97 | 67.69 |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | ---------------------------------------------------------------------------- | --- | ---- |
| [model]() | [mocov2_resnet50_8xb32-coslr-200e](mocov2_resnet50_8xb32-coslr-200e_in1k.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | ---------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [mocov2_resnet50_8xb32-coslr-200e](mocov2_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. It follows the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------------- | ---- |
| [model]() | [mocov2_resnet50_8xb32-coslr-200e](mocov2_resnet50_8xb32-coslr-200e_in1k.py) | |
#### Cityscapes
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------------- | ---- |
| [model]() | [mocov2_resnet50_8xb32-coslr-200e](mocov2_resnet50_8xb32-coslr-200e_in1k.py) | |
@ -16,19 +16,19 @@ This paper does not describe a novel method. Instead, it studies a straightforwa
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### ImageNet Linear Evaluation
The **Linear Evaluation** result is obtained by training a linear head upon the pre-trained backbone. Please refer to [vit-small-p16_8xb128-coslr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/vit-small-p16_8xb128-coslr-90e_in1k.py) for details of config.
| Self-Supervised Config | Linear Evaluation |
| --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------- |
| [vit-small-p16_linear-32xb128-fp16-coslr-300e_in1k-224](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov3/mocov3_vit-small-p16_linear-32xb128-fp16-coslr-300e_in1k-224.py) | 73.19 |
## Citation
@ -1,8 +1,10 @@
# NPID
> [Unsupervised Feature Learning via Non-Parametric Instance Discrimination](https://arxiv.org/abs/1805.01978)
<!-- [ALGORITHM] -->
## Abstract
Neural net classifiers trained on data with annotated class labels can also capture apparent visual similarity among categories without being directed to do so. We study whether this observation can be extended beyond the conventional domain of supervised learning: Can we learn a good feature representation that captures apparent similarity among instances, instead of classes, by merely asking the feature to be discriminative of individual instances?
@ -10,14 +12,89 @@ We formulate this intuition as a non-parametric classification problem at the in
Our method is also remarkable for consistently improving test performance with more training data and better network architectures. By fine-tuning the learned feature, we further obtain competitive results for semi-supervised learning and object detection tasks. Our non-parametric model is highly compact: With 128 features per image, our method requires only 600MB storage for a million images, enabling fast nearest neighbour retrieval at the run time.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149722257-1651c283-ac68-4cdc-90e6-970d820529af.png" width="800" />
</div>
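Below is a toy sketch of the non-parametric instance classifier at the core of NPID: every image is its own class, logits are similarities against a memory bank of 128-d features, and the bank is refreshed with a momentum average.

```python
# Toy sketch of non-parametric instance discrimination with a memory bank.
import torch
import torch.nn.functional as F

num_images, dim, tau = 1000, 128, 0.07
memory = F.normalize(torch.randn(num_images, dim), dim=1)  # one slot per image

feats = F.normalize(torch.randn(16, dim), dim=1)  # embeddings of one batch
idx = torch.randint(0, num_images, (16,))         # the images' own indices
logits = feats @ memory.t() / tau                 # similarity to every instance
loss = F.cross_entropy(logits, idx)               # each image is its own class

with torch.no_grad():  # refresh the bank entries with a momentum average
    memory[idx] = F.normalize(0.5 * memory[idx] + 0.5 * feats, dim=1)
```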
## Results and Models
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates from which layer's feature map the best results are obtained. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of the Low-shot SVM.
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| ---------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [resnet50_8xb32-steplr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k.py) | feature5 | 76.75 | 26.96 | 35.37 | 44.48 | 53.89 | 60.39 | 66.41 | 71.48 | 73.39 |
#### ImageNet Linear Evaluation
**Feature1 - Feature5** don't have GlobalAveragePooling; the feature map is pooled to specific dimensions and then followed by a linear layer to do the classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb32-steplr-100e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb32-steplr-100e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| ---------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb32-steplr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k.py) | 14.68 | 31.98 | 42.85 | 56.95 | 58.41 | 57.97 |
#### Places205 Linear Evaluation
**Feature1 - Feature5** don't have GlobalAveragePooling; the feature map is pooled to specific dimensions and then followed by a linear layer to do the classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| ---------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- |
| [resnet50_8xb32-steplr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k.py) | 19.98 | 34.86 | 41.59 | 48.43 | 48.71 |
#### ImageNet Nearest-Neighbor Classification
The results are obtained from the features after GlobalAveragePooling. Here, k=10 to 200 indicates different numbers of nearest neighbors.
| Self-Supervised Config | k=10 | k=20 | k=100 | k=200 |
| ---------------------------------------------------------------------------------------------------------------------------------------------- | ---- | ---- | ----- | ----- |
| [resnet50_8xb32-steplr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k.py) | 42.9 | 44.0 | 43.2 | 42.2 |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k_voc0712.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712.py) for details of config.
| Self-Supervised Config | AP50 |
| ---------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb32-steplr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k.py) | 79.52 |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x_coco.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py) for details of config.
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| ---------------------------------------------------------------------------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb32-steplr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k.py) | 38.5 | 57.7 | 42.0 | 34.6 | 54.8 | 37.1 |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. It follows the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [fcn_r50-d8_512x512_20k_voc12aug.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) for details of config.
| Self-Supervised Config | mIOU |
| ---------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb32-steplr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k.py) | 65.45 |
## Citation
```bibtex
@inproceedings{wu2018unsupervised,
@ -27,99 +104,3 @@ Our method is also remarkable for consistently improving test performance with m
year={2018}
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet-1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates from which layer's feature map the best results are obtained. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of the Low-shot SVM.
| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | --------------------------------------------------------------------- | ---------- | --- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [resnet50_8xb32-steplr-200e](npid_resnet50_8xb32-steplr-200e_in1k.py) | | | | | | | | | | |
### ImageNet Linear Evaluation
**Feature1 - Feature5** don't have GlobalAveragePooling; the feature map is pooled to specific dimensions and then followed by a linear layer to do the classification. Please refer to [resnet50_mhead_8xb32-steplr-90e.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | --------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-steplr-200e](npid_resnet50_8xb32-steplr-200e_in1k.py) | | | | | | |
### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | --------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-steplr-200e](npid_resnet50_8xb32-steplr-200e_in1k.py) | | | | | | |
### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | --------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-steplr-200e](npid_resnet50_8xb32-steplr-200e_in1k.py) | | | | | | |
### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`, and choose the best performing setting for each method. The parameter settings are indicated in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier is indicated like `head1`, `head10`, `head100`.
- Please use `--deterministic` in this benchmark.
Please refer to the directories `configs/benchmarks/classification/imagenet/imagenet_1percent/` (1% data) and `configs/benchmarks/classification/imagenet/imagenet_10percent/` (10% data) for details.
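The grid of settings above can be sketched in plain PyTorch as follows; the tiny modules and the momentum value are stand-ins, not the benchmark's fine-tuning code.

```python
import itertools
import torch

backbone = torch.nn.Conv2d(3, 8, 3)   # stand-in for the fine-tuned CNN
head = torch.nn.Linear(8, 1000)       # stand-in linear classification head

for base_lr, head_mult in itertools.product([0.001, 0.01, 0.1], [1, 10, 100]):
    optimizer = torch.optim.SGD(
        [{'params': backbone.parameters(), 'lr': base_lr},
         {'params': head.parameters(), 'lr': base_lr * head_mult}],
        lr=base_lr, momentum=0.9)
    # File-name style tag, e.g. '1e-2_head10', matching the convention above.
    tag = f"{base_lr:.0e}_head{head_mult}".replace('e-0', 'e-')
    print(tag, [group['lr'] for group in optimizer.param_groups])
```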
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | --------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb32-steplr-200e](npid_resnet50_8xb32-steplr-200e_in1k.py) | | | |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | --------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb32-steplr-200e](npid_resnet50_8xb32-steplr-200e_in1k.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | --------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb32-steplr-200e](npid_resnet50_8xb32-steplr-200e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | --------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-steplr-200e](npid_resnet50_8xb32-steplr-200e_in1k.py) | |
#### Cityscapes
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | --------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-steplr-200e](npid_resnet50_8xb32-steplr-200e_in1k.py) | |
# ODC
> [Online Deep Clustering for Unsupervised Representation Learning](https://arxiv.org/abs/2006.10645)
<!-- [ABSTRACT] -->
<!-- [ALGORITHM] -->
## Abstract
Joint clustering and feature learning methods have shown remarkable performance in unsupervised representation learning. However, the training schedule alternating between feature clustering and network parameters update leads to unstable learning of visual representations. To overcome this challenge, we propose Online Deep Clustering (ODC) that performs clustering and network update simultaneously rather than alternatingly. Our key insight is that the cluster centroids should evolve steadily in keeping the classifier stably updated. Specifically, we design and maintain two dynamic memory modules, i.e., samples memory to store samples labels and features, and centroids memory for centroids evolution. We break down the abrupt global clustering into steady memory update and batch-wise label re-assignment. The process is integrated into network update iterations. In this way, labels and the network evolve shoulder-to-shoulder rather than alternatingly. Extensive experiments demonstrate that ODC stabilizes the training process and boosts the performance effectively.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149722645-8da8e5b2-8846-4554-aa3e-727d286b85cd.png" width="700" />
</div>
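A minimal sketch of the two dynamic memories described above, with random stand-in features; the momentum value and update order are illustrative assumptions, not the mmselfsup implementation.

```python
import torch
import torch.nn.functional as F

num_samples, num_classes, dim, m = 1000, 10, 128, 0.5

samples_mem = F.normalize(torch.randn(num_samples, dim), dim=1)  # samples memory
labels_mem = torch.randint(0, num_classes, (num_samples,))       # pseudo-labels
centroids = F.normalize(torch.randn(num_classes, dim), dim=1)    # centroids memory

def odc_step(idx, feat):
    """One iteration: steady memory update + batch-wise label re-assignment."""
    samples_mem[idx] = F.normalize(
        m * samples_mem[idx] + (1 - m) * F.normalize(feat, dim=1), dim=1)
    labels_mem[idx] = (samples_mem[idx] @ centroids.t()).argmax(dim=1)
    # Refresh only the centroids touched by this batch.
    for c in labels_mem[idx].unique():
        members = samples_mem[labels_mem == c]
        centroids[c] = F.normalize(members.mean(dim=0), dim=0)

odc_step(torch.arange(32), torch.randn(32, dim))
print(labels_mem[:8])
```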
## Results and Models
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates which layer's feature map yields the best results. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM, i.e., the number of labeled samples per class.
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| -------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [resnet50_8xb64-steplr-440e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/odc/odc_resnet50_8xb64-steplr-440e_in1k.py) | feature5 | 78.42 | 32.42 | 40.27 | 49.95 | 59.96 | 65.71 | 69.99 | 73.64 | 75.13 |
#### ImageNet Linear Evaluation
**Feature1 - Feature5** do not use GlobalAveragePooling; instead, each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb32-steplr-100e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb32-steplr-100e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| -------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb64-steplr-440e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/odc/odc_resnet50_8xb64-steplr-440e_in1k.py) | 14.76 | 31.82 | 42.44 | 55.76 | 57.70 | 53.42 |
#### Places205 Linear Evaluation
**Feature1 - Feature5** do not use GlobalAveragePooling; instead, each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| -------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- |
| [resnet50_8xb64-steplr-440e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/odc/odc_resnet50_8xb64-steplr-440e_in1k.py) | 19.28 | 34.09 | 40.90 | 47.04 | 48.35 |
#### ImageNet Nearest-Neighbor Classification
The results are obtained from the features after GlobalAveragePooling. Here, k=10 to 200 indicates the number of nearest neighbors.
| Self-Supervised Config | k=10 | k=20 | k=100 | k=200 |
| -------------------------------------------------------------------------------------------------------------------------------------------- | ---- | ---- | ----- | ----- |
| [resnet50_8xb64-steplr-440e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/odc/odc_resnet50_8xb64-steplr-440e_in1k.py) | 38.5 | 39.1 | 37.8 | 36.9 |
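A minimal sketch of this k-NN protocol: cosine similarity between pooled features and a majority vote over the k nearest training samples; dimensions and data are stand-ins.

```python
import torch
import torch.nn.functional as F

def knn_classify(train_feats, train_labels, test_feats, k=20):
    """Majority vote over the k most cosine-similar training features."""
    train_feats = F.normalize(train_feats, dim=1)
    test_feats = F.normalize(test_feats, dim=1)
    sim = test_feats @ train_feats.t()      # (n_test, n_train) similarities
    _, nn_idx = sim.topk(k, dim=1)          # indices of k nearest neighbors
    return train_labels[nn_idx].mode(dim=1).values

train_feats = torch.randn(500, 2048)        # stand-in pooled features
train_labels = torch.randint(0, 10, (500,))
print(knn_classify(train_feats, train_labels, torch.randn(4, 2048)))
```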
## Citation
```bibtex
@inproceedings{zhan2020online,
  title={Online deep clustering for unsupervised representation learning},
  author={Zhan, Xiaohang and Xie, Jiahao and Liu, Ziwei and Ong, Yew-Soon and Loy, Chen Change},
  booktitle={CVPR},
  year={2020}
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet-1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates which layer's feature map yields the best results. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM, i.e., the number of labeled samples per class.
| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | -------------------------------------------------------------------- | ---------- | --- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [resnet50_8xb64-steplr-440e](odc_resnet50_8xb64-steplr-440e_in1k.py) | | | | | | | | | | |
### ImageNet Linear Evaluation
**Feature1 - Feature5** do not use GlobalAveragePooling; instead, each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-90e.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | -------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-440e](odc_resnet50_8xb64-steplr-440e_in1k.py) | | | | | | |
### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | -------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-440e](odc_resnet50_8xb64-steplr-440e_in1k.py) | | | | | | |
### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | -------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-440e](odc_resnet50_8xb64-steplr-440e_in1k.py) | | | | | | |
### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`, and choose the best performing setting for each method. The parameter settings are indicated in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier is indicated like `head1`, `head10`, `head100`.
- Please use `--deterministic` in this benchmark.
Please refer to the directories `configs/benchmarks/classification/imagenet/imagenet_1percent/` (1% data) and `configs/benchmarks/classification/imagenet/imagenet_10percent/` (10% data) for details.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | -------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb64-steplr-440e](odc_resnet50_8xb64-steplr-440e_in1k.py) | | | |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | -------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb64-steplr-440e](odc_resnet50_8xb64-steplr-440e_in1k.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | -------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb64-steplr-440e](odc_resnet50_8xb64-steplr-440e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | -------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb64-steplr-440e](odc_resnet50_8xb64-steplr-440e_in1k.py) | |
#### Cityscapes
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | -------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb64-steplr-440e](odc_resnet50_8xb64-steplr-440e_in1k.py) | |
# Relative Location
> [Unsupervised Visual Representation Learning by Context Prediction](https://arxiv.org/abs/1505.05192)
<!-- [ABSTRACT] -->
<!-- [ALGORITHM] -->
## Abstract
This work explores the use of spatial context as a source of free and plentiful supervisory signal for training a rich visual representation. Given only a large, unlabeled image collection, we extract random pairs of patches from each image and train a convolutional neural net to predict the position of the second patch relative to the first. We argue that doing well on this task requires the model to learn to recognize objects and their parts. We demonstrate that the feature representation learned using this within-image context indeed captures visual similarity across images. For example, this representation allows us to perform unsupervised visual discovery of objects like cats, people, and even birds from the Pascal VOC 2011 detection dataset. Furthermore, we show that the learned ConvNet can be used in the RCNN framework and provides a significant boost over a randomly-initialized ConvNet, resulting in state-of-the-art performance among algorithms which use only Pascal-provided training set annotations.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149723222-76bc89e8-98bf-4ed7-b179-dfe5bc6336ba.png" width="400" />
</div>
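A minimal sketch of the pretext task described in the abstract: crop a central patch plus one of its 8 neighbors and use the neighbor's relative position as the label; the patch size and gap are assumptions.

```python
import torch

def sample_patch_pair(img, patch=64, gap=8):
    """From one image (C, H, W), crop a center patch and a randomly chosen
    neighbor out of the 8 surrounding positions; the label in {0..7} is the
    neighbor's relative location."""
    c, h, w = img.shape
    step = patch + gap
    cy, cx = h // 2 - patch // 2, w // 2 - patch // 2
    # Offsets of the 8 neighbors, row by row around the center.
    offsets = [(-step, -step), (-step, 0), (-step, step),
               (0, -step),                 (0, step),
               (step, -step),  (step, 0),  (step, step)]
    label = torch.randint(0, 8, ()).item()
    dy, dx = offsets[label]
    center = img[:, cy:cy + patch, cx:cx + patch]
    neighbor = img[:, cy + dy:cy + dy + patch, cx + dx:cx + dx + patch]
    return center, neighbor, label

center, neighbor, label = sample_patch_pair(torch.randn(3, 256, 256))
print(center.shape, neighbor.shape, label)
```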
## Results and Models
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates which layer's feature map yields the best results. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM, i.e., the number of labeled samples per class.
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [resnet50_8xb64-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | feature4 | 65.52 | 20.36 | 23.12 | 30.66 | 37.02 | 42.55 | 50.00 | 55.58 | 59.28 |
#### ImageNet Linear Evaluation
**Feature1 - Feature5** do not use GlobalAveragePooling; instead, each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb32-steplr-100e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb32-steplr-100e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb64-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | 15.11 | 30.47 | 42.83 | 51.20 | 40.96 | 39.65 |
#### Places205 Linear Evaluation
**Feature1 - Feature5** do not use GlobalAveragePooling; instead, each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -------- | -------- | -------- | -------- |
| [resnet50_8xb64-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | 20.69 | 34.72 | 43.01 | 45.97 | 41.96 |
#### ImageNet Nearest-Neighbor Classification
The results are obtained from the features after GlobalAveragePooling. Here, k=10 to 200 indicates the number of nearest neighbors.
| Self-Supervised Config | k=10 | k=20 | k=100 | k=200 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---- | ---- | ----- | ----- |
| [resnet50_8xb64-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | 14.5 | 15.0 | 15.0 | 14.2 |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k_voc0712.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712.py) for details of config.
| Self-Supervised Config | AP50 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----- |
| [resnet50_8xb64-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | 79.70 |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x_coco.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py) for details of config.
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb64-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | 37.5 | 56.2 | 41.3 | 33.7 | 53.3 | 36.1 |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [fcn_r50-d8_512x512_20k_voc12aug.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) for details of config.
| Self-Supervised Config | mIOU |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----- |
| [resnet50_8xb64-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | 63.49 |
## Citation
```bibtex
@inproceedings{doersch2015unsupervised,
  title={Unsupervised visual representation learning by context prediction},
  author={Doersch, Carl and Gupta, Abhinav and Efros, Alexei A},
  booktitle={ICCV},
  year={2015}
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet-1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates which layer's feature map yields the best results. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM, i.e., the number of labeled samples per class.
| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | --------------------------------------------------------------------------- | ---------- | --- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [resnet50_8xb64-steplr-70e](relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | | | | | | | | | | |
### ImageNet Linear Evaluation
**Feature1 - Feature5** do not use GlobalAveragePooling; instead, each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-90e.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | --------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-70e](relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | | | | | | |
### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | --------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-70e](relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | | | | | | |
### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | --------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-70e](relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | | | | | | |
### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`, and choose the best performing setting for each method. The parameter settings are indicated in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier is indicated like `head1`, `head10`, `head100`.
- Please use `--deterministic` in this benchmark.
Please refer to the directories `configs/benchmarks/classification/imagenet/imagenet_1percent/` (1% data) and `configs/benchmarks/classification/imagenet/imagenet_10percent/` (10% data) for details.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | --------------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb64-steplr-70e](relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | | | |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | --------------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb64-steplr-70e](relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | --------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb64-steplr-70e](relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | --------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb64-steplr-70e](relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | |
#### Cityscapes
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | --------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb64-steplr-70e](relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | |
# Rotation Prediction
> [Unsupervised Representation Learning by Predicting Image Rotation](https://arxiv.org/abs/1803.07728)
<!-- [ABSTRACT] -->
<!-- [ALGORITHM] -->
## Abstract
Over the last years, deep convolutional neural networks (ConvNets) have transformed the field of computer vision thanks to their unparalleled capacity to learn high level semantic image features. However, in order to successfully learn those features, they usually require massive amounts of manually labeled data, which is both expensive and impractical to scale. Therefore, unsupervised semantic feature learning, i.e., learning without requiring manual annotation effort, is of crucial importance in order to successfully harvest the vast amount of visual data that are available today. In our work we propose to learn image features by training ConvNets to recognize the 2d rotation that is applied to the image that it gets as input. We demonstrate both qualitatively and quantitatively that this apparently simple task actually provides a very powerful supervisory signal for semantic feature learning. We exhaustively evaluate our method in various unsupervised feature learning benchmarks and we exhibit in all of them state-of-the-art performance. Specifically, our results on those benchmarks demonstrate dramatic improvements w.r.t. prior state-of-the-art approaches in unsupervised representation learning and thus significantly close the gap with supervised feature learning.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149723477-8f63e237-362e-4962-b405-9bab0f579808.png" width="700" />
</div>
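A minimal sketch of the pretext data pipeline implied by the abstract: each image yields four rotated copies labeled by rotation index.

```python
import torch

def rotation_batch(images):
    """Rotate each image by 0/90/180/270 degrees; the label is the
    rotation index the network has to predict."""
    rotated = torch.cat([torch.rot90(images, k, dims=(2, 3)) for k in range(4)])
    labels = torch.arange(4).repeat_interleave(images.size(0))
    return rotated, labels

x = torch.randn(2, 3, 224, 224)
rx, ry = rotation_batch(x)
print(rx.shape, ry.tolist())  # torch.Size([8, 3, 224, 224]) [0, 0, 1, 1, 2, 2, 3, 3]
```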
## Results and Models
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates which layer's feature map yields the best results. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM, i.e., the number of labeled samples per class.
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [resnet50_8xb16-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | feature4 | 67.70 | 20.60 | 24.35 | 31.41 | 39.17 | 46.56 | 53.37 | 59.14 | 62.42 |
#### ImageNet Linear Evaluation
**Feature1 - Feature5** do not use GlobalAveragePooling; instead, each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb32-steplr-100e_in1k.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb32-steplr-100e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb16-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | 12.15 | 31.99 | 44.57 | 54.20 | 45.94 | 48.12 |
#### Places205 Linear Evaluation
**Feature1 - Feature5** do not use GlobalAveragePooling; instead, each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- |
| [resnet50_8xb16-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | 18.94 | 34.72 | 44.53 | 46.30 | 44.12 |
#### ImageNet Nearest-Neighbor Classification
The results are obtained from the features after GlobalAveragePooling. Here, k=10 to 200 indicates the number of nearest neighbors.
| Self-Supervised Config | k=10 | k=20 | k=100 | k=200 |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---- | ---- | ----- | ----- |
| [resnet50_8xb16-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | 11.0 | 11.9 | 12.6 | 12.4 |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k_voc0712.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712.py) for details of config.
| Self-Supervised Config | AP50 |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb16-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | 79.67 |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x_coco.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py) for details of config.
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb16-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | 37.9 | 56.5 | 41.5 | 34.2 | 53.9 | 36.7 |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [fcn_r50-d8_512x512_20k_voc12aug.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) for details of config.
| Self-Supervised Config | mIOU |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb16-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | 64.31 |
## Citation
```bibtex
@inproceedings{komodakis2018unsupervised,
  title={Unsupervised representation learning by predicting image rotations},
  author={Komodakis, Nikos and Gidaris, Spyros},
  booktitle={ICLR},
  year={2018}
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet-1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates which layer's feature map yields the best results. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM, i.e., the number of labeled samples per class.
| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | ---------------------------------------------------------------------------- | ---------- | --- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [resnet50_8xb16-steplr-70e](rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | | | | | | | | | | |
### ImageNet Linear Evaluation
**Feature1 - Feature5** do not use GlobalAveragePooling; instead, each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-90e.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb16-steplr-70e](rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | | | | | | |
### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb16-steplr-70e](rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | | | | | | |
### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb16-steplr-70e](rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | | | | | | |
### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`, and choose the best performing setting for each method. The parameter settings are indicated in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier is indicated like `head1`, `head10`, `head100`.
- Please use `--deterministic` in this benchmark.
Please refer to the directories `configs/benchmarks/classification/imagenet/imagenet_1percent/` (1% data) and `configs/benchmarks/classification/imagenet/imagenet_10percent/` (10% data) for details.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | ---------------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb16-steplr-70e](rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | | | |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | ---------------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb16-steplr-70e](rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | ---------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb16-steplr-70e](rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb16-steplr-70e](rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | |
#### Cityscapes
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb16-steplr-70e](rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | |
# SimCLR
> [A Simple Framework for Contrastive Learning of Visual Representations](https://arxiv.org/abs/2002.05709)
<!-- [ABSTRACT] -->
<!-- [ALGORITHM] -->
## Abstract
This paper presents SimCLR: a simple framework for contrastive learning of visual representations. We simplify recently proposed contrastive self-supervised learning algorithms without requiring specialized architectures or a memory bank. In order to understand what enables the contrastive prediction tasks to learn useful representations, we systematically study the major components of our framework. We show that (1) composition of data augmentations plays a critical role in defining effective predictive tasks, (2) introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and (3) contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning. By combining these findings, we are able to considerably outperform previous methods for self-supervised and semi-supervised learning on ImageNet. A linear classifier trained on self-supervised representations learned by SimCLR achieves 76.5% top-1 accuracy, which is a 7% relative improvement over previous state-of-the-art, matching the performance of a supervised ResNet-50.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149723851-cf5f309e-d891-454d-90c0-e5337e5a11ed.png" width="400" />
</div>
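A minimal sketch of the contrastive (NT-Xent) objective described in the abstract; the temperature and embedding sizes are illustrative assumptions, not the exact SimCLR recipe.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.1):
    """z1[i] and z2[i] are projections of two augmented views of image i;
    each view must pick out its counterpart among all 2n - 1 others."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2]), dim=1)   # (2n, d)
    sim = z @ z.t() / temperature                 # pairwise similarities
    sim.fill_diagonal_(float('-inf'))             # exclude self-similarity
    # The positive for sample i is its other view at index i + n (or i - n).
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])
    return F.cross_entropy(sim, targets)

z1, z2 = torch.randn(8, 128), torch.randn(8, 128)
print(nt_xent(z1, z2))
```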
## Results and Models
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates which layer's feature map yields the best results. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM, i.e., the number of labeled samples per class.
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ---- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k.py) | feature5 | 79.98 | 35.02 | 42.79 | 54.87 | 61.91 | 67.38 | 71.88 | 75.56 | 77.4 |
#### ImageNet Linear Evaluation
**Feature1 - Feature5** do not use GlobalAveragePooling; instead, each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb512-coslr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb512-coslr-90e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| ---------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k.py) | 16.29 | 31.11 | 39.99 | 55.06 | 62.91 | 62.56 |
| [resnet50_16xb256-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_16xb256-coslr-200e_in1k.py) | 15.44 | 31.47 | 41.83 | 59.44 | 66.41 | 66.66 |
#### Places205 Linear Evaluation
**Feature1 - Feature5** do not use GlobalAveragePooling; instead, each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -------- | -------- | -------- | -------- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k.py) | 20.60 | 33.62 | 38.86 | 45.25 | 50.91 |
#### ImageNet Nearest-Neighbor Classification
The results are obtained from the features after GlobalAveragePooling. Here, k=10 to 200 indicates the number of nearest neighbors.
| Self-Supervised Config | k=10 | k=20 | k=100 | k=200 |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | ---- | ---- | ----- | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k.py) | 47.8 | 48.4 | 46.7 | 45.2 |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k_voc0712.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712.py) for details of config.
| Self-Supervised Config | AP50 |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k.py) | 79.38 |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x_coco.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py) for details of config.
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k.py) | 38.7 | 58.1 | 42.4 | 34.9 | 55.3 | 37.5 |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [fcn_r50-d8_512x512_20k_voc12aug.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) for details of config.
| Self-Supervised Config | mIOU |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k.py) | 64.03 |
## Citation
```bibtex
@inproceedings{chen2020simple,
  title={A simple framework for contrastive learning of visual representations},
  author={Chen, Ting and Kornblith, Simon and Norouzi, Mohammad and Hinton, Geoffrey},
  booktitle={ICML},
  year={2020},
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet-1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates which layer's feature map yields the best results. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM, i.e., the number of labeled samples per class.
| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | --------------------------------------------------------------------- | ---------- | --- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [resnet50_8xb32-coslr-200e](simclr_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | | | | | |
### ImageNet Linear Evaluation
**Feature1 - Feature5** do not use GlobalAveragePooling; instead, each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-90e.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | --------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-coslr-200e](simclr_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | --------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-coslr-200e](simclr_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | --------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-coslr-200e](simclr_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`, and choose the best performing setting for each method. The parameter settings are indicated in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier is indicated like `head1`, `head10`, `head100`.
- Please use `--deterministic` in this benchmark.
Please refer to the directories `configs/benchmarks/classification/imagenet/imagenet_1percent/` (1% data) and `configs/benchmarks/classification/imagenet/imagenet_10percent/` (10% data) for details.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | --------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb32-coslr-200e](simclr_resnet50_8xb32-coslr-200e_in1k.py) | | | |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | --------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb32-coslr-200e](simclr_resnet50_8xb32-coslr-200e_in1k.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | --------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb32-coslr-200e](simclr_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | --------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-coslr-200e](simclr_resnet50_8xb32-coslr-200e_in1k.py) | |
#### Cityscapes
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | --------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-coslr-200e](simclr_resnet50_8xb32-coslr-200e_in1k.py) | |
@ -12,17 +12,13 @@ This paper presents SimMIM, a simple framework for masked image modeling. We sim
<img src="https://user-images.githubusercontent.com/30762564/159404597-ac6d3a44-ee59-4cdc-8f6f-506a7d1b18b6.png" width="40%"/>
</div>
## Models and Benchmarks
Here, we report the results of the model, and more results will be coming soon.
| Backbone | Pre-train epoch | Fine-tuning Top-1 | Pre-train Config | Fine-tuning Config | Download |
| :-------: | :-------------: | :---------------: | :-------------------------------------------------------------------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| Swin-Base | 100 | 82.9 | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192) | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/swin-base_ft-8xb256-coslr-100e_in1k) | [model](https://download.openmmlab.com/mmselfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192_20220316-1d090125.pth) \| [log](https://download.openmmlab.com/mmselfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192_20220316-1d090125.log.json) |
## Citation
@ -1,19 +1,103 @@
# SimSiam
## Exploring Simple Siamese Representation Learning
> [Exploring Simple Siamese Representation Learning](https://arxiv.org/abs/2011.10566)
<!-- [ABSTRACT] -->
<!-- [ALGORITHM] -->
## Abstract
Siamese networks have become a common structure in various recent models for unsupervised visual representation learning. These models maximize the similarity between two augmentations of one image, subject to certain conditions for avoiding collapsing solutions. In this paper, we report surprising empirical results that simple Siamese networks can learn meaningful representations even using none of the following: (i) negative sample pairs, (ii) large batches, (iii) momentum encoders. Our experiments show that collapsing solutions do exist for the loss and structure, but a stop-gradient operation plays an essential role in preventing collapsing. We provide a hypothesis on the implication of stop-gradient, and further show proof-of-concept experiments verifying it. Our “SimSiam” method achieves competitive results on ImageNet and downstream tasks. We hope this simple baseline will motivate people to rethink the roles of Siamese architectures for unsupervised representation learning.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149724180-bc7bac6a-fcb8-421e-b8f1-9550c624d154.png" width="500" />
</div>
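The stop-gradient operation highlighted in the abstract is simple to express in code. Below is a minimal PyTorch sketch of the symmetrized SimSiam loss, written for illustration only (it is not the implementation in this repository):

```python
import torch.nn.functional as F

def simsiam_loss(p1, p2, z1, z2):
    # p1, p2: predictor outputs for the two augmented views
    # z1, z2: projector outputs for the two augmented views
    def neg_cos(p, z):
        z = z.detach()  # stop-gradient: the step that prevents collapsing
        return -F.cosine_similarity(p, z, dim=-1).mean()
    # Each view predicts the other's detached projection, symmetrically.
    return 0.5 * neg_cos(p1, z2) + 0.5 * neg_cos(p2, z1)
```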
## Results and Models
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates from which layer's feature map the best results are obtained. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for stem layer, 2-5 for 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM.
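For intuition, below is a simplified scikit-learn sketch of the low-shot evaluation; it assumes precomputed frozen features and reports single-label accuracy for brevity, whereas the actual VOC benchmark trains one-vs-rest SVMs and reports mAP.

```python
import numpy as np
from sklearn.svm import LinearSVC

def low_shot_svm(feats, labels, test_feats, test_labels, k, seed=0):
    # feats: (N, D) frozen features from one layer (e.g. feature5); labels: (N,)
    rng = np.random.default_rng(seed)
    idx = np.concatenate([
        rng.choice(np.where(labels == c)[0], size=k, replace=False)
        for c in np.unique(labels)])  # sample k labeled examples per class
    clf = LinearSVC(C=1.0).fit(feats[idx], labels[idx])
    return clf.score(test_feats, test_labels)
```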
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [resnet50_8xb32-coslr-100e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k.py) | feature5 | 84.64 | 39.65 | 49.86 | 62.48 | 69.50 | 74.48 | 78.31 | 81.06 | 82.56 |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k.py) | feature5 | 85.20 | 39.85 | 50.44 | 63.73 | 70.93 | 75.74 | 79.42 | 82.02 | 83.44 |
#### ImageNet Linear Evaluation
The **Feature1 - Feature5** results are obtained without GlobalAveragePooling: the feature map is pooled to the specified dimensions, and a Linear layer then performs the classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb512-coslr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb512-coslr-90e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb32-coslr-100e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k.py) | 16.27 | 33.77 | 45.80 | 60.83 | 68.21 | 68.28 |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k.py) | 15.57 | 37.21 | 47.28 | 62.21 | 69.85 | 69.84 |
#### Places205 Linear Evaluation
The **Feature1 - Feature5** results are obtained without GlobalAveragePooling: the feature map is pooled to the specified dimensions, and a Linear layer then performs the classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- |
| [resnet50_8xb32-coslr-100e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k.py) | 21.32 | 35.66 | 43.05 | 50.79 | 53.27 |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k.py) | 21.17 | 35.85 | 43.49 | 50.99 | 54.10 |
#### ImageNet Nearest-Neighbor Classification
The results are obtained from the features after GlobalAveragePooling. Here, k=10 to 200 indicates different numbers of nearest neighbors.
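A minimal PyTorch sketch of such a k-nearest-neighbor classifier is given below; it uses plain majority voting over cosine similarity, while the actual benchmark may weight the votes differently.

```python
import torch
import torch.nn.functional as F

def knn_classify(train_feats, train_labels, test_feats, k=20, num_classes=1000):
    # The features are GlobalAveragePooled backbone outputs; L2-normalize
    # them so the dot product equals cosine similarity.
    train = F.normalize(train_feats, dim=1)
    test = F.normalize(test_feats, dim=1)
    sim = test @ train.t()                   # (num_test, num_train)
    _, nn_idx = sim.topk(k, dim=1)           # k nearest neighbors per sample
    nn_labels = train_labels[nn_idx]         # (num_test, k)
    votes = torch.zeros(test.size(0), num_classes).scatter_add_(
        1, nn_labels, torch.ones_like(nn_labels, dtype=torch.float))
    return votes.argmax(dim=1)               # majority vote
```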
| Self-Supervised Config | k=10 | k=20 | k=100 | k=200 |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | ---- | ---- | ----- | ----- |
| [resnet50_8xb32-coslr-100e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k.py) | 57.4 | 57.6 | 55.8 | 54.2 |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k.py) | 60.2 | 60.4 | 58.8 | 57.4 |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. They follow the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k_voc0712.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712.py) for details of config.
| Self-Supervised Config | AP50 |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb32-coslr-100e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k.py) | 79.80 |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k.py) | 79.85 |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x_coco.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py) for details of config.
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb32-coslr-100e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k.py) | 38.6 | 57.6 | 42.3 | 34.6 | 54.8 | 36.9 |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k.py) | 38.8 | 58.0 | 42.3 | 34.9 | 55.3 | 37.6 |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [fcn_r50-d8_512x512_20k_voc12aug.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) for details of config.
| Self-Supervised Config | mIOU |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb32-coslr-100e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k.py) | 48.35 |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k.py) | 46.27 |
## Citation
```bibtex
@inproceedings{chen2021exploring,
@ -23,107 +107,3 @@ Siamese networks have become a common structure in various recent models for uns
year={2021}
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet-1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates from which layer's feature map the best results are obtained. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for stem layer, 2-5 for 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM.
| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | ---------------------------------------------------------------------- | ---------- | --- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [resnet50_8xb32-coslr-100e](simsiam_resnet50_8xb32-coslr-100e_in1k.py) | | | | | | | | | | |
| [model]() | [resnet50_8xb32-coslr-200e](simsiam_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | | | | | |
### ImageNet Linear Evaluation
The **Feature1 - Feature5** results are obtained without GlobalAveragePooling: the feature map is pooled to the specified dimensions, and a Linear layer then performs the classification. Please refer to [resnet50_mhead_8xb32-steplr-90e.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-coslr-100e](simsiam_resnet50_8xb32-coslr-100e_in1k.py) | | | | | | |
| [model]() | [resnet50_8xb32-coslr-200e](simsiam_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-coslr-100e](simsiam_resnet50_8xb32-coslr-100e_in1k.py) | | | | | | |
| [model]() | [resnet50_8xb32-coslr-200e](simsiam_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-coslr-100e](simsiam_resnet50_8xb32-coslr-100e_in1k.py) | | | | | | |
| [model]() | [resnet50_8xb32-coslr-200e](simsiam_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`, and choose the best-performing setting for each method. The parameter settings are indicated in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier is indicated like `head1`, `head10`, `head100`.
- Please use `--deterministic` in this benchmark.
Please refer to the directories `configs/benchmarks/classification/imagenet/imagenet_1percent/` for 1% data and `configs/benchmarks/classification/imagenet/imagenet_10percent/` for 10% data for details.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | ----------------------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb32-coslr-200e](simsiam_resnet50_8xb32-coslr-200e_in1k.py) | | | |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. They follow the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | ---------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb32-coslr-100e](simsiam_resnet50_8xb32-coslr-100e_in1k.py) | | |
| [model]() | [resnet50_8xb32-coslr-200e](simsiam_resnet50_8xb32-coslr-200e_in1k.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | ---------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb32-coslr-100e](simsiam_resnet50_8xb32-coslr-100e_in1k.py) | | | | | | |
| [model]() | [resnet50_8xb32-coslr-200e](simsiam_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-coslr-100e](simsiam_resnet50_8xb32-coslr-100e_in1k.py) | |
| [model]() | [resnet50_8xb32-coslr-200e](simsiam_resnet50_8xb32-coslr-200e_in1k.py) | |
#### Cityscapes
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-coslr-100e](simsiam_resnet50_8xb32-coslr-100e_in1k.py) | |
| [model]() | [resnet50_8xb32-coslr-200e](simsiam_resnet50_8xb32-coslr-200e_in1k.py) | |
@ -1,19 +1,96 @@
# SwAV
## Unsupervised Learning of Visual Features by Contrasting Cluster Assignments
> [Unsupervised Learning of Visual Features by Contrasting Cluster Assignments](https://arxiv.org/abs/2006.09882)
<!-- [ABSTRACT] -->
<!-- [ALGORITHM] -->
## Abstract
Unsupervised image representations have significantly reduced the gap with supervised pretraining, notably with the recent achievements of contrastive learning methods. These contrastive methods typically work online and rely on a large number of explicit pairwise feature comparisons, which is computationally challenging. In this paper, we propose an online algorithm, SwAV, that takes advantage of contrastive methods without requiring to compute pairwise comparisons. Specifically, our method simultaneously clusters the data while enforcing consistency between cluster assignments produced for different augmentations (or “views”) of the same image, instead of comparing features directly as in contrastive learning. Simply put, we use a “swapped” prediction mechanism where we predict the code of a view from the representation of another view. Our method can be trained with large and small batches and can scale to unlimited amounts of data. Compared to previous contrastive methods, our method is more memory efficient since it does not require a large memory bank or a special momentum network. In addition, we also propose a new data augmentation strategy, multi-crop, that uses a mix of views with different resolutions in place of two full-resolution views, without increasing the memory or compute requirements.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149724517-9f1e7bdf-04c7-43e3-92f4-2b8fc1399123.png" width="500" />
</div>
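To make the swapped-prediction mechanism concrete, here is a heavily simplified PyTorch sketch (one pair of views, no queue, a toy Sinkhorn-Knopp normalization); it illustrates the idea and is not the implementation in this repository.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def sinkhorn(scores, eps=0.05, n_iters=3):
    # Toy Sinkhorn-Knopp: alternately normalize prototypes and samples so
    # that the soft assignments ("codes") spread over all prototypes.
    q = torch.exp(scores / eps).t()          # (num_prototypes, batch)
    q /= q.sum()
    for _ in range(n_iters):
        q /= q.sum(dim=1, keepdim=True) * q.shape[0]   # normalize prototypes
        q /= q.sum(dim=0, keepdim=True) * q.shape[1]   # normalize samples
    return (q * q.shape[1]).t()              # rows sum to 1: (batch, K)

def swav_loss(z1, z2, prototypes, temp=0.1):
    # z1, z2: L2-normalized projections of two views; prototypes: (K, D).
    s1, s2 = z1 @ prototypes.t(), z2 @ prototypes.t()
    q1, q2 = sinkhorn(s1), sinkhorn(s2)      # codes of each view
    # Swapped prediction: predict each view's code from the other view.
    return -0.5 * ((q2 * F.log_softmax(s1 / temp, dim=1)).sum(1).mean()
                   + (q1 * F.log_softmax(s2 / temp, dim=1)).sum(1).mean())
```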
## Results and Models
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates from which layer's feature map the best results are obtained. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for stem layer, 2-5 for 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM.
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | feature5 | 87.00 | 44.68 | 55.41 | 67.64 | 73.67 | 78.14 | 81.58 | 83.98 | 85.15 |
#### ImageNet Linear Evaluation
The **Feature1 - Feature5** results are obtained without GlobalAveragePooling: the feature map is pooled to the specified dimensions, and a Linear layer then performs the classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb32-coslr-100e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb32-coslr-100e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | 16.98 | 34.96 | 49.26 | 65.98 | 70.74 | 70.47 |
#### Places205 Linear Evaluation
The **Feature1 - Feature5** results are obtained without GlobalAveragePooling: the feature map is pooled to the specified dimensions, and a Linear layer then performs the classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- |
| [resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | 23.33 | 35.45 | 43.13 | 51.98 | 55.09 |
#### ImageNet Nearest-Neighbor Classification
The results are obtained from the features after GlobalAveragePooling. Here, k=10 to 200 indicates different numbers of nearest neighbors.
| Self-Supervised Config | k=10 | k=20 | k=100 | k=200 |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---- | ---- | ----- | ----- |
| [resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | 60.5 | 60.6 | 59.0 | 57.6 |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. They follow the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k_voc0712.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712.py) for details of config.
| Self-Supervised Config | AP50 |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | 77.64 |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x_coco.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py) for details of config.
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | 40.2 | 60.5 | 43.9 | 36.3 | 57.5 | 38.8 |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [fcn_r50-d8_512x512_20k_voc12aug.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) for details of config.
| Self-Supervised Config | mIOU |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | 63.73 |
## Citation
```bibtex
@article{caron2020unsupervised,
@ -23,99 +100,3 @@ Unsupervised image representations have significantly reduced the gap with super
year={2020}
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet-1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates from which layer's feature map the best results are obtained. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for stem layer, 2-5 for 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM.
| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | ---------------------------------------------------------------------------------------------- | ---------- | --- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [resnet50_8xb32-mcrop-2-6-coslr-200e](swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | | | | | | | | | | |
### ImageNet Linear Evaluation
The **Feature1 - Feature5** results are obtained without GlobalAveragePooling: the feature map is pooled to the specified dimensions, and a Linear layer then performs the classification. Please refer to [resnet50_mhead_8xb32-steplr-90e.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-mcrop-2-6-coslr-200e](swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | | | | | | |
### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-mcrop-2-6-coslr-200e](swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | | | | | | |
### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-mcrop-2-6-coslr-200e](swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | | | | | | |
### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`, and choose the best-performing setting for each method. The parameter settings are indicated in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier is indicated like `head1`, `head10`, `head100`.
- Please use `--deterministic` in this benchmark.
Please refer to the directories `configs/benchmarks/classification/imagenet/imagenet_1percent/` for 1% data and `configs/benchmarks/classification/imagenet/imagenet_10percent/` for 10% data for details.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | ---------------------------------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb32-mcrop-2-6-coslr-200e](swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | | | |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. They follow the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | ---------------------------------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb32-mcrop-2-6-coslr-200e](swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | ---------------------------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb32-mcrop-2-6-coslr-200e](swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-mcrop-2-6-coslr-200e](swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | |
#### Cityscapes
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-mcrop-2-6-coslr-200e](swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | |
@ -2,188 +2,245 @@
## MMSelfSup
### v0.9.1 (31/05/2022)
#### Highlight
- Update **BYOL** model and results ([#319](https://github.com/open-mmlab/mmselfsup/pull/319))
- Refine some documentation
#### New Features
- Update **BYOL** models and results ([#319](https://github.com/open-mmlab/mmselfsup/pull/319))
#### Bug Fixes
- Set qkv bias to False for cae and True for mae ([#303](https://github.com/open-mmlab/mmselfsup/pull/303))
- Fix spelling errors in MAE config ([#307](https://github.com/open-mmlab/mmselfsup/pull/307))
#### Improvements
- Change the file name of cosine annealing hook ([#304](https://github.com/open-mmlab/mmselfsup/pull/304))
- Replace markdownlint with mdformat ([#311](https://github.com/open-mmlab/mmselfsup/pull/311))
#### Docs
- Fix typo in tutorial ([#308](https://github.com/open-mmlab/mmselfsup/pull/308))
- Configure Myst-parser to parse anchor tag ([#309](https://github.com/open-mmlab/mmselfsup/pull/309))
- Update readthedocs algorithm README ([#310](https://github.com/open-mmlab/mmselfsup/pull/310))
- Rewrite install.md ([#317](https://github.com/open-mmlab/mmselfsup/pull/317))
- Refine README.md file ([#318](https://github.com/open-mmlab/mmselfsup/pull/318))
### v0.9.0 (29/04/2022)
#### Highlight
- Support **CAE** ([#284](https://github.com/open-mmlab/mmselfsup/pull/284))
- Support **Barlow Twins** ([#207](https://github.com/open-mmlab/mmselfsup/pull/207))
#### New Features
- Support CAE ([#284](https://github.com/open-mmlab/mmselfsup/pull/284))
- Support Barlow twins ([#207](https://github.com/open-mmlab/mmselfsup/pull/207))
- Add SimMIM 192 pretrain and 224 fine-tuning results ([#280](https://github.com/open-mmlab/mmselfsup/pull/280))
- Add MAE pretrain with fp16 ([#271](https://github.com/open-mmlab/mmselfsup/pull/271))
#### Bug Fixes
- Fix args error ([#290](https://github.com/open-mmlab/mmselfsup/pull/290))
- Change imgs_per_gpu to samples_per_gpu in MAE config ([#278](https://github.com/open-mmlab/mmselfsup/pull/278))
- Avoid GPU memory leak with prefetch dataloader ([#277](https://github.com/open-mmlab/mmselfsup/pull/277))
- Fix key error bug when registering custom hooks ([#273](https://github.com/open-mmlab/mmselfsup/pull/273))
#### Improvements
- Update SimCLR models and results ([#295](https://github.com/open-mmlab/mmselfsup/pull/295))
- Reduce memory usage while running unit test ([#291](https://github.com/open-mmlab/mmselfsup/pull/291))
- Remove pytorch1.5 test ([#288](https://github.com/open-mmlab/mmselfsup/pull/288))
- Rename linear probing config file names ([#281](https://github.com/open-mmlab/mmselfsup/pull/281))
- Add unit test for apis ([#276](https://github.com/open-mmlab/mmselfsup/pull/276))
#### Docs
- Fix SimMIM config link, and add SimMIM to model_zoo ([#272](https://github.com/open-mmlab/mmselfsup/pull/272))
### v0.8.0 (31/03/2022)
#### Highlight
- Support **SimMIM** ([#239](https://github.com/open-mmlab/mmselfsup/pull/239))
- Add **KNN** benchmark, support KNN test with checkpoint and extracted backbone weights ([#243](https://github.com/open-mmlab/mmselfsup/pull/243))
- Support ImageNet-21k dataset ([#225](https://github.com/open-mmlab/mmselfsup/pull/225))
#### New Features
- Support SimMIM ([#239](https://github.com/open-mmlab/mmselfsup/pull/239))
- Add KNN benchmark, support KNN test with checkpoint and extracted backbone weights ([#243](https://github.com/open-mmlab/mmselfsup/pull/243))
- Support ImageNet-21k dataset ([#225](https://github.com/open-mmlab/mmselfsup/pull/225))
- Resume latest checkpoint automatically ([#245](https://github.com/open-mmlab/mmselfsup/pull/245))
#### Bug Fixes
- Add seed to distributed sampler ([#250](https://github.com/open-mmlab/mmselfsup/pull/250))
- Fix positional parameter error in dist_test_svm_epoch.sh ([#260](https://github.com/open-mmlab/mmselfsup/pull/260))
- Fix 'mkdir' error in prepare_voc07_cls.sh ([#261](https://github.com/open-mmlab/mmselfsup/pull/261))
#### Improvements
- Update args format from command line ([#253](https://github.com/open-mmlab/mmselfsup/pull/253))
#### Docs
- Fix command errors in 6_benchmarks.md ([#263](https://github.com/open-mmlab/mmselfsup/pull/263))
- Translate 6_benchmarks.md to Chinese ([#262](https://github.com/open-mmlab/mmselfsup/pull/262))
### v0.7.0 (03/03/2022)
#### Highlight
- Support MAE ([#221](https://github.com/open-mmlab/mmselfsup/pull/221))
- Add Places205 benchmarks ([#210](https://github.com/open-mmlab/mmselfsup/pull/210))
- Add test Windows in workflows ([#215](https://github.com/open-mmlab/mmselfsup/pull/215))
#### New Features
- Support MAE ([#221](https://github.com/open-mmlab/mmselfsup/pull/221))
- Add Places205 benchmarks ([#210](https://github.com/open-mmlab/mmselfsup/pull/210))
#### Bug Fixes
- Fix config typos for rotation prediction and deepcluster ([#200](https://github.com/open-mmlab/mmselfsup/pull/200))
- Fix image channel bgr/rgb bug and update benchmarks ([#210](https://github.com/open-mmlab/mmselfsup/pull/210))
- Fix the bug when using prefetch under multi-view methods ([#218](https://github.com/open-mmlab/mmselfsup/pull/218))
- Fix tsne 'no init_cfg' error ([#222](https://github.com/open-mmlab/mmselfsup/pull/222))
#### Improvements
- Deprecate `imgs_per_gpu` and use `samples_per_gpu` ([#204](https://github.com/open-mmlab/mmselfsup/pull/204))
- Update the installation of MMCV ([#208](https://github.com/open-mmlab/mmselfsup/pull/208))
- Add pre-commit hook for algo-readme and copyright ([#213](https://github.com/open-mmlab/mmselfsup/pull/213))
- Add test Windows in workflows ([#215](https://github.com/open-mmlab/mmselfsup/pull/215))
#### Docs
- Translate 0_config.md into Chinese ([#216](https://github.com/open-mmlab/mmselfsup/pull/216))
- Reorganizing OpenMMLab projects and update algorithms in readme ([#219](https://github.com/open-mmlab/mmselfsup/pull/219))
### v0.6.0 (02/02/2022)
#### Highlight
- Support vision transformer based MoCo v3 ([#194](https://github.com/open-mmlab/mmselfsup/pull/194))
- Speed up training and start time ([#181](https://github.com/open-mmlab/mmselfsup/pull/181))
- Support cpu training ([#188](https://github.com/open-mmlab/mmselfsup/pull/188))
#### New Features
- Support vision transformer based MoCo v3 ([#194](https://github.com/open-mmlab/mmselfsup/pull/194))
- Support cpu training ([#188](https://github.com/open-mmlab/mmselfsup/pull/188))
#### Bug Fixes
- Fix issue ([#159](https://github.com/open-mmlab/mmselfsup/issues/159), [#160](https://github.com/open-mmlab/mmselfsup/issues/160)) related bugs ([#161](https://github.com/open-mmlab/mmselfsup/pull/161))
- Fix missing prob assignment in `RandomAppliedTrans` ([#173](https://github.com/open-mmlab/mmselfsup/pull/173))
- Fix bug of showing k-means losses ([#182](https://github.com/open-mmlab/mmselfsup/pull/182))
- Fix bug in non-distributed multi-gpu training/testing ([#189](https://github.com/open-mmlab/mmselfsup/pull/189))
- Fix bug when loading cifar dataset ([#191](https://github.com/open-mmlab/mmselfsup/pull/191))
- Fix `dataset.evaluate` args bug ([#192](https://github.com/open-mmlab/mmselfsup/pull/192))
#### Improvements
- Cancel previous runs that are not completed in CI ([#145](https://github.com/open-mmlab/mmselfsup/pull/145))
- Enhance MIM function ([#152](https://github.com/open-mmlab/mmselfsup/pull/152))
- Skip CI when some specific files were changed ([#154](https://github.com/open-mmlab/mmselfsup/pull/154))
- Add `drop_last` when building eval optimizer ([#158](https://github.com/open-mmlab/mmselfsup/pull/158))
- Deprecate the support for "python setup.py test" ([#174](https://github.com/open-mmlab/mmselfsup/pull/174))
- Speed up training and start time ([#181](https://github.com/open-mmlab/mmselfsup/pull/181))
- Upgrade `isort` to 5.10.1 ([#184](https://github.com/open-mmlab/mmselfsup/pull/184))
#### Docs
- Refactor the directory structure of docs ([#146](https://github.com/open-mmlab/mmselfsup/pull/146))
- Fix readthedocs ([#148](https://github.com/open-mmlab/mmselfsup/pull/148), [#149](https://github.com/open-mmlab/mmselfsup/pull/149), [#153](https://github.com/open-mmlab/mmselfsup/pull/153))
- Fix typos and dead links in some docs ([#155](https://github.com/open-mmlab/mmselfsup/pull/155), [#180](https://github.com/open-mmlab/mmselfsup/pull/180), [#195](https://github.com/open-mmlab/mmselfsup/pull/195))
- Update training logs and benchmark results in model zoo ([#157](https://github.com/open-mmlab/mmselfsup/pull/157), [#165](https://github.com/open-mmlab/mmselfsup/pull/165), [#195](https://github.com/open-mmlab/mmselfsup/pull/195))
- Update and translate some docs into Chinese ([#163](https://github.com/open-mmlab/mmselfsup/pull/163), [#164](https://github.com/open-mmlab/mmselfsup/pull/164), [#165](https://github.com/open-mmlab/mmselfsup/pull/165), [#166](https://github.com/open-mmlab/mmselfsup/pull/166), [#167](https://github.com/open-mmlab/mmselfsup/pull/167), [#168](https://github.com/open-mmlab/mmselfsup/pull/168), [#169](https://github.com/open-mmlab/mmselfsup/pull/169), [#172](https://github.com/open-mmlab/mmselfsup/pull/172), [#176](https://github.com/open-mmlab/mmselfsup/pull/176), [#178](https://github.com/open-mmlab/mmselfsup/pull/178), [#179](https://github.com/open-mmlab/mmselfsup/pull/179))
- Update algorithm README with the new format ([#177](https://github.com/open-mmlab/mmselfsup/pull/177))
### v0.5.0 (16/12/2021)
#### Highlight
- Released with code refactor.
- Add 3 new self-supervised learning algorithms.
- Support benchmarks with MMDet and MMSeg.
- Add comprehensive documents.
#### Refactor
- Merge redundant dataset files.
- Adapt to new version of MMCV and remove old version related codes.
- Inherit MMCV BaseModule.
- Optimize directory.
- Rename all config files.
#### New Features
- Add SwAV, SimSiam, DenseCL algorithms.
- Add t-SNE visualization tools.
- Support MMCV version fp16.
#### Benchmarks
- More benchmarking results, including classification, detection and segmentation.
- Support some new datasets in downstream tasks.
- Launch MMDet and MMSeg training with MIM.
#### Docs
- Refactor README, getting_started, install, model_zoo files.
- Add data_prepare file.
- Add comprehensive tutorials.
## OpenSelfSup (History)
### v0.3.0 (14/10/2020)
#### Highlight
- Support Mixed Precision Training
- Improvement of GaussianBlur doubles the training speed
- More benchmarking results
#### Bug Fixes
- Fix bugs in moco v2, now the results are reproducible.
- Fix bugs in byol.
#### New Features
- Mixed Precision Training
- Improvement of GaussianBlur doubles the training speed of MoCo V2, SimCLR, BYOL
- More benchmarking results, including Places, VOC, COCO
### v0.2.0 (26/6/2020)
#### Highlights
- Support BYOL
- Support semi-supervised benchmarks
#### Bug Fixes
- Fix hash id in publish_model.py
#### New Features
- Support BYOL.
- Separate train and test scripts in linear/semi evaluation.
- Support semi-supervised benchmarks: benchmarks/dist_train_semi.sh.
- Move benchmarks related configs into configs/benchmarks/.
- Provide benchmarking results and model download links.
- Support updating network every several iterations.
- Support LARS optimizer with nesterov.
- Support excluding specific parameters from LARS adaptation and weight decay required in SimCLR and BYOL.
@ -1,69 +0,0 @@
# Contributing to OpenMMLab
All kinds of contributions are welcome, including but not limited to the following.
- Fixes (typo, bugs)
- New features and components
## Workflow
1. fork and pull the latest OpenMMLab repository (mmselfsup)
2. checkout a new branch (do not use master branch for PRs)
3. commit your changes
4. create a PR
Note: If you plan to add some new features that involve large changes, it is encouraged to open an issue for discussion first.
## Code style
### Python
We adopt [PEP8](https://www.python.org/dev/peps/pep-0008/) as the preferred code style.
We use the following tools for linting and formatting:
- [flake8](http://flake8.pycqa.org/en/latest/): A wrapper around some linter tools.
- [yapf](https://github.com/google/yapf): A formatter for Python files.
- [isort](https://github.com/timothycrosley/isort): A Python utility to sort imports.
- [markdownlint](https://github.com/markdownlint/markdownlint): A linter to check markdown files and flag style issues.
- [docformatter](https://github.com/myint/docformatter): A formatter to format docstring.
Style configurations of yapf and isort can be found in [setup.cfg](../setup.cfg).
We use a [pre-commit hook](https://pre-commit.com/) that checks and formats `flake8`, `yapf`, `isort`, `trailing whitespaces` and `markdown files`,
fixes `end-of-files`, `double-quoted-strings`, `python-encoding-pragma` and `mixed-line-ending`, and sorts `requirements.txt` automatically on every commit.
The config for a pre-commit hook is stored in [.pre-commit-config](../.pre-commit-config.yaml).
After you clone the repository, you will need to install and initialize the pre-commit hook.
```shell
pip install -U pre-commit
```
From the repository folder
```shell
pre-commit install
```
If you encounter an issue while installing markdownlint, try the following steps to install ruby:
```shell
# install rvm
curl -L https://get.rvm.io | bash -s -- --autolibs=read-fail
[[ -s "$HOME/.rvm/scripts/rvm" ]] && source "$HOME/.rvm/scripts/rvm"
rvm autolibs disable
# install ruby
rvm install 2.7.1
```
Or refer to [this repo](https://github.com/innerlee/setup) and run [`zzruby.sh`](https://github.com/innerlee/setup/blob/master/zzruby.sh) according to its instructions.
After this, the code linters and formatter will be enforced on every commit.
> Before you create a PR, make sure that your code lints and is formatted by yapf.
### C++ and CUDA
We follow the [Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html).
@ -92,5 +92,6 @@ html_css_files = ['css/readthedocs.css']
# Enable ::: for myst
myst_enable_extensions = ['colon_fence']
myst_heading_anchors = 4
master_doc = 'index'
@ -0,0 +1,21 @@
# FAQ
We list some common troubles faced by many users and their corresponding solutions here. Feel free to enrich the list if you find any frequent issues and have ways to help others to solve them. If the contents here do not cover your issue, please create an issue using the [provided templates](https://github.com/open-mmlab/mmselfsup/tree/master/.github/ISSUE_TEMPLATE) and make sure you fill in all required information in the template.
## Installation
Compatible MMCV, MMClassification, MMDetection and MMSegmentation versions are shown below. Please install the correct version of them to avoid installation issues.
| MMSelfSup version | MMCV version | MMClassification version | MMSegmentation version | MMDetection version |
| :---------------: | :-----------------: | :-------------------------: | :--------------------: | :-----------------: |
| 0.9.1 (master) | mmcv-full >= 1.4.2 | mmcls >= 0.21.0 | mmseg >= 0.20.2 | mmdet >= 2.19.0 |
| 0.9.0 | mmcv-full >= 1.4.2 | mmcls >= 0.21.0 | mmseg >= 0.20.2 | mmdet >= 2.19.0 |
| 0.8.0 | mmcv-full >= 1.4.2 | mmcls >= 0.21.0 | mmseg >= 0.20.2 | mmdet >= 2.19.0 |
| 0.7.1 | mmcv-full >= 1.3.16 | mmcls >= 0.19.0, \<= 0.20.1 | mmseg >= 0.20.2 | mmdet >= 2.16.0 |
| 0.6.0 | mmcv-full >= 1.3.16 | mmcls >= 0.19.0 | mmseg >= 0.20.2 | mmdet >= 2.16.0 |
| 0.5.0 | mmcv-full >= 1.3.16 | / | mmseg >= 0.20.2 | mmdet >= 2.16.0 |
**Note:**
- You need to run `pip uninstall mmcv` first if you have mmcv installed. If mmcv and mmcv-full are both installed, there will be `ModuleNotFoundError`.
- If you still have version problem, please create an issue and provide your package versions.
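As a quick sanity check, you can list the installed OpenMMLab packages and compare their versions against the table above (a minimal sketch, assuming the packages were installed with pip):

```shell
# show the versions of the installed OpenMMLab packages
pip list | grep -E "mmcv|mmcls|mmdet|mmsegmentation|mmselfsup"
```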
@ -151,7 +151,6 @@ Before you publish a model, you may want to
python tools/model_converters/publish_model.py ${INPUT_FILENAME} ${OUTPUT_FILENAME}
```
### Use t-SNE
We provide an off-the-shelf tool to visualize the quality of image representations by t-SNE.
@ -167,7 +166,6 @@ Arguments:
- `WORK_DIR`: the directory to save the results of visualization.
- `[optional arguments]`: for optional arguments, you can refer to [visualize_tsne.py](https://github.com/open-mmlab/mmselfsup/blob/master/tools/analysis_tools/visualize_tsne.py)
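A typical invocation might look like the following (a sketch only; the flags are assumptions, please check `visualize_tsne.py --help` for the exact interface):

```shell
# visualize the backbone features of a self-supervised model with t-SNE
python tools/analysis_tools/visualize_tsne.py ${CONFIG_FILE} \
    --checkpoint ${CKPT_PATH} \
    --work-dir ${WORK_DIR}
```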
### Reproducibility
If you want to make your performance exactly reproducible, please switch on `--deterministic` to train the final model to be published. Note that this flag will switch off `torch.backends.cudnn.benchmark` and slow down the training speed.
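For instance, a deterministic training run could be launched as follows (a sketch; the config path is a placeholder):

```shell
python tools/train.py configs/selfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k.py --deterministic
```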
@ -11,7 +11,9 @@ Welcome to MMSelfSup's documentation!
:caption: Get Started
install.md
prepare_data.md
get_started.md
model_zoo.md
.. toctree::
:maxdepth: 1
@ -51,8 +53,8 @@ Welcome to MMSelfSup's documentation!
:maxdepth: 1
:caption: Notes
community/CONTRIBUTING.md
changelog.md
compatibility.md
.. toctree::
:caption: Switch Language
@ -1,157 +1,71 @@
# Prerequisites
In this section we demonstrate how to prepare an environment with PyTorch.
MMSelfSup works on Linux (Windows and macOS are not officially supported). It requires Python 3.6+, CUDA 9.2+ and PyTorch 1.5+.
If you are experienced with PyTorch and have already installed it, just skip this part and jump to the [next section](#installation). Otherwise, you can follow these steps for the preparation.
**Step 0.** Download and install Miniconda from the [official website](https://docs.conda.io/en/latest/miniconda.html).
**Step 1.** Create a conda environment and activate it.
```shell
conda create --name openmmlab python=3.8 -y
conda activate openmmlab
```
**Step 2.** Install PyTorch following [official instructions](https://pytorch.org/get-started/locally/), e.g.
On GPU platforms:
```shell
conda install pytorch torchvision -c pytorch
```
On CPU platforms:
```shell
conda install pytorch torchvision cpuonly -c pytorch
```
# Installation
We recommend that users follow our best practices to install MMSelfSup. However, the whole process is highly customizable. See the [Customized Installation](#customized-installation) section for more information.
## Best Practices
**Step 0.** Install [MMCV](https://github.com/open-mmlab/mmcv) using [MIM](https://github.com/open-mmlab/mim).
```shell
pip install -U openmim
mim install mmcv-full
```
**Step 1.** Install MMSelfSup.
Case a: If you develop and run mmselfsup directly, install it from source:
```shell
git clone https://github.com/open-mmlab/mmselfsup.git
cd mmselfsup
pip install -v -e .
# "-v" means verbose, or more output
# "-e" means installing a project in editable mode,
# thus any local modifications made to the code will take effect without reinstallation.
```
Case b: If you use mmselfsup as a dependency or third-party package, install it with pip:
```shell
pip install mmselfsup
```
## Verify the installation
To verify whether MMSelfSup is installed correctly, we can run the following sample code to initialize a model and run inference on a demo image.
```python
import torch
from mmselfsup.models import build_algorithm
@ -182,7 +96,104 @@ loss = model.forward_train(image, label)
The above code is supposed to run successfully once you finish the installation.
## Customized installation
### Benchmark
The [Best Practices](#best-practices) section covers basic usage; if you need to evaluate your pre-trained model with downstream tasks such as detection or segmentation, please also install [MMDetection](https://github.com/open-mmlab/mmdetection) and [MMSegmentation](https://github.com/open-mmlab/mmsegmentation).
If you do not run the MMDetection and MMSegmentation benchmarks, it is unnecessary to install them.
You can simply install MMDetection and MMSegmentation with the following command:
```shell
pip install mmdet mmsegmentation
```
For more details, you can check the installation page of [MMDetection](https://github.com/open-mmlab/mmdetection/blob/master/docs/en/get_started.md) and [MMSegmentation](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/get_started.md).
### CUDA versions
When installing PyTorch, you need to specify the version of CUDA. If you are not clear on which to choose, follow our recommendations:
- For Ampere-based NVIDIA GPUs, such as GeForce 30 series and NVIDIA A100, CUDA 11 is a must.
- For older NVIDIA GPUs, CUDA 11 is backward compatible, but CUDA 10.2 offers better compatibility and is more lightweight.
Please make sure the GPU driver satisfies the minimum version requirements. See [this table](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cuda-major-component-versions__table-cuda-toolkit-driver-versions) for more information.
```{note}
Installing CUDA runtime libraries is enough if you follow our best practices, because no CUDA code will be compiled locally. However if you hope to compile MMCV from source or develop other CUDA operators, you need to install the complete CUDA toolkit from NVIDIA's [website](https://developer.nvidia.com/cuda-downloads), and its version should match the CUDA version of PyTorch. i.e., the specified version of cudatoolkit in `conda install` command.
```
### Install MMCV without MIM
MMCV contains C++ and CUDA extensions, thus depending on PyTorch in a complex way. MIM solves such dependencies automatically and makes the installation easier. However, it is not a must.
To install MMCV with pip instead of MIM, please follow [MMCV installation guides](https://mmcv.readthedocs.io/en/latest/get_started/installation.html). This requires manually specifying a find-url based on PyTorch version and its CUDA version.
For example, the following command installs mmcv-full built for PyTorch 1.10.x and CUDA 11.3.
```shell
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10/index.html
```
### Another option: Docker Image
We provide a [Dockerfile](/docker/Dockerfile) to build an image.
```shell
# build an image with PyTorch 1.6.0, CUDA 10.1, CUDNN 7.
docker build -f ./docker/Dockerfile --rm -t mmselfsup:torch1.10.0-cuda11.3-cudnn8 .
```
**Important:** Make sure you've installed the [nvidia-container-toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker).
Run the following cmd:
```shell
docker run --gpus all --shm-size=8g -it -v {DATA_DIR}:/workspace/mmselfsup/data mmselfsup:torch1.10.0-cuda11.3-cudnn8 /bin/bash
```
`{DATA_DIR}` is your local folder containing all these datasets.
### Install on Google Colab
[Google Colab](https://research.google.com/) usually has PyTorch installed,
thus we only need to install MMCV and MMSelfSup with the following commands.
**Step 0.** Install [MMCV](https://github.com/open-mmlab/mmcv) using [MIM](https://github.com/open-mmlab/mim).
```shell
!pip3 install openmim
!mim install mmcv-full
```
**Step 1.** Install MMSelfSup from the source.
```shell
!git clone https://github.com/open-mmlab/mmselfsup.git
%cd mmselfsup
!pip install -e .
```
**Step 2.** Verification.
```python
import mmselfsup
print(mmselfsup.__version__)
# Example output: 0.9.0
```
```{note}
Within Jupyter, the exclamation mark `!` is used to call external executables and `%cd` is a [magic command](https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-cd) to change the current working directory of Python.
```
## Troubleshooting
If you have some issues during the installation, please first view the [FAQ](faq.md) page.
You may [open an issue](https://github.com/open-mmlab/mmselfsup/issues/new/choose) on GitHub if no solution is found.
# Using multiple MMSelfSup versions
If there are more than one mmselfsup on your machine, and you want to use them alternatively, the recommended way is to create multiple conda environments and use different environments for different versions.
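For example (a sketch; the environment names are arbitrary):

```shell
# one conda environment per MMSelfSup version
conda create -n mmselfsup-0.9 python=3.8 -y
conda create -n mmselfsup-0.8 python=3.8 -y
# switch versions by activating the corresponding environment
conda activate mmselfsup-0.9
```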
@ -4,27 +4,28 @@ All models and part of benchmark results are recorded below.
## Pre-trained models
| Algorithm | Config | Download |
| ------------------------------------------------------------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [Relative Location](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/README.md) | [relative-loc_resnet50_8xb64-steplr-70e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k_20220225-84784688.pth) \| [log](https://download.openmmlab.com/mmselfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k_20220211_124808.log.json) |
| [Rotation Prediction](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/README.md) | [rotation-pred_resnet50_8xb16-steplr-70e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k_20220225-5b9f06a0.pth) \| [log](https://download.openmmlab.com/mmselfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k_20220215_185303.log.json) |
| [DeepCluster](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/deepcluster/README.md) | [deepcluster-sobel_resnet50_8xb64-steplr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/deepcluster/deepcluster-sobel_resnet50_8xb64-steplr-200e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/deepcluster/deepcluster-sobel_resnet50_8xb64-steplr-200e_in1k-bb8681e2.pth) |
| [NPID](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/npid/README.md) | [npid_resnet50_8xb32-steplr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k_20220225-5fbbda2a.pth) \| [log](https://download.openmmlab.com/mmselfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k_20220215_185513.log.json) |
| [ODC](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/odc/README.md) | [odc_resnet50_8xb64-steplr-440e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/odc/odc_resnet50_8xb64-steplr-440e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/odc/odc_resnet50_8xb64-steplr-440e_in1k_20220225-a755d9c0.pth) \| [log](https://download.openmmlab.com/mmselfsup/odc/odc_resnet50_8xb64-steplr-440e_in1k_20220215_235245.log.json) |
| [SimCLR](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/README.md) | [simclr_resnet50_8xb32-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k_20220428-46ef6bb9.pth) \| [log](https://download.openmmlab.com/mmselfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k_20220411_182427.log.json) |
| | [simclr_resnet50_16xb256-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_16xb256-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/simclr/simclr_resnet50_16xb256-coslr-200e_in1k_20220428-8c24b063.pth) \| [log](https://download.openmmlab.com/mmselfsup/simclr/simclr_resnet50_16xb256-coslr-200e_in1k_20220423_205520.log.json) |
| [MoCo v2](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov2/README.md) | [mocov2_resnet50_8xb32-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/moco/mocov2_resnet50_8xb32-coslr-200e_in1k_20220225-89e03af4.pth) \| [log](https://download.openmmlab.com/mmselfsup/moco/mocov2_resnet50_8xb32-coslr-200e_in1k_20220210_110905.log.json) |
| [BYOL](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/README.md) | [byol_resnet50_8xb32-accum16-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k_20220225-5c8b2c2e.pth) \| [log](https://download.openmmlab.com/mmselfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k_20220214_115709.log.json) |
| | [byol_resnet50_16xb256-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_16xb256-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/byol/byol_resnet50_16xb256-coslr-200e_in1k_20220527-b6f8eedd.pth) \| [log](https://download.openmmlab.com/mmselfsup/byol/byol_resnet50_16xb256-coslr-200e_in1k_20220509_213052.log.json) |
| | [byol_resnet50_8xb32-accum16-coslr-300e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-300e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/byol/byol_resnet50_8xb32-accum16-coslr-300e_in1k_20220225-a0daa54a.pth) \| [log](https://download.openmmlab.com/mmselfsup/byol/byol_resnet50_8xb32-accum16-coslr-300e_in1k_20220210_095852.log.json) |
| [SwAV](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/swav/README.md) | [swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | [model](https://download.openmmlab.com/mmselfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96_20220225-0497dd5d.pth) \| [log](https://download.openmmlab.com/mmselfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96_20220211_061131.log.json) |
| [DenseCL](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/densecl/README.md) | [densecl_resnet50_8xb32-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k_20220225-8c7808fe.pth) \| [log](https://download.openmmlab.com/mmselfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k_20220215_041207.log.json) |
| [SimSiam](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/README.md) | [simsiam_resnet50_8xb32-coslr-100e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k_20220225-68a88ad8.pth) \| [log](https://download.openmmlab.com/mmselfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k_20220210_195405.log.json) |
| | [simsiam_resnet50_8xb32-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k_20220225-2f488143.pth) \| [log](https://download.openmmlab.com/mmselfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k_20220210_195402.log.json) |
| [BarlowTwins](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/barlowtwins/README.md) | [barlowtwins_resnet50_8xb256-coslr-300e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/barlowtwins/barlowtwins_resnet50_8xb256-coslr-300e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/barlowtwins/barlowtwins_resnet50_8xb256-coslr-300e_in1k_20220419-5ae15f89.pth) \| [log](https://download.openmmlab.com/mmselfsup/barlowtwins/barlowtwins_resnet50_8xb256-coslr-300e_in1k_20220413_111555.log.json) |
| [MoCo v3](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov3/README.md) | [mocov3_vit-small-p16_32xb128-fp16-coslr-300e_in1k-224](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov3/mocov3_vit-small-p16_32xb128-fp16-coslr-300e_in1k-224.py) | [model](https://download.openmmlab.com/mmselfsup/moco/mocov3_vit-small-p16_32xb128-fp16-coslr-300e_in1k-224_20220225-e31238dd.pth) \| [log](https://download.openmmlab.com/mmselfsup/moco/mocov3_vit-small-p16_32xb128-fp16-coslr-300e_in1k-224_20220222_160222.log.json) |
| [MAE](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mae/README.md) | [mae_vit-base-p16_8xb512-coslr-400e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mae/mae_vit-base-p16_8xb512-coslr-400e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/mae/mae_vit-base-p16_8xb512-coslr-400e_in1k-224_20220223-85be947b.pth) \| [log](https://download.openmmlab.com/mmselfsup/mae/mae_vit-base-p16_8xb512-coslr-300e_in1k-224_20220210_140925.log.json) |
| [SimMIM](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simmim/README.md) | [simmim_swin-base_16xb128-coslr-100e_in1k-192](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192.py) | [model](https://download.openmmlab.com/mmselfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192_20220316-1d090125.pth) \| [log](https://download.openmmlab.com/mmselfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192_20220316-1d090125.log.json) |
| [CAE](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simmim/README.md) | [cae_vit-base-p16_8xb256-fp16-coslr-300e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/cae/cae_vit-base-p16_8xb256-fp16-coslr-300e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/cae/cae_vit-base-p16_16xb256-coslr-300e_in1k-224_20220427-4c786349.pth) \| [log](https://download.openmmlab.com/mmselfsup/cae/cae_vit-base-p16_16xb256-coslr-300e_in1k-224_20220427-4c786349.log.json) |
Remarks:
@ -50,8 +51,9 @@ If not specified, we use linear evaluation setting from [MoCo](http://openaccess
| SimCLR | [simclr_resnet50_8xb32-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k.py) | SimSiam paper setting | 62.56 |
| | [simclr_resnet50_16xb256-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_16xb256-coslr-200e_in1k.py) | SimSiam paper setting | 66.66 |
| MoCo v2 | [mocov2_resnet50_8xb32-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py) | | 67.58 |
| BYOL | [byol_resnet50_8xb32-accum16-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | SimSiam paper setting | 71.72 |
| | [byol_resnet50_16xb256-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_16xb256-coslr-200e_in1k.py) | SimSiam paper setting | 71.88 |
| | [byol_resnet50_8xb32-accum16-coslr-300e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-300e_in1k.py) | SimSiam paper setting | 72.93 |
| SwAV | [swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | SwAV paper setting | 70.47 |
| DenseCL | [densecl_resnet50_8xb32-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k.py) | | 63.62 |
| SimSiam | [simsiam_resnet50_8xb32-coslr-100e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k.py) | SimSiam paper setting | 68.28 |
@ -60,6 +62,7 @@ If not specified, we use linear evaluation setting from [MoCo](http://openaccess
| MoCo v3 | [mocov3_vit-small-p16_32xb128-fp16-coslr-300e_in1k-224](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov3/mocov3_vit-small-p16_32xb128-fp16-coslr-300e_in1k-224.py) | MoCo v3 paper setting | 73.19 |
### ImageNet Fine-tuning
| Algorithm | Config | Remarks | Top-1 (%) |
| --------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- | --------- |
| MAE | [mae_vit-base-p16_8xb512-coslr-400e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mae/mae_vit-base-p16_8xb512-coslr-400e_in1k.py) | | 83.1 |
@ -43,8 +43,8 @@ For ImageNet, it has multiple versions, but the most commonly used one is [ILSVR
1. Register an account and login to the [download page](http://www.image-net.org/download-images)
2. Find download links for ILSVRC2012 and download the following two files
- ILSVRC2012_img_train.tar (~138GB)
- ILSVRC2012_img_val.tar (~6.3GB)
3. Untar the downloaded files
4. Download meta data using this [script](https://github.com/BVLC/caffe/blob/master/data/ilsvrc12/get_ilsvrc_aux.sh)
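Step 3 could look like the following (a minimal sketch; the destination paths are examples, and note that the train archive contains one tar per class, which must be extracted in a second pass):

```shell
mkdir -p data/imagenet/train data/imagenet/val
tar -xf ILSVRC2012_img_train.tar -C data/imagenet/train
tar -xf ILSVRC2012_img_val.tar -C data/imagenet/val
```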
@ -36,43 +36,53 @@ We follow the below convention to name config files. Contributors are advised to
- `data info`: Data information, dataset name, input size and so on, such as imagenet, cifar, etc.;
### Algorithm information
```
{algorithm}-{misc}
```
`Algorithm` means the abbreviation of the algorithm from the paper and its version. E.g.:
- `relative-loc`: different words are concatenated with dashes `'-'`
- `simclr`
- `mocov2`
`misc` offers some other algorithm-related information. E.g.:
- `npid-ensure-neg`
- `deepcluster-sobel`
### Module information
```
{backbone setting}-{neck setting}-{head_setting}
```
The module information mainly includes the backbone information. E.g.:
- `resnet50`
- `vit`: will be used in MoCo v3
Some special settings also need to be mentioned in the config name. E.g.:
- `resnet50-nofrz`: in some downstream tasks, the backbone does not freeze its stages during training
### Training information
Training related settings, including batch size, lr schedule, data augmentation, etc.
- Batch size: the format is `{gpu x batch_per_gpu}`, like `8xb32`;
- Training recipe: the methods will be arranged in the order `{pipeline aug}-{train aug}-{loss trick}-{scheduler}-{epochs}`.
E.g.:
- `8xb32-mcrop-2-6-coslr-200e`: `mcrop` is the multi-crop method proposed in SwAV and is part of the pipeline. 2 and 6 mean that the 2 pipelines output 2 and 6 crops respectively; the crop sizes are recorded in the data information;
- `8xb32-accum16-coslr-200e`: `accum16` means the gradients are accumulated for 16 iterations before the weights are updated, i.e. the effective batch size is 8 x 32 x 16 = 4096.
### Data information
Data information contains the dataset, input size, etc. E.g.:
- `in1k`: `ImageNet1k` dataset; the input image size defaults to 224x224
- `in1k-384px`: indicates that the input image size is 384x384
- `cifar10`
@ -80,17 +90,19 @@ Data information contains the dataset, input size, etc. E.g:
- `places205`
### Config File Name Example
```
swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py
```
- `swav`: Algorithm information
- `resnet50`: Module information
- `8xb32-mcrop-2-6-coslr-200e`: Training information
  - `8xb32`: Use 8 GPUs in total, and the batch size is 32 per GPU
  - `mcrop-2-6`: Use the multi-crop data augmentation method
  - `coslr`: Use the cosine learning rate scheduler
  - `200e`: Train the model for 200 epochs
- `in1k-224-96`: Data information; trained on the ImageNet1k dataset with input sizes of 224x224 and 96x96
### Checkpoint Naming Convention
@ -114,6 +126,7 @@ You can easily build your own training config file by inherit some base config f
For easy understanding, we use MoCo v2 as an example and comment on the meaning of each line. For more details, please refer to the API documentation.
The config file `configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py` is displayed below.
```python
_base_ = [
'../_base_/models/mocov2.py', # model
@ -134,6 +147,7 @@ The 'type' in the configuration file is not a constructed parameter, but a class
```
`../_base_/models/mocov2.py` is the base model config for MoCo v2.
```python
model = dict(
type='MoCo', # Algorithm name
@ -158,6 +172,7 @@ model = dict(
```
`../_base_/datasets/imagenet_mocov2.py` is the base dataset config for MoCo v2.
```python
# dataset settings
data_source = 'ImageNet' # data source name
@ -210,6 +225,7 @@ data = dict(
```
`../_base_/schedules/sgd_coslr-200e_in1k.py` is the base schedule config for MoCo v2.
```python
# optimizer
optimizer = dict(
@ -232,7 +248,9 @@ runner = dict(
max_epochs=200) # Runner that runs the workflow in total max_epochs. For IterBasedRunner use `max_iters`
```
`../_base_/default_runtime.py` is the default runtime settings.
```python
# checkpoint saving
checkpoint_config = dict(interval=10) # The save interval is 10
@ -300,7 +318,6 @@ data = dict(
))
```
### Ignore some fields in the base configs
Sometimes, you need to set `_delete_=True` to ignore some of the fields in the base configuration file. You can refer to [mmcv](https://mmcv.readthedocs.io/en/latest/understand_mmcv/config.html#inherit-from-base-config-with-ignored-fields) for more instructions.
@ -319,6 +336,7 @@ model = dict(
out_channels=128,
with_avg_pool=True))
```
### Use some fields in the base configs
Sometimes, you may refer to some fields in the `_base_` config, so as to avoid duplication of definitions. You can refer to [mmcv](https://mmcv.readthedocs.io/en/latest/understand_mmcv/config.html#reference-variables-from-base) for some more instructions.
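As an illustration, MMCV substitutes `{{_base_.xxx}}` with the corresponding field of the `_base_` config before parsing. A minimal sketch, assuming the base config defines a `data_source` field (the file and field names here are hypothetical):

```python
_base_ = './base_config.py'

# reuse the `data_source` defined in base_config.py instead of redefining it
data = dict(train=dict(data_source={{_base_.data_source}}))
```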
@ -376,17 +394,16 @@ When users use the script "tools/train.py" or "tools/test.py" to submit tasks or
- Update values of list/tuples.
If the value to be updated is a list or a tuple: for example, the config file normally sets `workflow=[('train', 1)]`. If you want to
change this key, you may specify `--cfg-options workflow="[(train,1),(val,1)]"`. Note that the quotation mark " is necessary to
support list/tuple data types, and that **NO** white space is allowed inside the quotation marks in the specified value.
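Put together, a full command might look like this (the config path is a placeholder):

```shell
python tools/train.py configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py \
    --cfg-options workflow="[(train,1),(val,1)]"
```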
## Import user-defined modules
```{note}
This part may only be needed when you use another MM-codebase, like mmcls, as a third-party library to build your own project; beginners can skip it.
```
You may use another MM-codebase to complete your project and create new classes of datasets, models, data augmentations, etc. in the project. In order to streamline the code, you can use the MM-codebase as a third-party library; you just need to keep your own extra code and import your own custom modules in the configuration files. For examples, you may refer to the [OpenMMLab Algorithm Competition Project](https://github.com/zhangrui-wolf/openmmlab-competition-2021).
Add the following code to your own configuration files:
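The code in question is the standard MMCV `custom_imports` mechanism; a minimal sketch, with example module paths that you should replace with your own:

```python
custom_imports = dict(
    # modules that register your own datasets, models, hooks, etc.
    imports=['your_package.datasets', 'your_package.models'],
    allow_failed_imports=False)
```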
@ -3,10 +3,10 @@
In this tutorial, we introduce the basic steps to create your customized dataset:
- [Tutorial 1: Adding New Dataset](#tutorial-1-adding-new-dataset)
- [An example of customized dataset](#an-example-of-customized-dataset)
- [Creating the `DataSource`](#creating-the-datasource)
- [Creating the `Dataset`](#creating-the-dataset)
- [Modify config file](#modify-config-file)
If your algorithm does not need any customized dataset, you can use these off-the-shelf datasets under [datasets](../../mmselfsup/datasets). But to use these existing datasets, you have to convert your dataset to an existing dataset format.
@ -79,7 +79,7 @@ Some common hooks are not registered through `custom_hooks`, they are
| `EvalHook` | LOW (70) |
| `LoggerHook(s)` | VERY_LOW (90) |
`OptimizerHook`, `MomentumUpdaterHook` and `LrUpdaterHook` have been introduced in [schedule strategy](./4_schedule.md). `IterTimerHook` is used to record elapsed time and does not support modification.
Here we reveal how to customize `CheckpointHook`, `LoggerHooks`, and `EvalHook`.
@ -143,7 +143,6 @@ Some hooks have been already implemented in MMCV and MMClassification, they are:
- [ProfilerHook](https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/profiler.py)
- ......
If the hook is already implemented in MMCV, you can directly modify the config to use the hook as below:
```python
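custom_hooks = [
    # an illustrative sketch: ProfilerHook is implemented in MMCV; the
    # arguments shown are examples and the `priority` key is optional
    dict(type='ProfilerHook', by_epoch=True, profile_iters=1, priority='NORMAL')
]
```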
@ -12,14 +12,15 @@ In MMSelfSup, we provide many benchmarks, thus the models can be evaluated on di
- [Segmentation](#segmentation)
First, you are supposed to extract your backbone weights by `tools/model_converters/extract_backbone_weights.py`
```shell
python ./tools/model_converters/extract_backbone_weights.py {CHECKPOINT} {MODEL_FILE}
```
Arguments:
- `CHECKPOINT`: the checkpoint file of a selfsup method named as epoch\_\*.pth.
- `MODEL_FILE`: the output backbone weights file. If not mentioned, the `PRETRAIN` below uses this extracted model file.
## Classification
@ -49,9 +50,10 @@ bash tools/benchmarks/classification/svm_voc07/dist_test_svm_epoch.sh ${SELFSUP_
bash tools/benchmarks/classification/svm_voc07/slurm_test_svm_epoch.sh ${PARTITION} ${JOB_NAME} ${SELFSUP_CONFIG} ${EPOCH} ${FEATURE_LIST}
```
**To test with ckpt, the code uses the epoch\_\*.pth file, there is no need to extract weights.**
Remarks:
- `${SELFSUP_CONFIG}` is the config file of the self-supervised experiment.
- `${FEATURE_LIST}` is a string to specify features from layer1 to layer5 to evaluate; e.g., if you want to evaluate layer5 only, then `FEATURE_LIST` is "feat5", if you want to evaluate all features, then `FEATURE_LIST` is "feat1 feat2 feat3 feat4 feat5" (separated by space). If left empty, the default `FEATURE_LIST` is "feat5".
- `PRETRAIN`: the pre-trained model file.
@ -71,6 +73,7 @@ bash tools/benchmarks/classification/slurm_train_linear.sh ${PARTITION} ${JOB_NA
```
Remarks:
- The default GPU number is 8. When changing GPUS, please also change `samples_per_gpu` in the config file accordingly to ensure the total batch size is 256.
- `CONFIG`: Use config files under `configs/benchmarks/classification/`. Specifically, `imagenet` (excluding `imagenet_*percent` folders), `places205` and `inaturalist2018`.
- `PRETRAIN`: the pre-trained model file.
@ -88,6 +91,7 @@ bash tools/benchmarks/classification/slurm_train_semi.sh ${PARTITION} ${JOB_NAME
```
Remarks:
- The default GPU number is 4.
- `CONFIG`: Use config files under `configs/benchmarks/classification/imagenet/`, named `imagenet_*percent` folders.
- `PRETRAIN`: the pre-trained model file.
@ -114,9 +118,10 @@ bash tools/benchmarks/classification/knn_imagenet/dist_test_knn_epoch.sh ${SELFS
bash tools/benchmarks/classification/knn_imagenet/slurm_test_knn_epoch.sh ${PARTITION} ${JOB_NAME} ${SELFSUP_CONFIG} ${EPOCH}
```
**To test with ckpt, the code uses the epoch\_\*.pth file, there is no need to extract weights.**
Remarks:
- `${SELFSUP_CONFIG}` is the config file of the self-supervised experiment.
- `PRETRAIN`: the pre-trained model file.
- if you want to change GPU numbers, you could add `GPUS_PER_NODE=4 GPUS=4` at the beginning of the command.
@ -145,6 +150,7 @@ bash tools/benchmarks/mmdetection/mim_slurm_train.sh ${PARTITION} ${CONFIG} ${PR
```
Remarks:
- `CONFIG`: Use config files under `configs/benchmarks/mmdetection/` or write your own config files
- `PRETRAIN`: the pre-trained model file.
@ -181,5 +187,6 @@ bash tools/benchmarks/mmsegmentation/mim_slurm_train.sh ${PARTITION} ${CONFIG} $
```
Remarks:
- `CONFIG`: Use config files under `configs/benchmarks/mmsegmentation/` or write your own config files
- `PRETRAIN`: the pre-trained model file.
@ -1,19 +1,99 @@
# BYOL
> [Bootstrap your own latent: A new approach to self-supervised Learning](https://arxiv.org/abs/2006.07733)
<!-- [ALGORITHM] -->
## Abstract
**B**ootstrap **Y**our **O**wn **L**atent (BYOL) is a new approach to self-supervised image representation learning. BYOL relies on two neural networks, referred to as online and target networks, that interact and learn from each other. From an augmented view of an image, we train the online network to predict the target network representation of the same image under a different augmented view. At the same time, we update the target network with a slow-moving average of the online network.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149720208-5ffbee78-1437-44c7-9ddb-b8caab60d2c3.png" width="800" />
</div>
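The core of BYOL is the slow-moving average update of the target network. A minimal sketch of that update, assuming `online_net` and `target_net` share the same architecture (0.996 is the base momentum used in the paper):

```python
import torch

@torch.no_grad()
def momentum_update(online_net, target_net, momentum=0.996):
    # EMA update: target <- m * target + (1 - m) * online
    for p_o, p_t in zip(online_net.parameters(), target_net.parameters()):
        p_t.data.mul_(momentum).add_(p_o.data, alpha=1. - momentum)
```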
## Results and Models
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates the layer whose feature map yields the best results. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM.
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [resnet50_8xb32-accum16-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | feature5 | 86.31 | 45.37 | 56.83 | 68.47 | 74.12 | 78.30 | 81.53 | 83.56 | 84.73 |
#### ImageNet Linear Evaluation
**Feature1 - Feature5** do not use GlobalAveragePooling; the feature map is pooled to specific dimensions and then followed by a linear layer for classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of the config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb512-coslr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb512-coslr-90e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb32-accum16-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | 15.16 | 35.26 | 47.77 | 63.10 | 71.21 | 71.72 |
| [resnet50_16xb256-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_16xb256-coslr-200e_in1k.py) | 15.41 | 35.15 | 47.77 | 62.59 | 71.85 | 71.88 |
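The multi-head evaluation can be pictured as follows: a frozen backbone exposes one feature map per stage, and each map gets its own pooled linear classifier. A minimal sketch (the exact pooled dimensions are set in the linked config; `in_channels` and `pool_size` here are illustrative):

```python
import torch.nn as nn

class PooledLinearHead(nn.Module):
    """Pool a frozen feature map to a fixed size, then classify linearly."""

    def __init__(self, in_channels, pool_size, num_classes=1000):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(pool_size)
        self.fc = nn.Linear(in_channels * pool_size * pool_size, num_classes)

    def forward(self, feat):  # feat: (N, C, H, W) from one ResNet stage
        return self.fc(self.pool(feat).flatten(1))
```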
#### Places205 Linear Evaluation
**Feature1 - Feature5** are evaluated without GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -------- | -------- | -------- | -------- |
| [resnet50_8xb32-accum16-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | 21.25 | 36.55 | 43.66 | 50.74 | 53.82 |
| [resnet50_8xb32-accum16-coslr-300e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-300e_in1k.py) | 21.18 | 36.68 | 43.42 | 51.04 | 54.06 |
#### ImageNet Nearest-Neighbor Classification
The results are obtained from the features after GlobalAveragePooling. Here, k=10 to 200 indicates different numbers of nearest neighbors.
| Self-Supervised Config | k=10 | k=20 | k=100 | k=200 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---- | ---- | ----- | ----- |
| [resnet50_8xb32-accum16-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | 63.9 | 64.2 | 62.9 | 61.9 |
| [resnet50_8xb32-accum16-coslr-300e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-300e_in1k.py) | 66.1 | 66.3 | 65.2 | 64.4 |
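A simplified sketch of such a nearest-neighbor evaluation on GlobalAveragePooling features (benchmarks of this kind typically follow the weighted k-NN of Wu et al.; using raw cosine similarities as vote weights here is an assumption for brevity):

```python
import torch
import torch.nn.functional as F

def knn_classify(train_feats, train_labels, test_feats, k=20, num_classes=1000):
    """Weighted k-NN over L2-normalized features; returns predicted labels."""
    train_feats = F.normalize(train_feats, dim=1)
    test_feats = F.normalize(test_feats, dim=1)
    sim = test_feats @ train_feats.t()    # cosine similarity (N_test, N_train)
    weights, idx = sim.topk(k, dim=1)     # k nearest training samples
    votes = torch.zeros(test_feats.size(0), num_classes)
    votes.scatter_add_(1, train_labels[idx], weights)  # similarity-weighted votes
    return votes.argmax(dim=1)
```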
### Detection
The detection benchmarks include 2 downstream task datasets: **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k_voc0712.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712.py) for details of config.
| Self-Supervised Config | AP50 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----- |
| [resnet50_8xb32-accum16-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | 80.35 |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x_coco.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py) for details of config.
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb32-accum16-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | 40.9 | 61.0 | 44.6 | 36.8 | 58.1 | 39.5 |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets: **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [fcn_r50-d8_512x512_20k_voc12aug.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) for details of config.
| Self-Supervised Config | mIOU |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----- |
| [resnet50_8xb32-accum16-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | 67.16 |
## Citation
```bibtex
@inproceedings{grill2020bootstrap,
@ -23,103 +103,3 @@
year={2020}
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet-1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates from which layer's feature map the best results are obtained. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (feature1 is the stem layer, feature2-feature5 are the four stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of the Low-shot SVM, i.e., the number of labeled samples per class.
| Model | Config setting | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | ----------------------------------------------------------------------------------- | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [model]() | [resnet50_8xb32-accum16-coslr-200e](byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | feature3 | 86.31 | 45.37 | 56.83 | 68.47 | 74.12 | 78.30 | 81.53 | 83.56 | 84.73 |
### Classification
The classification benchmarks include 3 downstream task datasets: **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 accuracy (%).
#### ImageNet Linear Evaluation
**Feature1 - Feature5** are evaluated without GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-90e.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [{file name}]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ----------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-accum16-coslr-200e](byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | | | | | | 67.68 |
#### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [{file name}]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ----------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-accum16-coslr-200e](byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | | | | | | |
#### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-28e_places205.py) and [{file name}]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ----------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-accum16-coslr-200e](byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | | | | | | |
#### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`, and choose the best-performing setting for each method. The parameter settings are indicated in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier like `head1`, `head10`, `head100` (a sketch of this multiplier follows the table below).
- Please use `--deterministic` in this benchmark.
Please refer to the directory `configs/benchmarks/classification/imagenet/imagenet_1percent/` for the 1% data settings and `configs/benchmarks/classification/imagenet/imagenet_10percent/` for the 10% data settings.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | ----------------------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb32-accum16-coslr-200e](byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | | | |
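A hedged PyTorch sketch of the head learning-rate multiplier described above (this is not the repo's config mechanism; `backbone` and `head` are stand-in modules):

```python
import torch
import torch.nn as nn

# Stand-ins for the real backbone and linear classification head.
backbone = nn.Sequential(nn.Conv2d(3, 64, 3), nn.AdaptiveAvgPool2d(1), nn.Flatten())
head = nn.Linear(64, 1000)

base_lr = 0.01  # '1e-2' in the file name
optimizer = torch.optim.SGD(
    [{'params': backbone.parameters()},
     {'params': head.parameters(), 'lr': base_lr * 100}],  # 'head100'
    lr=base_lr, momentum=0.9, weight_decay=1e-4)
```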
### Detection
The detection benchmarks include 2 downstream task datasets: **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | ----------------------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb32-accum16-coslr-200e](byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | ----------------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb32-accum16-coslr-200e](byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets: **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [{file name}]() for details of config.
| Model | Config | mIOU |
| --------- | ----------------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-accum16-coslr-200e](byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | |
#### Cityscapes
Please refer to [{file name}]() for details of config.
| Model | Config | mIOU |
| --------- | ----------------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-accum16-coslr-200e](byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | |
@ -12,22 +12,19 @@ We present a novel masked image modeling (MIM) approach, context autoencoder (CA
<img src="https://user-images.githubusercontent.com/30762564/165459947-6c6ef13c-0593-4765-b44e-6da0a079802a.png" width="40%"/>
</div>
## Prerequisite
Create a new folder `cae_ckpt` under the root directory and download the [weights](https://download.openmmlab.com/mmselfsup/cae/dalle_encoder.pth) for the `dalle` encoder to that folder.
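For convenience, the same step can be scripted; a minimal example using plain `urllib` (`wget` or `curl` would work equally well):

```python
import os
import urllib.request

os.makedirs('cae_ckpt', exist_ok=True)
urllib.request.urlretrieve(
    'https://download.openmmlab.com/mmselfsup/cae/dalle_encoder.pth',
    'cae_ckpt/dalle_encoder.pth')
```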
## Models and Benchmarks
Here, we report the results of the model, which is pre-trained on ImageNet-1k for 300 epochs; the details are below:
| Backbone | Pre-train epoch | Fine-tuning Top-1 | Pre-train Config | Fine-tuning Config | Download |
| :------: | :-------------: | :---------------: | :-------------------------------------------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| ViT-B/16 | 300 | 83.2 | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/cae/cae_vit-base-p16_8xb256-fp16-coslr-300e_in1k.py) | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/vit-base-p16_ft-8xb128-coslr-100e-rpe_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/cae/cae_vit-base-p16_16xb256-coslr-300e_in1k-224_20220427-4c786349.pth) \| [log](https://download.openmmlab.com/mmselfsup/cae/cae_vit-base-p16_16xb256-coslr-300e_in1k-224_20220427-4c786349.log.json) |
## Citation
@ -1,19 +1,56 @@
# DeepCluster
> [Deep Clustering for Unsupervised Learning of Visual Features](https://arxiv.org/abs/1807.05520)
<!-- [ABSTRACT] -->
<!-- [ALGORITHM] -->
## Abstract
Clustering is a class of unsupervised learning methods that has been extensively applied and studied in computer vision. Little work has been done to adapt it to the end-to-end training of visual features on large scale datasets. In this work, we present DeepCluster, a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features. DeepCluster iteratively groups the features with a standard clustering algorithm, k-means, and uses the subsequent assignments as supervision to update the weights of the network.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149720586-5bfd213e-0638-47fc-b48a-a16689190e17.png" width="700" />
</div>
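One clustering step of the method can be sketched as follows, assuming features for the whole dataset have been extracted into a NumPy array (the paper additionally applies PCA whitening and reassigns empty clusters; k=10000 follows the paper's ImageNet setting):

```python
import numpy as np
from sklearn.cluster import KMeans

def deepcluster_step(features, num_clusters=10000):
    """Cluster frozen features; the assignments become pseudo-labels."""
    features = features / np.linalg.norm(features, axis=1, keepdims=True)
    kmeans = KMeans(n_clusters=num_clusters, n_init=1).fit(features)
    return kmeans.labels_  # used as classification targets for the next epoch
```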
## Results and Models
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets: **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 accuracy (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates from which layer's feature map the best results are obtained. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (feature1 is the stem layer, feature2-feature5 are the four stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of the Low-shot SVM, i.e., the number of labeled samples per class.
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [sobel_resnet50_8xb64-steplr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/deepcluster/deepcluster-sobel_resnet50_8xb64-steplr-200e_in1k.py) | feature5 | 74.26 | 29.37 | 37.99 | 45.85 | 55.57 | 62.48 | 66.15 | 70.00 | 71.37 |
#### ImageNet Linear Evaluation
**Feature1 - Feature5** are evaluated without GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb32-steplr-100e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb32-steplr-100e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -------- | -------- | -------- | -------- | ------- |
| [sobel_resnet50_8xb64-steplr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/deepcluster/deepcluster-sobel_resnet50_8xb64-steplr-200e_in1k.py) | 12.78 | 30.81 | 43.88 | 57.71 | 51.68 | 46.92 |
#### Places205 Linear Evaluation
**Feature1 - Feature5** are evaluated without GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -------- | -------- | -------- | -------- |
| [sobel_resnet50_8xb64-steplr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/deepcluster/deepcluster-sobel_resnet50_8xb64-steplr-200e_in1k.py) | 18.80 | 33.93 | 41.44 | 47.22 | 42.61 |
## Citation
```bibtex
@inproceedings{caron2018deep,
@ -23,99 +60,3 @@ Clustering is a class of unsupervised learning methods that has been extensively
year={2018}
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet-1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates from which layer's feature map the best results are obtained. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (feature1 is the stem layer, feature2-feature5 are the four stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of the Low-shot SVM, i.e., the number of labeled samples per class.
| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | ---------------------------------------------------------------------------- | ---------- | --- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | | | | | | | | | |
### ImageNet Linear Evaluation
**Feature1 - Feature5** are evaluated without GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-90e.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | | | | | |
### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | | | | | |
### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | | | | | |
#### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`, and choose the best-performing setting for each method. The parameter settings are indicated in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier like `head1`, `head10`, `head100`.
- Please use `--deterministic` in this benchmark.
Please refer to the directory `configs/benchmarks/classification/imagenet/imagenet_1percent/` for the 1% data settings and `configs/benchmarks/classification/imagenet/imagenet_10percent/` for the 10% data settings.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | ---------------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | | |
### Detection
The detection benchmarks include 2 downstream task datasets: **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | ---------------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | ---------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets: **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | |
#### Cityscapes
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb64-steplr-200e](deepcluster_resnet50_8xb64-steplr-200e_in1k.py) | |
@ -1,19 +1,96 @@
# DenseCL
> [Dense Contrastive Learning for Self-Supervised Visual Pre-Training](https://arxiv.org/abs/2011.09157)
<!-- [ABSTRACT] -->
<!-- [ALGORITHM] -->
## Abstract
To date, most existing self-supervised learning methods are designed and optimized for image classification. These pre-trained models can be sub-optimal for dense prediction tasks due to the discrepancy between image-level prediction and pixel-level prediction. To fill this gap, we aim to design an effective, dense self-supervised learning method that directly works at the level of pixels (or local features) by taking into account the correspondence between local features. We present dense contrastive learning (DenseCL), which implements self-supervised learning by optimizing a pairwise contrastive (dis)similarity loss at the pixel level between two views of input images.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149721111-bab03a6d-a30d-418e-b338-43c3689cfc65.png" width="900" />
</div>
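The pixel-level objective can be sketched as a per-pixel InfoNCE loss. This is a deliberately simplified version: here the negatives are the other pixels of the same key view, whereas DenseCL proper computes the correspondence from backbone features and draws negatives from a memory queue:

```python
import torch
import torch.nn.functional as F

def dense_contrastive_loss(q_grid, k_grid, temperature=0.2):
    """Simplified pixel-level InfoNCE between two views.

    q_grid, k_grid: (N, C, S) flattened per-pixel embeddings of the two views.
    Each query pixel takes its most similar key pixel as the positive; the
    remaining key pixels act as negatives.
    """
    q = F.normalize(q_grid, dim=1)
    k = F.normalize(k_grid, dim=1)
    sim = torch.einsum('ncs,nct->nst', q, k)  # pixel-to-pixel similarities
    pos_idx = sim.argmax(dim=2)               # matched (positive) key pixel
    return F.cross_entropy(
        (sim / temperature).flatten(0, 1), pos_idx.flatten())
```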
## Results and Models
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets: **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 accuracy (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates from which layer's feature map the best results are obtained. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (feature1 is the stem layer, feature2-feature5 are the four stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of the Low-shot SVM, i.e., the number of labeled samples per class.
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | ---- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k.py) | feature5 | 82.5 | 42.68 | 50.64 | 61.74 | 68.17 | 72.99 | 76.07 | 79.19 | 80.55 |
#### ImageNet Linear Evaluation
**Feature1 - Feature5** are evaluated without GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb32-steplr-100e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb32-steplr-100e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k.py) | 15.86 | 35.47 | 49.46 | 64.06 | 62.95 | 63.34 |
#### Places205 Linear Evaluation
**Feature1 - Feature5** are evaluated without GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k.py) | 21.32 | 36.20 | 43.97 | 51.04 | 50.45 |
#### ImageNet Nearest-Neighbor Classification
The results are obtained from the features after GlobalAveragePooling. Here, k=10 to 200 indicates different numbers of nearest neighbors.
| Self-Supervised Config | k=10 | k=20 | k=100 | k=200 |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | ---- | ---- | ----- | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k.py) | 48.2 | 48.5 | 46.8 | 45.6 |
### Detection
The detection benchmarks include 2 downstream task datasets: **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k_voc0712.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712.py) for details of config.
| Self-Supervised Config | AP50 |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k.py) | 82.14 |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x_coco.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py) for details of config.
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets: **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [fcn_r50-d8_512x512_20k_voc12aug.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) for details of config.
| Self-Supervised Config | mIOU |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k.py) | 69.47 |
## Citation
```bibtex
@inproceedings{wang2021dense,
@ -23,99 +100,3 @@ To date, most existing self-supervised learning methods are designed and optimiz
year={2021}
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet-1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates from which layer's feature map the best results are obtained. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (feature1 is the stem layer, feature2-feature5 are the four stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of the Low-shot SVM, i.e., the number of labeled samples per class.
| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | ---------------------------------------------------------------------- | ---------- | --- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [resnet50_8xb32-coslr-200e](densecl_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | | | | | |
### ImageNet Linear Evaluation
**Feature1 - Feature5** are evaluated without GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-90e.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-coslr-200e](densecl_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-coslr-200e](densecl_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-coslr-200e](densecl_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
#### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`, and choose the best-performing setting for each method. The parameter settings are indicated in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier like `head1`, `head10`, `head100`.
- Please use `--deterministic` in this benchmark.
Please refer to the directory `configs/benchmarks/classification/imagenet/imagenet_1percent/` for the 1% data settings and `configs/benchmarks/classification/imagenet/imagenet_10percent/` for the 10% data settings.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | ---------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb32-coslr-200e](densecl_resnet50_8xb32-coslr-200e_in1k.py) | | | |
### Detection
The detection benchmarks include 2 downstream task datasets: **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | ---------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb32-coslr-200e](densecl_resnet50_8xb32-coslr-200e_in1k.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | ---------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb32-coslr-200e](densecl_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets: **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-coslr-200e](densecl_resnet50_8xb32-coslr-200e_in1k.py) | |
#### Cityscapes
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-coslr-200e](densecl_resnet50_8xb32-coslr-200e_in1k.py) | |
@ -27,18 +27,14 @@ methods that use only ImageNet-1K data. Transfer performance in downstream tasks
<img src="https://user-images.githubusercontent.com/30762564/150733959-2959852a-c7bd-4d3f-911f-3e8d8839fe67.png" width="40%"/>
</div>
## Models and Benchmarks
Here, we report the results of the model, which is pre-trained on ImageNet-1k for 400 epochs; the details are below:
| Backbone | Pre-train epoch | Fine-tuning Top-1 | Pre-train Config | Fine-tuning Config | Download |
| :------: | :-------------: | :---------------: | :-----------------------------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| ViT-B/16 | 400 | 83.1 | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mae/mae_vit-b-p16_8xb512-coslr-400e_in1k.py) | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/vit-b-p16_ft-8xb128-coslr-100e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/mae/mae_vit-base-p16_8xb512-coslr-400e_in1k-224_20220223-85be947b.pth) \| [log](https://download.openmmlab.com/mmselfsup/mae/mae_vit-base-p16_8xb512-coslr-300e_in1k-224_20220210_140925.log.json) |
## Citation
@ -1,43 +1,96 @@
# MoCo v2
## Momentum Contrast for Unsupervised Visual Representation Learning (MoCo v1)
<!-- [ABSTRACT] -->
We present Momentum Contrast (MoCo) for unsupervised visual representation learning. From a perspective on contrastive learning as dictionary look-up, we build a dynamic dictionary with a queue and a moving-averaged encoder. This enables building a large and consistent dictionary on-the-fly that facilitates contrastive unsupervised learning. MoCo provides competitive results under the common linear protocol on ImageNet classification. More importantly, the representations learned by MoCo transfer well to downstream tasks.
## Citation
> [Improved Baselines with Momentum Contrastive Learning](https://arxiv.org/abs/2003.04297)
<!-- [ALGORITHM] -->
```bibtex
@inproceedings{he2020momentum,
title={Momentum contrast for unsupervised visual representation learning},
author={He, Kaiming and Fan, Haoqi and Wu, Yuxin and Xie, Saining and Girshick, Ross},
booktitle={CVPR},
year={2020}
}
```
<!-- [ABSTRACT] -->
## Abstract
Contrastive unsupervised learning has recently shown encouraging progress, e.g., in Momentum Contrast (MoCo) and SimCLR. In this note, we verify the effectiveness of two of SimCLR's design improvements by implementing them in the MoCo framework. With simple modifications to MoCo, namely using an MLP projection head and more data augmentation, we establish stronger baselines that outperform SimCLR and do not require large training batches. We hope this will make state-of-the-art unsupervised learning research more accessible.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149720067-b65e5736-d425-45b3-93ed-6f2427fc6217.png" width="500" />
</div>
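The MLP projection head mentioned above is a two-layer MLP that replaces MoCo v1's single fc projection layer. A sketch with the paper's dimensions (2048-d ResNet-50 features projected to 128-d):

```python
import torch.nn as nn

# MoCo v2 projection head: fc -> ReLU -> fc, 2048 -> 2048 -> 128.
projection_head = nn.Sequential(
    nn.Linear(2048, 2048),
    nn.ReLU(inplace=True),
    nn.Linear(2048, 128),
)
```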
## Results and Models
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets: **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 accuracy (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates from which layer's feature map the best results are obtained. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (feature1 is the stem layer, feature2-feature5 are the four stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of the Low-shot SVM, i.e., the number of labeled samples per class.
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py) | feature5 | 84.04 | 43.14 | 53.29 | 65.34 | 71.03 | 75.42 | 78.48 | 80.88 | 82.23 |
#### ImageNet Linear Evaluation
**Feature1 - Feature5** are evaluated without GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb32-steplr-100e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb32-steplr-100e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py) | 15.96 | 34.22 | 45.78 | 61.11 | 66.24 | 67.58 |
#### Places205 Linear Evaluation
**Feature1 - Feature5** are evaluated without GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -------- | -------- | -------- | -------- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py) | 20.92 | 35.72 | 42.62 | 49.79 | 52.25 |
#### ImageNet Nearest-Neighbor Classification
The results are obtained from the features after GlobalAveragePooling. Here, k=10 to 200 indicates different numbers of nearest neighbors.
| Self-Supervised Config | k=10 | k=20 | k=100 | k=200 |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | ---- | ---- | ----- | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py) | 55.6 | 55.7 | 53.7 | 52.5 |
### Detection
The detection benchmarks include 2 downstream task datasets: **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k_voc0712.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712.py) for details of config.
| Self-Supervised Config | AP50 |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py) | 81.06 |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x_coco.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py) for details of config.
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py) | 40.2 | 59.7 | 44.2 | 36.1 | 56.7 | 38.8 |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets: **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [fcn_r50-d8_512x512_20k_voc12aug.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) for details of config.
| Self-Supervised Config | mIOU |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py) | 67.55 |
## Citation
```bibtex
@article{chen2020improved,
@ -47,99 +100,3 @@ Contrastive unsupervised learning has recently shown encouraging progress, e.g.,
year={2020}
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet-1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates from which layer's feature map the best results are obtained. For example, if the **Best Layer** is **feature3**, its best result is obtained from the second stage of ResNet (feature1 is the stem layer, feature2-feature5 are the four stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of the Low-shot SVM, i.e., the number of labeled samples per class.
| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | ---------------------------------------------------------------------------- | ---------- | ----- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [mocov2_resnet50_8xb32-coslr-200e](mocov2_resnet50_8xb32-coslr-200e_in1k.py) | feature5 | 84.04 | | | | | | | | |
### ImageNet Linear Evaluation
**Feature1 - Feature5** are evaluated without GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed to a Linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-90e.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [mocov2_resnet50_8xb32-coslr-200e](mocov2_resnet50_8xb32-coslr-200e_in1k.py) | 15.96 | 34.22 | 45.78 | 61.11 | 66.24 | 67.56 |
### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [mocov2_resnet50_8xb32-coslr-200e](mocov2_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [mocov2_resnet50_8xb32-coslr-200e](mocov2_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
#### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`, and choose the best-performing setting for each method. The parameter settings are indicated in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier like `head1`, `head10`, `head100`.
- Please use `--deterministic` in this benchmark.
Please refer to the directory `configs/benchmarks/classification/imagenet/imagenet_1percent/` for the 1% data settings and `configs/benchmarks/classification/imagenet/imagenet_10percent/` for the 10% data settings.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | ---------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------- | --------- |
| [model]() | [mocov2_resnet50_8xb32-coslr-200e](mocov2_resnet50_8xb32-coslr-200e_in1k.py) | [resnet50_head100_4xb64-steplr1e-2-20e_in1k-1pct.py](../../benchmarks/classification/imagenet/imagenet_1percent/resnet50_head100_4xb64-steplr1e-2-20e_in1k-1pct.py) | 38.97 | 67.69 |
### Detection
The detection benchmarks include 2 downstream task datasets: **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | ---------------------------------------------------------------------------- | --- | ---- |
| [model]() | [mocov2_resnet50_8xb32-coslr-200e](mocov2_resnet50_8xb32-coslr-200e_in1k.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | ---------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [mocov2_resnet50_8xb32-coslr-200e](mocov2_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets: **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------------- | ---- |
| [model]() | [mocov2_resnet50_8xb32-coslr-200e](mocov2_resnet50_8xb32-coslr-200e_in1k.py) | |
#### Cityscapes
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------------- | ---- |
| [model]() | [mocov2_resnet50_8xb32-coslr-200e](mocov2_resnet50_8xb32-coslr-200e_in1k.py) | |

@@ -16,19 +16,19 @@ This paper does not describe a novel method. Instead, it studies a straightforwa
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
In this page, we provide benchmarks as much as possible to evaluate our pre-trained models. If not mentioned, all models were trained on ImageNet1k dataset.
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### ImageNet Linear Evaluation
The **Linear Evaluation** result is obtained by training a linear head upon the pre-trained backbone. Please refer to [vit-small-p16_8xb128-coslr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/vit-small-p16_8xb128-coslr-90e_in1k.py) for details of config.
| Self-Supervised Config | Linear Evaluation |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------- |
| [vit-small-p16_32xb128-fp16-coslr-300e_in1k-224](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov3/mocov3_vit-small-p16_32xb128-fp16-coslr-300e_in1k-224.py) | 73.19 |
| Self-Supervised Config | Linear Evaluation |
| --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------- |
| [vit-small-p16_linear-32xb128-fp16-coslr-300e_in1k-224](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov3/mocov3_vit-small-p16_linear-32xb128-fp16-coslr-300e_in1k-224.py) | 73.19 |
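As a rough illustration of the linear evaluation protocol described above, the sketch below freezes a pre-trained backbone and trains only a linear head. It uses a torchvision ResNet-50 stand-in rather than the ViT model from the configs, so the architecture and hyper-parameters here are assumptions for illustration.
```python
# Minimal linear-evaluation sketch: frozen backbone, trainable linear head.
# A randomly initialized torchvision ResNet-50 stands in for a pre-trained
# model; the lr/momentum values are placeholders, not the config values.
import torch
import torch.nn as nn
import torchvision

backbone = torchvision.models.resnet50()
backbone.fc = nn.Identity()              # expose 2048-d pooled features
for p in backbone.parameters():
    p.requires_grad = False              # freeze the backbone
backbone.eval()

head = nn.Linear(2048, 1000)             # the only trainable module
optimizer = torch.optim.SGD(head.parameters(), lr=0.1, momentum=0.9)

images = torch.randn(4, 3, 224, 224)     # stand-in for a data batch
labels = torch.randint(0, 1000, (4,))
with torch.no_grad():
    feats = backbone(images)             # no gradients through the backbone
loss = nn.functional.cross_entropy(head(feats), labels)
loss.backward()
optimizer.step()
```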
## Citation

@@ -1,8 +1,10 @@
# NPID
## Unsupervised Feature Learning via Non-Parametric Instance Discrimination
> [Unsupervised Feature Learning via Non-Parametric Instance Discrimination](https://arxiv.org/abs/1805.01978)
<!-- [ABSTRACT] -->
<!-- [ALGORITHM] -->
## Abstract
Neural net classifiers trained on data with annotated class labels can also capture apparent visual similarity among categories without being directed to do so. We study whether this observation can be extended beyond the conventional domain of supervised learning: Can we learn a good feature representation that captures apparent similarity among instances, instead of classes, by merely asking the feature to be discriminative of individual instances?
@@ -10,14 +12,89 @@ We formulate this intuition as a non-parametric classification problem at the in
Our method is also remarkable for consistently improving test performance with more training data and better network architectures. By fine-tuning the learned feature, we further obtain competitive results for semi-supervised learning and object detection tasks. Our non-parametric model is highly compact: With 128 features per image, our method requires only 600MB storage for a million images, enabling fast nearest neighbour retrieval at the run time.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149722257-1651c283-ac68-4cdc-90e6-970d820529af.png" width="800" />
</div>
## Citation
## Results and Models
<!-- [ALGORITHM] -->
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates which layer's feature map yields the best results. For example, if the **Best Layer** is **feature3**, the best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM (a sketch of the evaluation idea follows).
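As a hedged sketch of this protocol (not the repository's actual liblinear-based pipeline), the snippet below fits one-vs-rest linear SVMs on frozen backbone features and scores them with mAP; the arrays are random placeholders for features extracted from feature1-feature5, and the low-shot variants would simply subsample the training set to k examples.
```python
# Hedged sketch of the VOC SVM benchmark: one-vs-rest linear SVMs trained on
# frozen, pooled backbone features and evaluated with mAP. Random arrays
# stand in for real features and the 20 multi-label VOC annotations.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.multiclass import OneVsRestClassifier
from sklearn.metrics import average_precision_score

rng = np.random.default_rng(0)
train_feats = rng.normal(size=(200, 2048))           # placeholder features
train_labels = rng.integers(0, 2, size=(200, 20))    # 20 VOC classes
test_feats = rng.normal(size=(50, 2048))
test_labels = rng.integers(0, 2, size=(50, 20))

clf = OneVsRestClassifier(LinearSVC(C=1.0)).fit(train_feats, train_labels)
scores = clf.decision_function(test_feats)           # per-class margins
print("mAP:", average_precision_score(test_labels, scores, average="macro"))
```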
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| ---------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [resnet50_8xb32-steplr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k.py) | feature5 | 76.75 | 26.96 | 35.37 | 44.48 | 53.89 | 60.39 | 66.41 | 71.48 | 73.39 |
#### ImageNet Linear Evaluation
For **Feature1 - Feature5**, there is no GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed into a linear layer for classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb32-steplr-100e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb32-steplr-100e_in1k.py) for details of config.
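The sketch below illustrates this multi-head setup: each frozen stage output is adaptively pooled to a small spatial size, flattened, and classified by its own linear layer. The channel counts match ResNet-50 stages, while the pool sizes are assumptions chosen to give similar flattened dimensions.
```python
# Hedged sketch of multi-head linear evaluation on ResNet-50 stage outputs:
# pool each frozen feature map to fixed dims, flatten, classify linearly.
import torch
import torch.nn as nn

class PooledLinearHead(nn.Module):
    def __init__(self, in_channels, pool_size, num_classes=1000):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(pool_size)   # pool to fixed dims
        self.fc = nn.Linear(in_channels * pool_size ** 2, num_classes)

    def forward(self, x):
        return self.fc(torch.flatten(self.pool(x), 1))

# one head per feature level: (channels, pool size) for stem + 4 stages
heads = [PooledLinearHead(c, p) for c, p in
         [(64, 12), (256, 6), (512, 4), (1024, 3), (2048, 2)]]
feat5 = torch.randn(2, 2048, 7, 7)                    # e.g. last-stage output
print(heads[4](feat5).shape)                          # torch.Size([2, 1000])
```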
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| ---------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb32-steplr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k.py) | 14.68 | 31.98 | 42.85 | 56.95 | 58.41 | 57.97 |
#### Places205 Linear Evaluation
For **Feature1 - Feature5**, there is no GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed into a linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| ---------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- |
| [resnet50_8xb32-steplr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k.py) | 19.98 | 34.86 | 41.59 | 48.43 | 48.71 |
#### ImageNet Nearest-Neighbor Classification
The results are obtained from the features after GlobalAveragePooling. Here, k=10 to 200 indicates different numbers of nearest neighbors; a minimal sketch of the classifier follows.
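The sketch below shows one way to implement such a weighted nearest-neighbor classifier; the feature tensors are random placeholders, and the temperature `tau` is an assumption in the spirit of NPID rather than the exact benchmark setting.
```python
# Hedged sketch of kNN classification on L2-normalized GAP features:
# cosine similarities, top-k neighbors, similarity-weighted class votes.
import torch
import torch.nn.functional as F

def knn_predict(test_feats, train_feats, train_labels,
                num_classes=1000, k=20, tau=0.07):
    test_feats = F.normalize(test_feats, dim=1)
    train_feats = F.normalize(train_feats, dim=1)
    sim = test_feats @ train_feats.t()           # cosine similarity matrix
    topk_sim, topk_idx = sim.topk(k, dim=1)      # k nearest training samples
    weights = (topk_sim / tau).exp()             # similarity-weighted votes
    votes = torch.zeros(test_feats.size(0), num_classes)
    votes.scatter_add_(1, train_labels[topk_idx], weights)
    return votes.argmax(dim=1)

preds = knn_predict(torch.randn(8, 128), torch.randn(500, 128),
                    torch.randint(0, 1000, (500,)))
```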
| Self-Supervised Config | k=10 | k=20 | k=100 | k=200 |
| ---------------------------------------------------------------------------------------------------------------------------------------------- | ---- | ---- | ----- | ----- |
| [resnet50_8xb32-steplr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k.py) | 42.9 | 44.0 | 43.2 | 42.2 |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k_voc0712.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712.py) for details of config.
| Self-Supervised Config | AP50 |
| ---------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb32-steplr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k.py) | 79.52 |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x_coco.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py) for details of config.
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| ---------------------------------------------------------------------------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb32-steplr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k.py) | 38.5 | 57.7 | 42.0 | 34.6 | 54.8 | 37.1 |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [fcn_r50-d8_512x512_20k_voc12aug.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) for details of config.
| Self-Supervised Config | mIOU |
| ---------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb32-steplr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k.py) | 65.45 |
## Citation
```bibtex
@inproceedings{wu2018unsupervised,
@@ -27,99 +104,3 @@ Our method is also remarkable for consistently improving test performance with m
year={2018}
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates which layer's feature map yields the best results. For example, if the **Best Layer** is **feature3**, the best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM.
| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | --------------------------------------------------------------------- | ---------- | --- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [resnet50_8xb32-steplr-200e](npid_resnet50_8xb32-steplr-200e_in1k.py) | | | | | | | | | | |
### ImageNet Linear Evaluation
For **Feature1 - Feature5**, there is no GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed into a linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-90e.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | --------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-steplr-200e](npid_resnet50_8xb32-steplr-200e_in1k.py) | | | | | | |
### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | --------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-steplr-200e](npid_resnet50_8xb32-steplr-200e_in1k.py) | | | | | | |
### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | --------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-steplr-200e](npid_resnet50_8xb32-steplr-200e_in1k.py) | | | | | | |
#### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`, and choose the best-performing setting for each method. The parameter settings are indicated in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier is indicated like `head1`, `head10`, `head100`.
- Please use `--deterministic` in this benchmark.
Please refer to the directories `configs/benchmarks/classification/imagenet/imagenet_1percent/` for 1% data and `configs/benchmarks/classification/imagenet/imagenet_10percent/` for 10% data for details.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | --------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb32-steplr-200e](npid_resnet50_8xb32-steplr-200e_in1k.py) | | | |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | --------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb32-steplr-200e](npid_resnet50_8xb32-steplr-200e_in1k.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | --------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb32-steplr-200e](npid_resnet50_8xb32-steplr-200e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | --------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-steplr-200e](npid_resnet50_8xb32-steplr-200e_in1k.py) | |
#### Cityscapes
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | --------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-steplr-200e](npid_resnet50_8xb32-steplr-200e_in1k.py) | |

@@ -1,19 +1,64 @@
# ODC
## Online Deep Clustering for Unsupervised Representation Learning
> [Online Deep Clustering for Unsupervised Representation Learning](https://arxiv.org/abs/2006.10645)
<!-- [ABSTRACT] -->
<!-- [ALGORITHM] -->
## Abstract
Joint clustering and feature learning methods have shown remarkable performance in unsupervised representation learning. However, the training schedule alternating between feature clustering and network parameters update leads to unstable learning of visual representations. To overcome this challenge, we propose Online Deep Clustering (ODC) that performs clustering and network update simultaneously rather than alternatingly. Our key insight is that the cluster centroids should evolve steadily in keeping the classifier stably updated. Specifically, we design and maintain two dynamic memory modules, i.e., samples memory to store samples labels and features, and centroids memory for centroids evolution. We break down the abrupt global clustering into steady memory update and batch-wise label re-assignment. The process is integrated into network update iterations. In this way, labels and the network evolve shoulder-to-shoulder rather than alternatingly. Extensive experiments demonstrate that ODC stabilizes the training process and boosts the performance effectively.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149722645-8da8e5b2-8846-4554-aa3e-727d286b85cd.png" width="700" />
</div>
## Citation
## Results and Models
<!-- [ALGORITHM] -->
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates which layer's feature map yields the best results. For example, if the **Best Layer** is **feature3**, the best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM.
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| -------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [resnet50_8xb64-steplr-440e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/odc/odc_resnet50_8xb64-steplr-440e_in1k.py) | feature5 | 78.42 | 32.42 | 40.27 | 49.95 | 59.96 | 65.71 | 69.99 | 73.64 | 75.13 |
#### ImageNet Linear Evaluation
For **Feature1 - Feature5**, there is no GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed into a linear layer for classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb32-steplr-100e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb32-steplr-100e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| -------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb64-steplr-440e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/odc/odc_resnet50_8xb64-steplr-440e_in1k.py) | 14.76 | 31.82 | 42.44 | 55.76 | 57.70 | 53.42 |
#### Places205 Linear Evaluation
For **Feature1 - Feature5**, there is no GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed into a linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| -------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- |
| [resnet50_8xb64-steplr-440e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/odc/odc_resnet50_8xb64-steplr-440e_in1k.py) | 19.28 | 34.09 | 40.90 | 47.04 | 48.35 |
#### ImageNet Nearest-Neighbor Classification
The results are obtained from the features after GlobalAveragePooling. Here, k=10 to 200 indicates different numbers of nearest neighbors.
| Self-Supervised Config | k=10 | k=20 | k=100 | k=200 |
| -------------------------------------------------------------------------------------------------------------------------------------------- | ---- | ---- | ----- | ----- |
| [resnet50_8xb64-steplr-440e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/odc/odc_resnet50_8xb64-steplr-440e_in1k.py) | 38.5 | 39.1 | 37.8 | 36.9 |
## Citation
```bibtex
@inproceedings{zhan2020online,
@@ -23,100 +68,3 @@ Joint clustering and feature learning methods have shown remarkable performance
year={2020}
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates which layer's feature map yields the best results. For example, if the **Best Layer** is **feature3**, the best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM.
| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | -------------------------------------------------------------------- | ---------- | --- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [resnet50_8xb64-steplr-440e](odc_resnet50_8xb64-steplr-440e_in1k.py) | | | | | | | | | | |
### ImageNet Linear Evaluation
For **Feature1 - Feature5**, there is no GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed into a linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-90e.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | -------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-440e](odc_resnet50_8xb64-steplr-440e_in1k.py) | | | | | | |
### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | -------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-440e](odc_resnet50_8xb64-steplr-440e_in1k.py) | | | | | | |
### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | -------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-440e](odc_resnet50_8xb64-steplr-440e_in1k.py) | | | | | | |
#### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`, and choose the best-performing setting for each method. The parameter settings are indicated in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier is indicated like `head1`, `head10`, `head100`.
- Please use `--deterministic` in this benchmark.
Please refer to the directories `configs/benchmarks/classification/imagenet/imagenet_1percent/` for 1% data and `configs/benchmarks/classification/imagenet/imagenet_10percent/` for 10% data for details.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | -------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb64-steplr-440e](odc_resnet50_8xb64-steplr-440e_in1k.py) | | | |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | -------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb64-steplr-440e](odc_resnet50_8xb64-steplr-440e_in1k.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | -------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb64-steplr-440e](odc_resnet50_8xb64-steplr-440e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | -------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb64-steplr-440e](odc_resnet50_8xb64-steplr-440e_in1k.py) | |
#### Cityscapes
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | -------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb64-steplr-440e](odc_resnet50_8xb64-steplr-440e_in1k.py) | |

@@ -1,19 +1,96 @@
# Relative Location
## Unsupervised Visual Representation Learning by Context Prediction
> [Unsupervised Visual Representation Learning by Context Prediction](https://arxiv.org/abs/1505.05192)
<!-- [ABSTRACT] -->
<!-- [ALGORITHM] -->
## Abstract
This work explores the use of spatial context as a source of free and plentiful supervisory signal for training a rich visual representation. Given only a large, unlabeled image collection, we extract random pairs of patches from each image and train a convolutional neural net to predict the position of the second patch relative to the first. We argue that doing well on this task requires the model to learn to recognize objects and their parts. We demonstrate that the feature representation learned using this within-image context indeed captures visual similarity across images. For example, this representation allows us to perform unsupervised visual discovery of objects like cats, people, and even birds from the Pascal VOC 2011 detection dataset. Furthermore, we show that the learned ConvNet can be used in the RCNN framework and provides a significant boost over a randomly-initialized ConvNet, resulting in state-of-the-art performance among algorithms which use only Pascal-provided training set annotations.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149723222-76bc89e8-98bf-4ed7-b179-dfe5bc6336ba.png" width="400" />
</div>
## Citation
## Results and Models
<!-- [ALGORITHM] -->
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates which layer's feature map yields the best results. For example, if the **Best Layer** is **feature3**, the best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM.
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [resnet50_8xb64-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | feature4 | 65.52 | 20.36 | 23.12 | 30.66 | 37.02 | 42.55 | 50.00 | 55.58 | 59.28 |
#### ImageNet Linear Evaluation
For **Feature1 - Feature5**, there is no GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed into a linear layer for classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb32-steplr-100e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb32-steplr-100e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb64-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | 15.11 | 30.47 | 42.83 | 51.20 | 40.96 | 39.65 |
#### Places205 Linear Evaluation
For **Feature1 - Feature5**, there is no GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed into a linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -------- | -------- | -------- | -------- |
| [resnet50_8xb64-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | 20.69 | 34.72 | 43.01 | 45.97 | 41.96 |
#### ImageNet Nearest-Neighbor Classification
The results are obtained from the features after GlobalAveragePooling. Here, k=10 to 200 indicates different numbers of nearest neighbors.
| Self-Supervised Config | k=10 | k=20 | k=100 | k=200 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---- | ---- | ----- | ----- |
| [resnet50_8xb64-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | 14.5 | 15.0 | 15.0 | 14.2 |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k_voc0712.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712.py) for details of config.
| Self-Supervised Config | AP50 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----- |
| [resnet50_8xb64-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | 79.70 |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x_coco.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py) for details of config.
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb64-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | 37.5 | 56.2 | 41.3 | 33.7 | 53.3 | 36.1 |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [fcn_r50-d8_512x512_20k_voc12aug.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) for details of config.
| Self-Supervised Config | mIOU |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----- |
| [resnet50_8xb64-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | 63.49 |
## Citation
```bibtex
@inproceedings{doersch2015unsupervised,
@@ -23,100 +100,3 @@ This work explores the use of spatial context as a source of free and plentiful
year={2015}
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates which layer's feature map yields the best results. For example, if the **Best Layer** is **feature3**, the best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM.
| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | --------------------------------------------------------------------------- | ---------- | --- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [resnet50_8xb64-steplr-70e](relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | | | | | | | | | | |
### ImageNet Linear Evaluation
For **Feature1 - Feature5**, there is no GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed into a linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-90e.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | --------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-70e](relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | | | | | | |
### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | --------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-70e](relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | | | | | | |
### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | --------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb64-steplr-70e](relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | | | | | | |
#### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`, and choose the best-performing setting for each method. The parameter settings are indicated in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier is indicated like `head1`, `head10`, `head100`.
- Please use `--deterministic` in this benchmark.
Please refer to the directories `configs/benchmarks/classification/imagenet/imagenet_1percent/` for 1% data and `configs/benchmarks/classification/imagenet/imagenet_10percent/` for 10% data for details.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | --------------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb64-steplr-70e](relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | | | |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | --------------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb64-steplr-70e](relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | --------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb64-steplr-70e](relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | --------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb64-steplr-70e](relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | |
#### Cityscapes
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | --------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb64-steplr-70e](relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | |

@@ -1,19 +1,96 @@
# Rotation Prediction
## Unsupervised Representation Learning by Predicting Image Rotation
> [Unsupervised Representation Learning by Predicting Image Rotation](https://arxiv.org/abs/1803.07728)
<!-- [ABSTRACT] -->
<!-- [ALGORITHM] -->
## Abstract
Over the last years, deep convolutional neural networks (ConvNets) have transformed the field of computer vision thanks to their unparalleled capacity to learn high level semantic image features. However, in order to successfully learn those features, they usually require massive amounts of manually labeled data, which is both expensive and impractical to scale. Therefore, unsupervised semantic feature learning, i.e., learning without requiring manual annotation effort, is of crucial importance in order to successfully harvest the vast amount of visual data that are available today. In our work we propose to learn image features by training ConvNets to recognize the 2d rotation that is applied to the image that it gets as input. We demonstrate both qualitatively and quantitatively that this apparently simple task actually provides a very powerful supervisory signal for semantic feature learning. We exhaustively evaluate our method in various unsupervised feature learning benchmarks and we exhibit in all of them state-of-the-art performance. Specifically, our results on those benchmarks demonstrate dramatic improvements w.r.t. prior state-of-the-art approaches in unsupervised representation learning and thus significantly close the gap with supervised feature learning.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149723477-8f63e237-362e-4962-b405-9bab0f579808.png" width="700" />
</div>
## Citation
## Results and Models
<!-- [ALGORITHM] -->
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates which layer's feature map yields the best results. For example, if the **Best Layer** is **feature3**, the best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM.
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [resnet50_8xb16-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | feature4 | 67.70 | 20.60 | 24.35 | 31.41 | 39.17 | 46.56 | 53.37 | 59.14 | 62.42 |
#### ImageNet Linear Evaluation
For **Feature1 - Feature5**, there is no GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed into a linear layer for classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb32-steplr-100e_in1k.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb32-steplr-100e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb16-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | 12.15 | 31.99 | 44.57 | 54.20 | 45.94 | 48.12 |
#### Places205 Linear Evaluation
For **Feature1 - Feature5**, there is no GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed into a linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- |
| [resnet50_8xb16-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | 18.94 | 34.72 | 44.53 | 46.30 | 44.12 |
#### ImageNet Nearest-Neighbor Classification
The results are obtained from the features after GlobalAveragePooling. Here, k=10 to 200 indicates different numbers of nearest neighbors.
| Self-Supervised Config | k=10 | k=20 | k=100 | k=200 |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---- | ---- | ----- | ----- |
| [resnet50_8xb16-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | 11.0 | 11.9 | 12.6 | 12.4 |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k_voc0712.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712.py) for details of config.
| Self-Supervised Config | AP50 |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb16-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | 79.67 |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x_coco.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py) for details of config.
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb16-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | 37.9 | 56.5 | 41.5 | 34.2 | 53.9 | 36.7 |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [fcn_r50-d8_512x512_20k_voc12aug.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) for details of config.
| Self-Supervised Config | mIOU |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb16-steplr-70e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | 64.31 |
## Citation
```bibtex
@inproceedings{komodakis2018unsupervised,
@@ -23,99 +100,3 @@ Over the last years, deep convolutional neural networks (ConvNets) have transfor
year={2018}
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates which layer's feature map yields the best results. For example, if the **Best Layer** is **feature3**, the best result is obtained from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the hyper-parameter of Low-shot SVM.
| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | ---------------------------------------------------------------------------- | ---------- | --- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [resnet50_8xb16-steplr-70e](rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | | | | | | | | | | |
### ImageNet Linear Evaluation
For **Feature1 - Feature5**, there is no GlobalAveragePooling: each feature map is pooled to a specific dimension and then fed into a linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-90e.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb16-steplr-70e](rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | | | | | | |
### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb16-steplr-70e](rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | | | | | | |
### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb16-steplr-70e](rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | | | | | | |
#### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`, and choose the best-performing setting for each method. The parameter settings are indicated in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier is indicated like `head1`, `head10`, `head100`.
- Please use `--deterministic` in this benchmark.
Please refer to the directories `configs/benchmarks/classification/imagenet/imagenet_1percent/` for 1% data and `configs/benchmarks/classification/imagenet/imagenet_10percent/` for 10% data for details.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | ---------------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb16-steplr-70e](rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | | | |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | ---------------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb16-steplr-70e](rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | ---------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb16-steplr-70e](rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. They follow the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb16-steplr-70e](rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | |
#### Cityscapes
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb16-steplr-70e](rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | |

@@ -1,19 +1,97 @@
# SimCLR
> [A Simple Framework for Contrastive Learning of Visual Representations](https://arxiv.org/abs/2002.05709)
<!-- [ABSTRACT] -->
<!-- [ALGORITHM] -->
## Abstract
This paper presents SimCLR: a simple framework for contrastive learning of visual representations. We simplify recently proposed contrastive self-supervised learning algorithms without requiring specialized architectures or a memory bank. In order to understand what enables the contrastive prediction tasks to learn useful representations, we systematically study the major components of our framework. We show that (1) composition of data augmentations plays a critical role in defining effective predictive tasks, (2) introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and (3) contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning. By combining these findings, we are able to considerably outperform previous methods for self-supervised and semi-supervised learning on ImageNet. A linear classifier trained on self-supervised representations learned by SimCLR achieves 76.5% top-1 accuracy, which is a 7% relative improvement over previous state-of-the-art, matching the performance of a supervised ResNet-50.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149723851-cf5f309e-d891-454d-90c0-e5337e5a11ed.png" width="400" />
</div>
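The contrastive objective summarized above is the InfoNCE/NT-Xent loss. A minimal PyTorch sketch of it (our illustration, not the implementation used in this repository):

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.1):
    """NT-Xent loss for N image pairs; z1, z2 are (N, D) projections."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)   # (2N, D), unit norm
    sim = z @ z.t() / temperature                 # scaled cosine similarity
    sim.fill_diagonal_(float('-inf'))             # exclude self-similarity
    n = z1.size(0)
    # The positive for view i is the other augmentation of the same image.
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])
    return F.cross_entropy(sim, targets)
```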
## Results and Models
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates the layer whose feature map yields the best result. For example, if the **Best Layer** is **feature3**, the best result comes from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the number of labeled samples per class used by the low-shot SVM (a sketch of the evaluation follows the table below).
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ---- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k.py) | feature5 | 79.98 | 35.02 | 42.79 | 54.87 | 61.91 | 67.38 | 71.88 | 75.56 | 77.4 |
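For intuition, the low-shot SVM benchmark fits a linear SVM on frozen backbone features using k labeled images per class. A much-simplified sketch (our illustration with random stand-in features; the actual benchmark evaluates VOC mAP rather than accuracy):

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
feats = rng.normal(size=(1000, 2048)).astype(np.float32)  # frozen features
labels = rng.integers(0, 20, size=1000)                   # stand-in labels

def low_shot_svm_score(feats, labels, k, cost=1.0):
    """Train a linear SVM on k samples per class and test on the rest."""
    train_idx = np.hstack([
        rng.choice(np.flatnonzero(labels == c), size=k, replace=False)
        for c in np.unique(labels)
    ])
    test_mask = np.ones(len(labels), dtype=bool)
    test_mask[train_idx] = False
    clf = LinearSVC(C=cost).fit(feats[train_idx], labels[train_idx])
    return clf.score(feats[test_mask], labels[test_mask])

print(low_shot_svm_score(feats, labels, k=8))
```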
#### ImageNet Linear Evaluation
The **Feature1 - Feature5** results don't use GlobalAveragePooling: the feature map of each stage is pooled to a fixed dimension and then fed to a linear layer for classification (see the sketch after the table below). Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb512-coslr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb512-coslr-90e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| ---------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k.py) | 16.29 | 31.11 | 39.99 | 55.06 | 62.91 | 62.56 |
| [resnet50_16xb256-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_16xb256-coslr-200e_in1k.py) | 15.44 | 31.47 | 41.83 | 59.44 | 66.41 | 66.66 |
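Concretely, the multi-head evaluation attaches one small classifier per stage while the backbone stays frozen. A simplified PyTorch sketch (our illustration; the pooling sizes are assumptions, not the benchmark's exact settings):

```python
import torch
import torch.nn as nn

class PooledLinearHead(nn.Module):
    """Pool one stage's feature map to a fixed size, then classify."""

    def __init__(self, in_channels, pool_size, num_classes=1000):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(pool_size)
        self.fc = nn.Linear(in_channels * pool_size ** 2, num_classes)

    def forward(self, x):               # x: (B, C, H, W) from one stage
        return self.fc(self.pool(x).flatten(1))

# One head per ResNet-50 stage; channel sizes follow the architecture,
# pooling sizes are illustrative.
heads = [PooledLinearHead(c, p) for c, p in
         [(64, 8), (256, 6), (512, 4), (1024, 3), (2048, 2)]]
logits = heads[4](torch.randn(2, 2048, 7, 7))   # the Feature5 head
```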
#### Places205 Linear Evaluation
The **Feature1 - Feature5** results don't use GlobalAveragePooling: the feature map of each stage is pooled to a fixed dimension and then fed to a linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -------- | -------- | -------- | -------- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k.py) | 20.60 | 33.62 | 38.86 | 45.25 | 50.91 |
#### ImageNet Nearest-Neighbor Classification
The results are obtained from the features after GlobalAveragePooling. Here, k=10 to 200 indicates different numbers of nearest neighbors (see the sketch after the table below).
| Self-Supervised Config | k=10 | k=20 | k=100 | k=200 |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | ---- | ---- | ----- | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k.py) | 47.8 | 48.4 | 46.7 | 45.2 |
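The sketch below shows the idea behind this benchmark: classify each query by a vote among its k nearest training features under cosine similarity (a plain majority vote here for brevity; weighted variants are common). All tensors are hypothetical stand-ins:

```python
import torch
import torch.nn.functional as F

def knn_predict(train_feats, train_labels, query_feats, k=20, num_classes=1000):
    """Majority-vote kNN on L2-normalized, global-pooled features.

    train_labels is expected to be a LongTensor of class indices.
    """
    train = F.normalize(train_feats, dim=1)
    query = F.normalize(query_feats, dim=1)
    sim = query @ train.t()                    # cosine similarities
    _, idx = sim.topk(k, dim=1)                # indices of k nearest
    votes = train_labels[idx]                  # (num_query, k)
    counts = torch.zeros(len(query), num_classes).scatter_add_(
        1, votes, torch.ones_like(votes, dtype=torch.float))
    return counts.argmax(dim=1)
```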
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k_voc0712.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712.py) for details of config.
| Self-Supervised Config | AP50 |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k.py) | 79.38 |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x_coco.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py) for details of config.
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k.py) | 38.7 | 58.1 | 42.4 | 34.9 | 55.3 | 37.5 |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. It follows the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [fcn_r50-d8_512x512_20k_voc12aug.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) for details of config.
| Self-Supervised Config | mIOU |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | ----- |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k.py) | 64.03 |
## Citation
```bibtex
@inproceedings{chen2020simple,
@@ -23,99 +101,3 @@ This paper presents SimCLR: a simple framework for contrastive learning of visual
year={2020},
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet-1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates the layer whose feature map yields the best result. For example, if the **Best Layer** is **feature3**, the best result comes from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the number of labeled samples per class used by the low-shot SVM.
| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | --------------------------------------------------------------------- | ---------- | --- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [resnet50_8xb32-coslr-200e](simclr_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | | | | | |
### ImageNet Linear Evaluation
The **Feature1 - Feature5** results don't use GlobalAveragePooling: the feature map of each stage is pooled to a fixed dimension and then fed to a linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-90e_in1k.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | --------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-coslr-200e](simclr_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | --------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-coslr-200e](simclr_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | --------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-coslr-200e](simclr_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`, and choose the best-performing setting for each method. The chosen setting is encoded in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier is indicated like `head1`, `head10`, `head100`.
- Please use `--deterministic` in this benchmark.
Please refer to the directories `configs/benchmarks/classification/imagenet/imagenet_1percent/` for 1% data and `configs/benchmarks/classification/imagenet/imagenet_10percent/` for 10% data for details.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | --------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb32-coslr-200e](simclr_resnet50_8xb32-coslr-200e_in1k.py) | | | |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | --------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb32-coslr-200e](simclr_resnet50_8xb32-coslr-200e_in1k.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | --------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb32-coslr-200e](simclr_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. It follows the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | --------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-coslr-200e](simclr_resnet50_8xb32-coslr-200e_in1k.py) | |
#### Cityscapes
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | --------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-coslr-200e](simclr_resnet50_8xb32-coslr-200e_in1k.py) | |

@@ -12,17 +12,13 @@ This paper presents SimMIM, a simple framework for masked image modeling. We sim
<img src="https://user-images.githubusercontent.com/30762564/159404597-ac6d3a44-ee59-4cdc-8f6f-506a7d1b18b6.png" width="40%"/>
</div>
## Models and Benchmarks
Here, we report the results of the model, and more results will be coming soon.
| Backbone | Pre-train epoch | Fine-tuning Top-1 | Pre-train Config | Fine-tuning Config | Download |
| :-------: | :-------------: | :---------------: | :-------------------------------------------------------------------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| Swin-Base | 100 | 82.9 | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192) | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/swin-base_ft-8xb256-coslr-100e_in1k) | [model](https://download.openmmlab.com/mmselfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192_20220316-1d090125.pth) \| [log](https://download.openmmlab.com/mmselfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192_20220316-1d090125.log.json) |
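For intuition, SimMIM pre-trains by reconstructing randomly masked patches of the input image. A toy sketch of the random patch-mask generation (our illustration, not the repository's implementation; 32-pixel mask patches and a 0.6 ratio follow the paper's defaults):

```python
import torch

def random_patch_mask(batch, img_size=192, patch_size=32, mask_ratio=0.6):
    """Return a boolean grid; True marks patches whose pixels are masked."""
    n = (img_size // patch_size) ** 2          # patches per image
    num_masked = int(n * mask_ratio)
    idx = torch.rand(batch, n).argsort(dim=1)[:, :num_masked]
    mask = torch.zeros(batch, n, dtype=torch.bool).scatter_(1, idx, True)
    return mask.view(batch, img_size // patch_size, -1)

mask = random_patch_mask(4)                    # (4, 6, 6) boolean grid
```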
## Citation

@@ -1,19 +1,103 @@
# SimSiam
> [Exploring Simple Siamese Representation Learning](https://arxiv.org/abs/2011.10566)
<!-- [ABSTRACT] -->
<!-- [ALGORITHM] -->
## Abstract
Siamese networks have become a common structure in various recent models for unsupervised visual representation learning. These models maximize the similarity between two augmentations of one image, subject to certain conditions for avoiding collapsing solutions. In this paper, we report surprising empirical results that simple Siamese networks can learn meaningful representations even using none of the following: (i) negative sample pairs, (ii) large batches, (iii) momentum encoders. Our experiments show that collapsing solutions do exist for the loss and structure, but a stop-gradient operation plays an essential role in preventing collapsing. We provide a hypothesis on the implication of stop-gradient, and further show proof-of-concept experiments verifying it. Our “SimSiam” method achieves competitive results on ImageNet and downstream tasks. We hope this simple baseline will motivate people to rethink the roles of Siamese architectures for unsupervised representation learning.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149724180-bc7bac6a-fcb8-421e-b8f1-9550c624d154.png" width="500" />
</div>
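The stop-gradient operation highlighted above is simply a detach on the target branch. A minimal PyTorch sketch of the symmetric SimSiam loss (our illustration; `encoder` and `predictor` are assumed modules):

```python
import torch.nn.functional as F

def simsiam_loss(encoder, predictor, view1, view2):
    """Symmetric negative cosine similarity with stop-gradient."""
    z1, z2 = encoder(view1), encoder(view2)    # projections
    p1, p2 = predictor(z1), predictor(z2)      # predictions

    def d(p, z):
        # Detaching z is the stop-gradient that prevents collapse.
        return -F.cosine_similarity(p, z.detach(), dim=1).mean()

    return 0.5 * d(p1, z2) + 0.5 * d(p2, z1)
```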
## Results and Models
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates the layer whose feature map yields the best result. For example, if the **Best Layer** is **feature3**, the best result comes from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the number of labeled samples per class used by the low-shot SVM.
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [resnet50_8xb32-coslr-100e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k.py) | feature5 | 84.64 | 39.65 | 49.86 | 62.48 | 69.50 | 74.48 | 78.31 | 81.06 | 82.56 |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k.py) | feature5 | 85.20 | 39.85 | 50.44 | 63.73 | 70.93 | 75.74 | 79.42 | 82.02 | 83.44 |
#### ImageNet Linear Evaluation
The **Feature1 - Feature5** results don't use GlobalAveragePooling: the feature map of each stage is pooled to a fixed dimension and then fed to a linear layer for classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb512-coslr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb512-coslr-90e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb32-coslr-100e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k.py) | 16.27 | 33.77 | 45.80 | 60.83 | 68.21 | 68.28 |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k.py) | 15.57 | 37.21 | 47.28 | 62.21 | 69.85 | 69.84 |
#### Places205 Linear Evaluation
The **Feature1 - Feature5** results don't use GlobalAveragePooling: the feature map of each stage is pooled to a fixed dimension and then fed to a linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- |
| [resnet50_8xb32-coslr-100e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k.py) | 21.32 | 35.66 | 43.05 | 50.79 | 53.27 |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k.py) | 21.17 | 35.85 | 43.49 | 50.99 | 54.10 |
#### ImageNet Nearest-Neighbor Classification
The results are obtained from the features after GlobalAveragePooling. Here, k=10 to 200 indicates different numbers of nearest neighbors.
| Self-Supervised Config | k=10 | k=20 | k=100 | k=200 |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | ---- | ---- | ----- | ----- |
| [resnet50_8xb32-coslr-100e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k.py) | 57.4 | 57.6 | 55.8 | 54.2 |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k.py) | 60.2 | 60.4 | 58.8 | 57.4 |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k_voc0712.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712.py) for details of config.
| Self-Supervised Config | AP50 |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb32-coslr-100e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k.py) | 79.80 |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k.py) | 79.85 |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x_coco.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py) for details of config.
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb32-coslr-100e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k.py) | 38.6 | 57.6 | 42.3 | 34.6 | 54.8 | 36.9 |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k.py) | 38.8 | 58.0 | 42.3 | 34.9 | 55.3 | 37.6 |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. It follows the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [fcn_r50-d8_512x512_20k_voc12aug.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) for details of config.
| Self-Supervised Config | mIOU |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb32-coslr-100e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k.py) | 48.35 |
| [resnet50_8xb32-coslr-200e](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k.py) | 46.27 |
## Citation
```bibtex
@inproceedings{chen2021exploring,
@@ -23,107 +107,3 @@ Siamese networks have become a common structure in various recent models for unsupervised
year={2021}
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet-1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates the layer whose feature map yields the best result. For example, if the **Best Layer** is **feature3**, the best result comes from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the number of labeled samples per class used by the low-shot SVM.
| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | ---------------------------------------------------------------------- | ---------- | --- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [resnet50_8xb32-coslr-100e](simsiam_resnet50_8xb32-coslr-100e_in1k.py) | | | | | | | | | | |
| [model]() | [resnet50_8xb32-coslr-200e](simsiam_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | | | | | |
### ImageNet Linear Evaluation
The **Feature1 - Feature5** results don't use GlobalAveragePooling: the feature map of each stage is pooled to a fixed dimension and then fed to a linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-90e_in1k.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-coslr-100e](simsiam_resnet50_8xb32-coslr-100e_in1k.py) | | | | | | |
| [model]() | [resnet50_8xb32-coslr-200e](simsiam_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-coslr-100e](simsiam_resnet50_8xb32-coslr-100e_in1k.py) | | | | | | |
| [model]() | [resnet50_8xb32-coslr-200e](simsiam_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-coslr-100e](simsiam_resnet50_8xb32-coslr-100e_in1k.py) | | | | | | |
| [model]() | [resnet50_8xb32-coslr-200e](simsiam_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`, and choose the best-performing setting for each method. The chosen setting is encoded in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier is indicated like `head1`, `head10`, `head100`.
- Please use `--deterministic` in this benchmark.
Please refer to the directories `configs/benchmarks/classification/imagenet/imagenet_1percent/` for 1% data and `configs/benchmarks/classification/imagenet/imagenet_10percent/` for 10% data for details.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | ----------------------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb32-coslr-100e](simsiam_resnet50_8xb32-coslr-100e_in1k.py) | | | |
| [model]() | [resnet50_8xb32-coslr-200e](simsiam_resnet50_8xb32-coslr-200e_in1k.py) | | | |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | ---------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb32-coslr-100e](simsiam_resnet50_8xb32-coslr-100e_in1k.py) | | |
| [model]() | [resnet50_8xb32-coslr-200e](simsiam_resnet50_8xb32-coslr-200e_in1k.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | ---------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb32-coslr-100e](simsiam_resnet50_8xb32-coslr-100e_in1k.py) | | | | | | |
| [model]() | [resnet50_8xb32-coslr-200e](simsiam_resnet50_8xb32-coslr-200e_in1k.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. It follows the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-coslr-100e](simsiam_resnet50_8xb32-coslr-100e_in1k.py) | |
| [model]() | [resnet50_8xb32-coslr-200e](simsiam_resnet50_8xb32-coslr-200e_in1k.py) | |
#### Cityscapes
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-coslr-100e](simsiam_resnet50_8xb32-coslr-100e_in1k.py) | |
| [model]() | [resnet50_8xb32-coslr-200e](simsiam_resnet50_8xb32-coslr-200e_in1k.py) | |

@@ -1,19 +1,96 @@
# SwAV
> [Unsupervised Learning of Visual Features by Contrasting Cluster Assignments](https://arxiv.org/abs/2006.09882)
<!-- [ABSTRACT] -->
<!-- [ALGORITHM] -->
## Abstract
Unsupervised image representations have significantly reduced the gap with supervised pretraining, notably with the recent achievements of contrastive learning methods. These contrastive methods typically work online and rely on a large number of explicit pairwise feature comparisons, which is computationally challenging. In this paper, we propose an online algorithm, SwAV, that takes advantage of contrastive methods without requiring to compute pairwise comparisons. Specifically, our method simultaneously clusters the data while enforcing consistency between cluster assignments produced for different augmentations (or “views”) of the same image, instead of comparing features directly as in contrastive learning. Simply put, we use a “swapped” prediction mechanism where we predict the code of a view from the representation of another view. Our method can be trained with large and small batches and can scale to unlimited amounts of data. Compared to previous contrastive methods, our method is more memory efficient since it does not require a large memory bank or a special momentum network. In addition, we also propose a new data augmentation strategy, multi-crop, that uses a mix of views with different resolutions in place of two full-resolution views, without increasing the memory or compute requirements.
<!-- [IMAGE] -->
<div align="center">
<img src="https://user-images.githubusercontent.com/36138628/149724517-9f1e7bdf-04c7-43e3-92f4-2b8fc1399123.png" width="500" />
</div>
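The “swapped” prediction mechanism compares cluster assignments rather than features: the code of one view is predicted from the representation of the other. A condensed sketch (our illustration; the tiny Sinkhorn step stands in for SwAV's equal-partition code computation):

```python
import torch
import torch.nn.functional as F

def sinkhorn(scores, eps=0.05, n_iters=3):
    """Tiny Sinkhorn-Knopp: soft codes with roughly equal cluster usage."""
    q = torch.exp(scores / eps).t()            # (K, N)
    q /= q.sum()
    for _ in range(n_iters):
        q /= q.sum(dim=1, keepdim=True); q /= q.shape[0]  # normalize clusters
        q /= q.sum(dim=0, keepdim=True); q /= q.shape[1]  # normalize samples
    return (q * q.shape[1]).t()                # (N, K), rows sum to 1

def swav_loss(z1, z2, prototypes, temperature=0.1):
    """Swapped prediction: each view predicts the other view's code.

    z1, z2: (N, D) L2-normalized projections; prototypes: (K, D), normalized.
    """
    scores1, scores2 = z1 @ prototypes.t(), z2 @ prototypes.t()
    with torch.no_grad():                      # codes carry no gradient
        q1, q2 = sinkhorn(scores1), sinkhorn(scores2)
    p1 = F.log_softmax(scores1 / temperature, dim=1)
    p2 = F.log_softmax(scores2 / temperature, dim=1)
    return -0.5 * ((q1 * p2).sum(1) + (q2 * p1).sum(1)).mean()
```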
## Results and Models
**Back to [model_zoo.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/model_zoo.md) to download models.**
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models are pre-trained on the ImageNet-1k dataset.
### Classification
The classification benchmarks include 4 downstream task datasets, **VOC**, **ImageNet**, **iNaturalist2018** and **Places205**. If not specified, the results are Top-1 (%).
#### VOC SVM / Low-shot SVM
The **Best Layer** indicates the layer whose feature map yields the best result. For example, if the **Best Layer** is **feature3**, the best result comes from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the number of labeled samples per class used by the low-shot SVM.
| Self-Supervised Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| [resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | feature5 | 87.00 | 44.68 | 55.41 | 67.64 | 73.67 | 78.14 | 81.58 | 83.98 | 85.15 |
#### ImageNet Linear Evaluation
The **Feature1 - Feature5** results don't use GlobalAveragePooling: the feature map of each stage is pooled to a fixed dimension and then fed to a linear layer for classification. Please refer to [resnet50_mhead_linear-8xb32-steplr-90e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_mhead_linear-8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [resnet50_linear-8xb32-coslr-100e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/resnet50_linear-8xb32-coslr-100e_in1k.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | 16.98 | 34.96 | 49.26 | 65.98 | 70.74 | 70.47 |
#### Places205 Linear Evaluation
The **Feature1 - Feature5** results don't use GlobalAveragePooling: the feature map of each stage is pooled to a fixed dimension and then fed to a linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) for details of config.
| Self-Supervised Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- |
| [resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | 23.33 | 35.45 | 43.13 | 51.98 | 55.09 |
#### ImageNet Nearest-Neighbor Classification
The results are obtained from the features after GlobalAveragePooling. Here, k=10 to 200 indicates different numbers of nearest neighbors.
| Self-Supervised Config | k=10 | k=20 | k=100 | k=200 |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---- | ---- | ----- | ----- |
| [resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | 60.5 | 60.6 | 59.0 | 57.6 |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k_voc0712.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712.py) for details of config.
| Self-Supervised Config | AP50 |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | 77.64 |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x_coco.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py) for details of config.
| Self-Supervised Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | 40.2 | 60.5 | 43.9 | 36.3 | 57.5 | 38.8 |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. It follows the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [fcn_r50-d8_512x512_20k_voc12aug.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py) for details of config.
| Self-Supervised Config | mIOU |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| [resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | 63.73 |
## Citation
```bibtex
@article{caron2020unsupervised,
@@ -23,99 +100,3 @@ Unsupervised image representations have significantly reduced the gap with supervised
year={2020}
}
```
## Models and Benchmarks
[Back to model_zoo.md](../../../docs/model_zoo.md)
On this page, we provide as many benchmarks as possible to evaluate our pre-trained models. If not mentioned, all models were trained on the ImageNet-1k dataset.
### VOC SVM / Low-shot SVM
The **Best Layer** indicates the layer whose feature map yields the best result. For example, if the **Best Layer** is **feature3**, the best result comes from the second stage of ResNet (1 for the stem layer, 2-5 for the 4 stage layers).
Besides, k=1 to 96 indicates the number of labeled samples per class used by the low-shot SVM.
| Model | Config | Best Layer | SVM | k=1 | k=2 | k=4 | k=8 | k=16 | k=32 | k=64 | k=96 |
| --------- | ---------------------------------------------------------------------------------------------- | ---------- | --- | --- | --- | --- | --- | ---- | ---- | ---- | ---- |
| [model]() | [resnet50_8xb32-mcrop-2-6-coslr-200e](swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | | | | | | | | | | |
### ImageNet Linear Evaluation
The **Feature1 - Feature5** results don't use GlobalAveragePooling: the feature map of each stage is pooled to a fixed dimension and then fed to a linear layer for classification. Please refer to [resnet50_mhead_8xb32-steplr-90e_in1k.py](../../benchmarks/classification/imagenet/resnet50_mhead_8xb32-steplr-90e_in1k.py) for details of config.
The **AvgPool** result is obtained from Linear Evaluation with GlobalAveragePooling. Please refer to [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-mcrop-2-6-coslr-200e](swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | | | | | | |
### iNaturalist2018 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-84e_inat18.py](../../benchmarks/classification/inaturalist2018/resnet50_mhead_8xb32-steplr-84e_inat18.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-mcrop-2-6-coslr-200e](swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | | | | | | |
### Places205 Linear Evaluation
Please refer to [resnet50_mhead_8xb32-steplr-28e_places205.py](../../benchmarks/classification/places205/resnet50_mhead_8xb32-steplr-28e_places205.py) and [file name]() for details of config.
| Model | Config | Feature1 | Feature2 | Feature3 | Feature4 | Feature5 | AvgPool |
| --------- | ---------------------------------------------------------------------------------------------- | -------- | -------- | -------- | -------- | -------- | ------- |
| [model]() | [resnet50_8xb32-mcrop-2-6-coslr-200e](swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | | | | | | |
### Semi-Supervised Classification
- In this benchmark, the necks or heads are removed and only the backbone CNN is evaluated by appending a linear classification head. All parameters are fine-tuned.
- When training with 1% ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence the performance. Hence, we prepare a list of settings with the base learning rate from `{0.001, 0.01, 0.1}` and the learning rate multiplier for the head from `{1, 10, 100}`, and choose the best-performing setting for each method. The chosen setting is encoded in the file name: the learning rate is indicated like `1e-1`, `1e-2`, `1e-3` and the learning rate multiplier is indicated like `head1`, `head10`, `head100`.
- Please use `--deterministic` in this benchmark.
Please refer to the directories `configs/benchmarks/classification/imagenet/imagenet_1percent/` for 1% data and `configs/benchmarks/classification/imagenet/imagenet_10percent/` for 10% data for details.
| Model | Pretrain Config | Fine-tuned Config | Top-1 (%) | Top-5 (%) |
| --------- | ---------------------------------------------------------------------------------------------- | ----------------- | --------- | --------- |
| [model]() | [resnet50_8xb32-mcrop-2-6-coslr-200e](swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | | | |
### Detection
The detection benchmarks include 2 downstream task datasets, **Pascal VOC 2007 + 2012** and **COCO2017**. This benchmark follows the evaluation protocols set up by MoCo.
#### Pascal VOC 2007 + 2012
Please refer to [faster_rcnn_r50_c4_mstrain_24k.py](../../benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k.py) for details of config.
| Model | Config | mAP | AP50 |
| --------- | ---------------------------------------------------------------------------------------------- | --- | ---- |
| [model]() | [resnet50_8xb32-mcrop-2-6-coslr-200e](swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | | |
#### COCO2017
Please refer to [mask_rcnn_r50_fpn_mstrain_1x.py](../../benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x.py) for details of config.
| Model | Config | mAP(Box) | AP50(Box) | AP75(Box) | mAP(Mask) | AP50(Mask) | AP75(Mask) |
| --------- | ---------------------------------------------------------------------------------------------- | -------- | --------- | --------- | --------- | ---------- | ---------- |
| [model]() | [resnet50_8xb32-mcrop-2-6-coslr-200e](swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | | | | | | |
### Segmentation
The segmentation benchmarks include 2 downstream task datasets, **Cityscapes** and **Pascal VOC 2012 + Aug**. It follows the evaluation protocols set up by MMSegmentation.
#### Pascal VOC 2012 + Aug
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-mcrop-2-6-coslr-200e](swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | |
#### Cityscapes
Please refer to [file]() for details of config.
| Model | Config | mIOU |
| --------- | ---------------------------------------------------------------------------------------------- | ---- |
| [model]() | [resnet50_8xb32-mcrop-2-6-coslr-200e](swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | |

@@ -2,188 +2,245 @@
## MMSelfSup
### v0.9.1 (31/05/2022)
#### Highlights
- Update **BYOL** models and results ([#319](https://github.com/open-mmlab/mmselfsup/pull/319))
- Refine some documentation
#### New Features
- Update **BYOL** models and results ([#319](https://github.com/open-mmlab/mmselfsup/pull/319))
#### Bug Fixes
- Set the qkv bias argument for CAE and MAE ([#303](https://github.com/open-mmlab/mmselfsup/pull/303))
- Fix spelling errors in MAE configs ([#307](https://github.com/open-mmlab/mmselfsup/pull/307))
#### Improvements
- Change file names ([#304](https://github.com/open-mmlab/mmselfsup/pull/304))
- Apply mdformat ([#311](https://github.com/open-mmlab/mmselfsup/pull/311))
#### Docs
- Fix typos in tutorials ([#308](https://github.com/open-mmlab/mmselfsup/pull/308))
- Configure Myst-parser ([#309](https://github.com/open-mmlab/mmselfsup/pull/309))
- Update the algorithm readme in the docs ([#310](https://github.com/open-mmlab/mmselfsup/pull/310))
- Refine the installation docs ([#317](https://github.com/open-mmlab/mmselfsup/pull/317))
- Refine the homepage README ([#318](https://github.com/open-mmlab/mmselfsup/pull/318))
### v0.9.0 (29/04/2022)
#### Highlights
- Support **CAE** ([#284](https://github.com/open-mmlab/mmselfsup/pull/284))
- Support **Barlow Twins** ([#207](https://github.com/open-mmlab/mmselfsup/pull/207))
#### New Features
- Support CAE ([#284](https://github.com/open-mmlab/mmselfsup/pull/284))
- Support Barlow Twins ([#207](https://github.com/open-mmlab/mmselfsup/pull/207))
- Add results of SimMIM 192 pre-training and 224 fine-tuning ([#280](https://github.com/open-mmlab/mmselfsup/pull/280))
- Add the MAE fp16 pre-training setting ([#271](https://github.com/open-mmlab/mmselfsup/pull/271))
#### Bug Fixes
- Fix the args problem ([#290](https://github.com/open-mmlab/mmselfsup/pull/290))
- Change imgs_per_gpu to samples_per_gpu in MAE configs ([#278](https://github.com/open-mmlab/mmselfsup/pull/278))
- Avoid GPU memory overflow when using the prefetch dataloader ([#277](https://github.com/open-mmlab/mmselfsup/pull/277))
- Fix the key error when registering custom hooks ([#273](https://github.com/open-mmlab/mmselfsup/pull/273))
#### Improvements
- Update SimCLR models and results ([#295](https://github.com/open-mmlab/mmselfsup/pull/295))
- Reduce memory usage in unit tests ([#291](https://github.com/open-mmlab/mmselfsup/pull/291))
- Remove pytorch 1.5 tests ([#288](https://github.com/open-mmlab/mmselfsup/pull/288))
- Rename linear evaluation config files ([#281](https://github.com/open-mmlab/mmselfsup/pull/281))
- Add unit tests for apis ([#276](https://github.com/open-mmlab/mmselfsup/pull/276))
#### Docs
- Add SimMIM to the model zoo and fix links ([#272](https://github.com/open-mmlab/mmselfsup/pull/272))
### v0.8.0 (31/03/2022)
#### Highlights
- Support **SimMIM** ([#239](https://github.com/open-mmlab/mmselfsup/pull/239))
- Add the **KNN** benchmark, supporting evaluation with intermediate checkpoints and extracted backbone weights ([#243](https://github.com/open-mmlab/mmselfsup/pull/243))
- Support the ImageNet-21k dataset ([#225](https://github.com/open-mmlab/mmselfsup/pull/225))
#### New Features
- Support SimMIM ([#239](https://github.com/open-mmlab/mmselfsup/pull/239))
- Add the KNN benchmark, supporting evaluation with intermediate checkpoints and extracted backbone weights ([#243](https://github.com/open-mmlab/mmselfsup/pull/243))
- Support the ImageNet-21k dataset ([#225](https://github.com/open-mmlab/mmselfsup/pull/225))
- Support automatically resuming training from checkpoint files ([#245](https://github.com/open-mmlab/mmselfsup/pull/245))
#### Bug Fixes
- Add a seed to the distributed sampler ([#250](https://github.com/open-mmlab/mmselfsup/pull/250))
- Fix the argument position problem in dist_test_svm_epoch.sh ([#260](https://github.com/open-mmlab/mmselfsup/pull/260))
- Fix the potential mkdir error in prepare_voc07_cls.sh ([#261](https://github.com/open-mmlab/mmselfsup/pull/261))
#### Improvements
- Update the command-line argument format ([#253](https://github.com/open-mmlab/mmselfsup/pull/253))
#### Docs
- Fix the command docs in 6_benchmarks.md ([#263](https://github.com/open-mmlab/mmselfsup/pull/263))
- Translate 6_benchmarks.md into Chinese ([#262](https://github.com/open-mmlab/mmselfsup/pull/262))
### v0.7.0 (03/03/2022)
#### Highlights
- Support the MAE algorithm ([#221](https://github.com/open-mmlab/mmselfsup/pull/221))
- Add the Places205 downstream benchmark ([#210](https://github.com/open-mmlab/mmselfsup/pull/210))
- Add Windows testing to the CI workflow ([#215](https://github.com/open-mmlab/mmselfsup/pull/215))
#### New Features
- Support the MAE algorithm ([#221](https://github.com/open-mmlab/mmselfsup/pull/221))
- Add the Places205 downstream benchmark ([#210](https://github.com/open-mmlab/mmselfsup/pull/210))
#### Bug Fixes
- Fix errors in some config files ([#200](https://github.com/open-mmlab/mmselfsup/pull/200))
- Fix the image channel reading issue and update related results ([#210](https://github.com/open-mmlab/mmselfsup/pull/210))
- Fix the mismatched output format of some datasets when using prefetch ([#218](https://github.com/open-mmlab/mmselfsup/pull/218))
- Fix the 'no init_cfg' error of t-SNE ([#222](https://github.com/open-mmlab/mmselfsup/pull/222))
#### Improvements
- Deprecate `imgs_per_gpu` in favor of `samples_per_gpu` in configs ([#204](https://github.com/open-mmlab/mmselfsup/pull/204))
- Update the installation method of MMCV ([#208](https://github.com/open-mmlab/mmselfsup/pull/208))
- Add pre-commit hooks for algorithm readmes and code copyright ([#213](https://github.com/open-mmlab/mmselfsup/pull/213))
- Add Windows testing to the CI workflow ([#215](https://github.com/open-mmlab/mmselfsup/pull/215))
#### Docs
- Translate 0_config.md into Chinese ([#216](https://github.com/open-mmlab/mmselfsup/pull/216))
- Update the OpenMMLab projects and introduction on the homepage ([#219](https://github.com/open-mmlab/mmselfsup/pull/219))
### v0.6.0 (02/02/2022)

#### Highlights

- Support vision-transformer-based MoCo v3 ([#194](https://github.com/open-mmlab/mmselfsup/pull/194))
- Speed up training and startup time ([#181](https://github.com/open-mmlab/mmselfsup/pull/181))
- Support CPU training ([#188](https://github.com/open-mmlab/mmselfsup/pull/188))

#### New Features

- Support vision-transformer-based MoCo v3 ([#194](https://github.com/open-mmlab/mmselfsup/pull/194))
- Support CPU training ([#188](https://github.com/open-mmlab/mmselfsup/pull/188))

#### Bug Fixes

- Fix the bugs mentioned in issues ([#159](https://github.com/open-mmlab/mmselfsup/issues/159), [#160](https://github.com/open-mmlab/mmselfsup/issues/160)) ([#161](https://github.com/open-mmlab/mmselfsup/pull/161))
- Fix the missing prob assignment in `RandomAppliedTrans` ([#173](https://github.com/open-mmlab/mmselfsup/pull/173))
- Fix the bug in displaying k-means losses ([#182](https://github.com/open-mmlab/mmselfsup/pull/182))
- Fix bugs in non-distributed multi-GPU training/testing ([#189](https://github.com/open-mmlab/mmselfsup/pull/189))
- Fix a bug when loading the cifar dataset ([#191](https://github.com/open-mmlab/mmselfsup/pull/191))
- Fix the argument bug of `dataset.evaluate` ([#192](https://github.com/open-mmlab/mmselfsup/pull/192))

#### Improvements

- Cancel previous unfinished runs in CI ([#145](https://github.com/open-mmlab/mmselfsup/pull/145))
- Enhance the MIM support ([#152](https://github.com/open-mmlab/mmselfsup/pull/152))
- Skip CI when only certain files are changed ([#154](https://github.com/open-mmlab/mmselfsup/pull/154))
- Add a `drop_last` option when building the eval optimizer ([#158](https://github.com/open-mmlab/mmselfsup/pull/158))
- Deprecate support for "python setup.py test" ([#174](https://github.com/open-mmlab/mmselfsup/pull/174))
- Speed up training and startup time ([#181](https://github.com/open-mmlab/mmselfsup/pull/181))
- Upgrade `isort` to 5.10.1 ([#184](https://github.com/open-mmlab/mmselfsup/pull/184))

#### Docs

- Refactor the documentation directory structure ([#146](https://github.com/open-mmlab/mmselfsup/pull/146))
- Fix readthedocs ([#148](https://github.com/open-mmlab/mmselfsup/pull/148), [#149](https://github.com/open-mmlab/mmselfsup/pull/149), [#153](https://github.com/open-mmlab/mmselfsup/pull/153))
- Fix typos and dead links in some docs ([#155](https://github.com/open-mmlab/mmselfsup/pull/155), [#180](https://github.com/open-mmlab/mmselfsup/pull/180), [#195](https://github.com/open-mmlab/mmselfsup/pull/195))
- Update training logs and benchmark results in the model zoo ([#157](https://github.com/open-mmlab/mmselfsup/pull/157), [#165](https://github.com/open-mmlab/mmselfsup/pull/165), [#195](https://github.com/open-mmlab/mmselfsup/pull/195))
- Update some docs and translate them into Chinese ([#163](https://github.com/open-mmlab/mmselfsup/pull/163), [#164](https://github.com/open-mmlab/mmselfsup/pull/164), [#165](https://github.com/open-mmlab/mmselfsup/pull/165), [#166](https://github.com/open-mmlab/mmselfsup/pull/166), [#167](https://github.com/open-mmlab/mmselfsup/pull/167), [#168](https://github.com/open-mmlab/mmselfsup/pull/168), [#169](https://github.com/open-mmlab/mmselfsup/pull/169), [#172](https://github.com/open-mmlab/mmselfsup/pull/172), [#176](https://github.com/open-mmlab/mmselfsup/pull/176), [#178](https://github.com/open-mmlab/mmselfsup/pull/178), [#179](https://github.com/open-mmlab/mmselfsup/pull/179))
- Update algorithm READMEs to the new format ([#177](https://github.com/open-mmlab/mmselfsup/pull/177))
### v0.5.0 (16/12/2021)

#### Highlights

- Released after code refactoring.
- Added 3 new self-supervised learning algorithms.
- Support benchmarks with MMDet and MMSeg.
- Added comprehensive documentation.

#### Refactor

- Merged redundant dataset files.
- Adapted to the new version of MMCV and removed code for old versions.
- Inherit from MMCV BaseModule.
- Optimized the directory structure.
- Renamed all config files.

#### New Features

- Added the SwAV, SimSiam, and DenseCL algorithms.
- Added a t-SNE visualization tool.
- Support fp16 via MMCV.

#### Benchmarks

- More benchmark results, covering classification, detection, and segmentation.
- Support several new datasets for downstream tasks.
- Launch MMDet and MMSeg training with MIM.

#### Docs

- Refactored the README, getting_started, install, and model_zoo docs.
- Added a data preparation document.
- Added comprehensive tutorials.
## OpenSelfSup (History)

### v0.3.0 (14/10/2020)

#### Highlights

- Support mixed-precision training.
- Improved GaussianBlur, doubling the training speed.
- More benchmark results.

#### Bug Fixes

- Fixed bugs in MoCo v2; results are now reproducible.
- Fixed bugs in BYOL.

#### New Features

- Mixed-precision training.
- Improved GaussianBlur, doubling the training speed of MoCo v2, SimCLR, and BYOL.
- More benchmark results, including Places, VOC, and COCO.
### v0.2.0 (26/6/2020)

#### Highlights

- Support BYOL.
- Support semi-supervised benchmarks.

#### Bug Fixes

- Fixed the hash id in publish_model.py.

#### New Features

- Support BYOL.
- Separate training and testing scripts for linear and semi-supervised evaluation.
- Support semi-supervised benchmarks: benchmarks/dist_train_semi.sh.
- Moved benchmark-related config files into configs/benchmarks/.
- Provide benchmark results and model download links.
- Support updating the network every few iterations.
- Support the LARS optimizer with Nesterov momentum.
- Support excluding specific parameters from LARS adaptation and weight decay, as required by SimCLR and BYOL.

View File

@ -1,71 +0,0 @@
# Contributing to OpenMMLab

All kinds of contributions are welcome, including but not limited to the following.

- Fix typos or bugs
- New features and components

## Workflow

1. Fork and pull the latest OpenMMLab repository (mmselfsup)
2. Checkout a new branch (do not use the master branch for PRs)
3. Commit your changes to your own forked remote repository
4. Create a PR against our repository

Note: If you plan to add new features that involve large changes, please open an issue for discussion first.

## Code style

### Python

We adopt [PEP8](https://www.python.org/dev/peps/pep-0008/) as the preferred code style.

We use the following tools for linting and formatting:

- [flake8](http://flake8.pycqa.org/en/latest/): a wrapper around several linter tools.
- [yapf](https://github.com/google/yapf): a formatter for Python files.
- [isort](https://github.com/timothycrosley/isort): a Python utility to sort imports.
- [markdownlint](https://github.com/markdownlint/markdownlint): a linter to check markdown files and flag style issues.
- [docformatter](https://github.com/myint/docformatter): a formatter for docstrings.

Style configurations of yapf and isort can be found in [setup.cfg](../setup.cfg).

We use [pre-commit hooks](https://pre-commit.com/) to automatically lint and format the code on every commit; the enabled checks include `flake8`, `yapf`, `isort`, `trailing whitespaces`, checks for `markdown files`, fixing `end-of-files`, `double-quoted-strings`, `python-encoding-pragma`, `mixed-line-ending`, sorting `requirements.txt`, and so on. The pre-commit hook configuration is stored in [.pre-commit-config](../.pre-commit-config.yaml).

After you clone the repository, install and initialize the pre-commit hook as follows.
```shell
pip install -U pre-commit
```
Then, from the repository folder, run:
```shell
pre-commit install
```
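After installation, the hooks run automatically on each commit. You can also trigger them manually over the whole repository; the following is standard pre-commit usage, not specific to this repo:

```shell
# run every configured hook against all files, not just staged ones
pre-commit run --all-files
```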
If you have trouble installing markdownlint, try installing ruby as follows:

```shell
# install rvm
curl -L https://get.rvm.io | bash -s -- --autolibs=read-fail
[[ -s "$HOME/.rvm/scripts/rvm" ]] && source "$HOME/.rvm/scripts/rvm"
rvm autolibs disable

# install ruby
rvm install 2.7.1
```

Or refer to [this repo](https://github.com/innerlee/setup) and follow the instructions to run [`zzruby.sh`](https://github.com/innerlee/setup/blob/master/zzruby.sh).

After this, the lint checks and formatters will be enforced on every commit.

> Before creating a PR, make sure your code passes the lint checks and is formatted by yapf.
### C++ and CUDA

We follow the [Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html).

View File

@ -92,5 +92,6 @@ html_css_files = ['css/readthedocs.css']
# Enable ::: for myst
myst_enable_extensions = ['colon_fence']
myst_heading_anchors = 4
master_doc = 'index'

21
docs/zh_cn/faq.md 100644
View File

@ -0,0 +1,21 @@
# FAQ

We list some common issues and their corresponding solutions here. If you find anything missing, feel free to enrich this list with a PR. If you cannot get help here, please create an issue with the [issue template](https://github.com/open-mmlab/mmselfsup/tree/master/.github/ISSUE_TEMPLATE), and fill in all the required information in the template, which helps us locate the problem faster.

## Installation

The compatible versions of MMCV, MMClassification, MMDetection and MMSegmentation for each MMSelfSup version are shown below. Please install the matching versions to avoid installation problems.
| MMSelfSup version | MMCV version | MMClassification version | MMSegmentation version | MMDetection version |
| :---------------: | :-----------------: | :-------------------------: | :--------------------: | :-----------------: |
| 0.9.1 (master) | mmcv-full >= 1.4.2 | mmcls >= 0.21.0 | mmseg >= 0.20.2 | mmdet >= 2.19.0 |
| 0.9.0 | mmcv-full >= 1.4.2 | mmcls >= 0.21.0 | mmseg >= 0.20.2 | mmdet >= 2.19.0 |
| 0.8.0 | mmcv-full >= 1.4.2 | mmcls >= 0.21.0 | mmseg >= 0.20.2 | mmdet >= 2.19.0 |
| 0.7.1 | mmcv-full >= 1.3.16 | mmcls >= 0.19.0, \<= 0.20.1 | mmseg >= 0.20.2 | mmdet >= 2.16.0 |
| 0.6.0 | mmcv-full >= 1.3.16 | mmcls >= 0.19.0 | mmseg >= 0.20.2 | mmdet >= 2.16.0 |
| 0.5.0 | mmcv-full >= 1.3.16 | / | mmseg >= 0.20.2 | mmdet >= 2.16.0 |
**Note:**

- If you already have mmcv installed, you need to run `pip uninstall mmcv` to remove it first. If both mmcv and mmcv-full are installed locally, a `ModuleNotFoundError` will be raised.
- If you still have questions about versions, feel free to create an issue and provide the information of your dependencies.
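As a concrete illustration of the first note, the sketch below removes the lite package and reinstalls the full one; the find-links URL assumes CUDA 11.3 and PyTorch 1.10, so adjust it to your own setup:

```shell
# uninstall the lite package so it cannot shadow mmcv-full
pip uninstall -y mmcv
# reinstall the full package built for your CUDA/PyTorch combination
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10/index.html
```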

View File

@ -1,17 +1,17 @@
# Getting Started

- [Getting Started](#%E5%9F%BA%E7%A1%80%E6%95%99%E7%A8%8B)
  - [Train existing algorithms](#%E8%AE%AD%E7%BB%83%E5%B7%B2%E6%9C%89%E7%9A%84%E7%AE%97%E6%B3%95)
    - [Train with CPU](#%E4%BD%BF%E7%94%A8-cpu-%E8%AE%AD%E7%BB%83)
    - [Train with single/multiple GPUs](#%E4%BD%BF%E7%94%A8-%E5%8D%95%E5%BC%A0%E5%A4%9A%E5%BC%A0-%E6%98%BE%E5%8D%A1%E8%AE%AD%E7%BB%83)
    - [Train with multiple machines](#%E4%BD%BF%E7%94%A8%E5%A4%9A%E5%8F%B0%E6%9C%BA%E5%99%A8%E8%AE%AD%E7%BB%83)
    - [Launch multiple jobs on a single machine](#%E5%9C%A8%E4%B8%80%E5%8F%B0%E6%9C%BA%E5%99%A8%E4%B8%8A%E5%90%AF%E5%8A%A8%E5%A4%9A%E4%B8%AA%E4%BB%BB%E5%8A%A1)
  - [Benchmarks](#%E5%9F%BA%E5%87%86%E6%B5%8B%E8%AF%95)
  - [Tools and tips](#%E5%B7%A5%E5%85%B7%E5%92%8C%E5%BB%BA%E8%AE%AE)
    - [Count model parameters](#%E7%BB%9F%E8%AE%A1%E6%A8%A1%E5%9E%8B%E7%9A%84%E5%8F%82%E6%95%B0)
    - [Publish a model](#%E5%8F%91%E5%B8%83%E6%A8%A1%E5%9E%8B)
    - [Model visualization with t-SNE](#%E4%BD%BF%E7%94%A8-t-sne-%E6%9D%A5%E5%81%9A%E6%A8%A1%E5%9E%8B%E5%8F%AF%E8%A7%86%E5%8C%96)
    - [Reproducibility](#%E5%8F%AF%E5%A4%8D%E7%8E%B0%E6%80%A7)
This document provides basic tutorials on the usage of MMSelfSup. For questions about installing MMSelfSup and its dependencies, please refer to the [installation guide](install.md).
@ -149,7 +149,6 @@ python tools/analysis_tools/count_parameters.py ${CONFIG_FILE}
python tools/model_converters/publish_model.py ${INPUT_FILENAME} ${OUTPUT_FILENAME}
```
### Model visualization with t-SNE

We provide an out-of-the-box tool for visualizing image embeddings:
@ -165,6 +164,6 @@ python tools/analysis_tools/visualize_tsne.py ${CONFIG_FILE} --checkpoint ${CKPT
- `WORK_DIR`: the directory to save the visualization results.
- `[optional arguments]`: optional arguments; see [visualize_tsne.py](../../tools/analysis_tools/visualize_tsne.py) for details.
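An illustrative invocation, using a SimSiam config and checkpoint as placeholders and assuming the work directory is passed via `--work-dir`:

```shell
python tools/analysis_tools/visualize_tsne.py \
    configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k.py \
    --checkpoint work_dirs/simsiam/latest.pth \
    --work-dir work_dirs/tsne
```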
### Reproducibility

If you want to make the model accuracy reproducible, you can set the `--deterministic` argument. However, enabling `--deterministic` turns off `torch.backends.cudnn.benchmark`, which slows down training.
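A sketch of a deterministic launch, assuming the distributed training script forwards extra arguments such as `--seed` and `--deterministic` to tools/train.py:

```shell
bash tools/dist_train.sh ${CONFIG_FILE} 8 --seed 0 --deterministic
```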

View File

@ -13,7 +13,9 @@ Welcome to MMSelfSup's documentation!
:caption: Get Started
install.md
getting_started.md
prepare_data.md
get_started.md
model_zoo.md
.. toctree::
:maxdepth: 1
@ -53,8 +55,9 @@ Welcome to MMSelfSup's documentation!
:maxdepth: 1
:caption: Notes
community/CONTRIBUTING.md
changelog.md
compatibility.md
.. toctree::
:caption: Switch Language

View File

@ -1,157 +1,71 @@
# Installation

# Prerequisites

## Requirements

In this section we demonstrate how to prepare an environment with PyTorch.
- Linux (Windows is not officially supported)
- Python 3.6+
- PyTorch 1.5+
- CUDA 9.2+
- GCC 5+
- [mmcv](https://github.com/open-mmlab/mmcv) 1.4.2+
- [mmcls](https://mmclassification.readthedocs.io/en/latest/install.html) 0.21.0+
- [mmdet](https://mmdetection.readthedocs.io/en/latest/get_started.html#installation) 2.16.0+
- [mmseg](https://mmsegmentation.readthedocs.io/en/latest/get_started.html#installation) 0.20.2+
MMSelfSup works on Linux (Windows and macOS are not fully supported). It requires Python 3.6+, CUDA 9.2+ and PyTorch 1.5+.

The compatible versions of MMCV, MMClassification, MMDetection and MMSegmentation for each MMSelfSup version are shown below. Please install the matching versions to avoid installation problems.

If you are familiar with PyTorch, or already have it installed, you can skip this part and move on to the [next section](#%E5%AE%89%E8%A3%85); otherwise, follow the steps below to prepare the environment.
| MMSelfSup version | MMCV version | MMClassification version | MMSegmentation version | MMDetection version |
| :---------------: | :-----------------: | :------------------------: | :--------------------: | :-----------------: |
| 0.9.0 (master) | mmcv-full >= 1.4.2 | mmcls >= 0.21.0 | mmseg >= 0.20.2 | mmdet >= 2.16.0 |
| 0.8.0 | mmcv-full >= 1.4.2 | mmcls >= 0.21.0 | mmseg >= 0.20.2 | mmdet >= 2.16.0 |
| 0.7.1 | mmcv-full >= 1.3.16 | mmcls >= 0.19.0, <= 0.20.1 | mmseg >= 0.20.2 | mmdet >= 2.16.0 |
| 0.6.0 | mmcv-full >= 1.3.16 | mmcls >= 0.19.0 | mmseg >= 0.20.2 | mmdet >= 2.16.0 |
| 0.5.0 | mmcv-full >= 1.3.16 | / | mmseg >= 0.20.2 | mmdet >= 2.16.0 |
**Step 0.** Download and install Miniconda from the [official website](https://docs.conda.io/en/latest/miniconda.html).

**Note:**

- If you already have mmcv installed, you need to run `pip uninstall mmcv` to remove it first. If both mmcv and mmcv-full are installed locally, a `ModuleNotFoundError` will be raised.
- Since MMSelfSup imports some backbones from MMClassification, you must install MMClassification before using MMSelfSup.
- If you do not need the MMDetection and MMSegmentation benchmarks, installing them is optional.

## Set up the environment

1. First create and activate a conda virtual environment with the following commands:
```shell
conda create -n openmmlab python=3.7 -y
conda activate openmmlab
```
2. Install torch and torchvision following the [official instructions](https://pytorch.org/), e.g.:
```shell
conda install pytorch torchvision -c pytorch
```
Make sure your PyTorch and CUDA versions match; see the [PyTorch website](https://pytorch.org/) for details.

For example, if you installed CUDA 10.1 under `/usr/local/cuda` and want to install PyTorch 1.7, use the following command to install the PyTorch build pre-compiled for CUDA 10.1.
```shell
conda install pytorch==1.7.0 torchvision==0.8.0 cudatoolkit=10.1 -c pytorch
```
If you build PyTorch from source instead of installing a pre-built package, you have more choices of CUDA versions, such as 9.0.

## Install MMSelfSup

1. Install MMCV and MMClassification
```shell
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
```
Replace `{cu_version}` and `{torch_version}` in the URL above with the versions you want. For example, to install the latest `mmcv-full` for `CUDA 11.0` and `PyTorch 1.7.x`, use the following command:
```shell
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu110/torch1.7/index.html
```
- PyTorch is usually compatible between versions 1.x.0 and 1.x.1, so mmcv-full only provides pre-built packages for 1.x.0. If your PyTorch version is 1.x.1, you can safely install the mmcv-full package built for 1.x.0.

You can find the MMCV versions compatible with different PyTorch and CUDA versions [here](https://github.com/open-mmlab/mmcv#installation).

Alternatively, you can compile MMCV from source; see the [MMCV installation guide](https://github.com/open-mmlab/mmcv#installation) for details.

Install MMClassification with the following command:
```shell
pip install mmcls
```
2. Clone and install MMSelfSup
```shell
git clone https://github.com/open-mmlab/mmselfsup.git
cd mmselfsup
pip install -v -e .
```
**Note:**

- When specifying `-e` or `develop`, MMSelfSup is installed in development mode: any local modifications take effect immediately without reinstallation.

3. Install MMSegmentation and MMDetection

You can install MMSegmentation and MMDetection with the following command:
```shell
pip install mmsegmentation mmdet
```
Besides pip, you can also install MMSegmentation and MMDetection with [mim](https://github.com/open-mmlab/mim), e.g.:
```shell
pip install openmim
mim install mmdet
mim install mmsegmentation
```
## Installation script from scratch

The script below provides all the commands needed to install MMSelfSup end to end with conda.

**Step 1.** Create a conda environment and activate it.
```shell
conda create -n openmmlab python=3.7 -y
conda create --name openmmlab python=3.8 -y
conda activate openmmlab
```
conda install -c pytorch pytorch torchvision -y
**Step 2.** Install PyTorch following the [official instructions](https://pytorch.org/get-started/locally/), e.g.:
# install the latest mmcv
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.7.0/index.html
On GPU platforms:
# install mmdetection mmsegmentation
pip install mmsegmentation mmdet
```shell
conda install pytorch torchvision -c pytorch
```
On CPU platforms:
```shell
conda install pytorch torchvision cpuonly -c pytorch
```
# Installation

We recommend users follow our best practices to install MMSelfSup, although the whole process is customizable; see the [Customized installation](#%E8%87%AA%E5%AE%9A%E4%B9%89%E5%AE%89%E8%A3%85) section for details.

## Best practices

**Step 0.** Install [MMCV](https://github.com/open-mmlab/mmcv) using [MIM](https://github.com/open-mmlab/mim).
```shell
pip install -U openmim
mim install mmcv-full
```
**Step 1.** Install MMSelfSup.

Case a: If you develop and run MMSelfSup directly, install it from source:
```shell
git clone https://github.com/open-mmlab/mmselfsup.git
cd mmselfsup
pip install -v -e .
# "-v" means verbose, or more output
# "-e" means installing a project in editable mode,
# thus any local modifications made to the code will take effect without reinstallation.
```
## Alternative: use Docker

We provide a [Dockerfile](/docker/Dockerfile) with the full environment already set up.

Case b: If you use MMSelfSup as a dependency or third-party package, install it with pip:
```shell
# build an image with PyTorch 1.10.0, CUDA 11.3, CUDNN 8.
docker build -f ./docker/Dockerfile --rm -t mmselfsup:torch1.10.0-cuda11.3-cudnn8 .
pip install mmselfsup
```
**Important:** Make sure you have installed the [nvidia-container-toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker).

Then run:
```shell
docker run --gpus all --shm-size=8g -it -v {DATA_DIR}:/workspace/mmselfsup/data mmselfsup:torch1.10.0-cuda11.3-cudnn8 /bin/bash
```
`{DATA_DIR}` is the root directory that holds all your datasets.

## Verify the installation

After the steps above, run the following script to verify that MMSelfSup and its dependencies are installed correctly:
```python
import torch
from mmselfsup.models import build_algorithm
@ -182,7 +96,103 @@ loss = model.forward_train(image, label)
If you can run the script above successfully, congratulations: your environment is fully set up.

## Using multiple MMSelfSup versions

## Customized installation

### Benchmarks

Following the [best practices](#%E6%9C%80%E4%BC%98%E6%96%B9%E6%A1%88) guarantees the basic functions. If you need downstream tasks, such as detection or segmentation, to evaluate your pre-trained model, please install [MMDetection](https://github.com/open-mmlab/mmdetection) and [MMSegmentation](https://github.com/open-mmlab/mmsegmentation).

If you do not run the MMDetection and MMSegmentation benchmarks, you do not have to install them.
You can install them with the following command:
```shell
pip install mmdet mmsegmentation
```
For more details, please refer to the installation pages of [MMDetection](https://github.com/open-mmlab/mmdetection/blob/master/docs/zh_cn/get_started.md) and [MMSegmentation](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/zh_cn/get_started.md).

### CUDA versions

When installing PyTorch, you need to pick a CUDA version. If you are unsure which one to choose, follow our suggestions:

- For Ampere-architecture NVIDIA GPUs, such as the GeForce 30 series and NVIDIA A100, CUDA 11 is required.
- For older NVIDIA GPUs, CUDA 11 is backward compatible, but CUDA 10.2 offers better compatibility and is more lightweight.

Make sure your GPU driver satisfies the minimum version requirement; see [this table](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cuda-major-component-versions__table-cuda-toolkit-driver-versions) for more information.
```{note}
Installing the CUDA runtime libraries is enough if you follow our best practices, because no CUDA code will be compiled locally. However, if you want to compile MMCV from source or develop other CUDA operators, you need to install the complete CUDA toolkit from the NVIDIA website (https://developer.nvidia.com/cuda-downloads), and its version must match the CUDA version of PyTorch, i.e. the cudatoolkit version specified in the `conda install` command.
```
### Install MMCV without MIM

MMCV contains C++ and CUDA extensions, so it depends on PyTorch in a complex way. MIM resolves such dependencies automatically and makes installation easier; however, it is not a must.

To install MMCV with pip instead of MIM, please follow the [MMCV installation guide](https://mmcv.readthedocs.io/en/latest/get_started/installation.html). This requires manually specifying a find-links URL based on the PyTorch version and its CUDA version.

For example, the following command installs mmcv-full built for PyTorch 1.10.x and CUDA 11.3.
```shell
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10/index.html
```
### Alternative: use Docker

We provide a [Dockerfile](/docker/Dockerfile) with the full environment already set up.
```shell
# build an image with PyTorch 1.10.0, CUDA 11.3, CUDNN 8.
docker build -f ./docker/Dockerfile --rm -t mmselfsup:torch1.10.0-cuda11.3-cudnn8 .
```
**Important:** Make sure you have installed the [nvidia-container-toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker).

Then run:
```shell
docker run --gpus all --shm-size=8g -it -v {DATA_DIR}:/workspace/mmselfsup/data mmselfsup:torch1.10.0-cuda11.3-cudnn8 /bin/bash
```
`{DATA_DIR}` is the root directory that holds all your datasets.

### Install on Google Colab

[Google Colab](https://research.google.com/) usually has PyTorch installed already, so we only need to install MMCV and MMSelfSup with the following commands.

**Step 0.** Install [MMCV](https://github.com/open-mmlab/mmcv) using [MIM](https://github.com/open-mmlab/mim).
```shell
!pip3 install openmim
!mim install mmcv-full
```
**Step 1.** Install MMSelfSup
```shell
!git clone https://github.com/open-mmlab/mmselfsup.git
%cd mmselfsup
!pip install -e .
```
**Step 2.** Verify the installation
```python
import mmselfsup
print(mmselfsup.__version__)
# Example output: 0.9.0
```
```{note}
Within Jupyter, the exclamation mark `!` is used to call external executables and `%cd` is a [magic command](https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-cd) to change the current working directory of Python.
```
## Troubleshooting

If you run into problems during installation, please check the [FAQ](faq.md) page first.

If you cannot find an answer there, you can also [open an issue](https://github.com/open-mmlab/mmselfsup/issues/new/choose) on GitHub.

# Using multiple MMSelfSup versions

If multiple versions of MMSelfSup are installed on your machine, we recommend creating a separate virtual environment for each of them.
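A minimal sketch of one environment per version; the `mmselfsup` package name comes from the pip instructions above, and the pinned version is just an example:

```shell
conda create -n mmselfsup-0.9.1 python=3.8 -y
conda activate mmselfsup-0.9.1
pip install mmselfsup==0.9.1
```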

View File

@ -4,27 +4,28 @@
## Pre-trained models
| Algorithm | Config | Download |
| --------- | ------ | -------- |
| [Relative Location](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/README.md) | [relative-loc_resnet50_8xb64-steplr-70e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k_20220225-84784688.pth) \| [log](https://download.openmmlab.com/mmselfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k_20220211_124808.log.json) |
| [Rotation Prediction](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/README.md) | [rotation-pred_resnet50_8xb16-steplr-70e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k_20220225-5b9f06a0.pth) \| [log](https://download.openmmlab.com/mmselfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k_20220215_185303.log.json) |
| [DeepCluster](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/deepcluster/README.md) | [deepcluster-sobel_resnet50_8xb64-steplr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/deepcluster/deepcluster-sobel_resnet50_8xb64-steplr-200e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/deepcluster/deepcluster-sobel_resnet50_8xb64-steplr-200e_in1k-bb8681e2.pth) |
| [NPID](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/npid/README.md) | [npid_resnet50_8xb32-steplr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k_20220225-5fbbda2a.pth) \| [log](https://download.openmmlab.com/mmselfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k_20220215_185513.log.json) |
| [ODC](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/odc/README.md) | [odc_resnet50_8xb64-steplr-440e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/odc/odc_resnet50_8xb64-steplr-440e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/odc/odc_resnet50_8xb64-steplr-440e_in1k_20220225-a755d9c0.pth) \| [log](https://download.openmmlab.com/mmselfsup/odc/odc_resnet50_8xb64-steplr-440e_in1k_20220215_235245.log.json) |
| [SimCLR](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/README.md) | [simclr_resnet50_8xb32-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k_20220428-46ef6bb9.pth) \| [log](https://download.openmmlab.com/mmselfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k_20220411_182427.log.json) |
| | [simclr_resnet50_16xb256-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_16xb256-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/simclr/simclr_resnet50_16xb256-coslr-200e_in1k_20220428-8c24b063.pth) \| [log](https://download.openmmlab.com/mmselfsup/simclr/simclr_resnet50_16xb256-coslr-200e_in1k_20220423_205520.log.json) |
| [MoCo v2](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov2/README.md) | [mocov2_resnet50_8xb32-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/moco/mocov2_resnet50_8xb32-coslr-200e_in1k_20220225-89e03af4.pth) \| [log](https://download.openmmlab.com/mmselfsup/moco/mocov2_resnet50_8xb32-coslr-200e_in1k_20220210_110905.log.json) |
| [BYOL](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/README.md) | [byol_resnet50_8xb32-accum16-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k_20220225-5c8b2c2e.pth) \| [log](https://download.openmmlab.com/mmselfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k_20220214_115709.log.json) |
| | [byol_resnet50_16xb256-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_16xb256-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/byol/byol_resnet50_16xb256-coslr-200e_in1k_20220527-b6f8eedd.pth) \| [log](https://download.openmmlab.com/mmselfsup/byol/byol_resnet50_16xb256-coslr-200e_in1k_20220509_213052.log.json) |
| | [byol_resnet50_8xb32-accum16-coslr-300e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-300e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/byol/byol_resnet50_8xb32-accum16-coslr-300e_in1k_20220225-a0daa54a.pth) \| [log](https://download.openmmlab.com/mmselfsup/byol/byol_resnet50_8xb32-accum16-coslr-300e_in1k_20220210_095852.log.json) |
| [SwAV](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/swav/README.md) | [swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | [model](https://download.openmmlab.com/mmselfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96_20220225-0497dd5d.pth) \| [log](https://download.openmmlab.com/mmselfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96_20220211_061131.log.json) |
| [DenseCL](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/densecl/README.md) | [densecl_resnet50_8xb32-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k_20220225-8c7808fe.pth) \| [log](https://download.openmmlab.com/mmselfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k_20220215_041207.log.json) |
| [SimSiam](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/README.md) | [simsiam_resnet50_8xb32-coslr-100e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k_20220225-68a88ad8.pth) \| [log](https://download.openmmlab.com/mmselfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k_20220210_195405.log.json) |
| | [simsiam_resnet50_8xb32-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k_20220225-2f488143.pth) \| [log](https://download.openmmlab.com/mmselfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k_20220210_195402.log.json) |
| [BarlowTwins](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/barlowtwins/README.md) | [barlowtwins_resnet50_8xb256-coslr-300e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/barlowtwins/barlowtwins_resnet50_8xb256-coslr-300e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/barlowtwins/barlowtwins_resnet50_8xb256-coslr-300e_in1k_20220419-5ae15f89.pth) \| [log](https://download.openmmlab.com/mmselfsup/barlowtwins/barlowtwins_resnet50_8xb256-coslr-300e_in1k_20220413_111555.log.json) |
| [MoCo v3](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov3/README.md) | [mocov3_vit-small-p16_32xb128-fp16-coslr-300e_in1k-224](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov3/mocov3_vit-small-p16_32xb128-fp16-coslr-300e_in1k-224.py) | [model](https://download.openmmlab.com/mmselfsup/moco/mocov3_vit-small-p16_32xb128-fp16-coslr-300e_in1k-224_20220225-e31238dd.pth) \| [log](https://download.openmmlab.com/mmselfsup/moco/mocov3_vit-small-p16_32xb128-fp16-coslr-300e_in1k-224_20220222_160222.log.json) |
| [MAE](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mae/README.md) | [mae_vit-base-p16_8xb512-coslr-400e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mae/mae_vit-base-p16_8xb512-coslr-400e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/mae/mae_vit-base-p16_8xb512-coslr-400e_in1k-224_20220223-85be947b.pth) \| [log](https://download.openmmlab.com/mmselfsup/mae/mae_vit-base-p16_8xb512-coslr-300e_in1k-224_20220210_140925.log.json) |
| [SimMIM](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simmim/README.md) | [simmim_swin-base_16xb128-coslr-100e_in1k-192](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192.py) | [model](https://download.openmmlab.com/mmselfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192_20220316-1d090125.pth) \| [log](https://download.openmmlab.com/mmselfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192_20220316-1d090125.log.json) |
| [CAE](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/cae/README.md) | [cae_vit-base-p16_8xb256-fp16-coslr-300e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/cae/cae_vit-base-p16_8xb256-fp16-coslr-300e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/cae/cae_vit-base-p16_16xb256-coslr-300e_in1k-224_20220427-4c786349.pth) \| [log](https://download.openmmlab.com/mmselfsup/cae/cae_vit-base-p16_16xb256-coslr-300e_in1k-224_20220427-4c786349.log.json) |
Remarks:
@ -40,37 +41,39 @@
Unless otherwise specified, the experiments below use the settings of [MoCo](http://openaccess.thecvf.com/content_CVPR_2020/papers/He_Momentum_Contrast_for_Unsupervised_Visual_Representation_Learning_CVPR_2020_paper.pdf), or the training settings are listed in the remarks.
| Algorithm | Config | Remarks | Top-1 (%) |
| --------- | ------ | ------- | --------- |
| Relative Location | [relative-loc_resnet50_8xb64-steplr-70e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | | 38.78 |
| Rotation Prediction | [rotation-pred_resnet50_8xb16-steplr-70e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | | 48.12 |
| DeepCluster | [deepcluster-sobel_resnet50_8xb64-steplr-200e_in1k.py](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/deepcluster/deepcluster-sobel_resnet50_8xb64-steplr-200e_in1k.py) | | 46.92 |
| NPID | [npid_resnet50_8xb32-steplr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/npid/npid_resnet50_8xb32-steplr-200e_in1k.py) | | 58.97 |
| ODC | [odc_resnet50_8xb64-steplr-440e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/odc/odc_resnet50_8xb64-steplr-440e_in1k.py) | | 53.43 |
| SimCLR | [simclr_resnet50_8xb32-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_8xb32-coslr-200e_in1k.py) | SimSiam paper setting | 62.56 |
| | [simclr_resnet50_16xb256-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simclr/simclr_resnet50_16xb256-coslr-200e_in1k.py) | SimSiam paper setting | 66.66 |
| MoCo v2 | [mocov2_resnet50_8xb32-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py) | | 67.58 |
| BYOL | [byol_resnet50_8xb32-accum16-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-200e_in1k.py) | SimSiam paper setting | 71.72 |
| | [byol_resnet50_16xb256-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_16xb256-coslr-200e_in1k.py) | SimSiam paper setting | 71.88 |
| | [byol_resnet50_8xb32-accum16-coslr-300e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/byol/byol_resnet50_8xb32-accum16-coslr-300e_in1k.py) | SimSiam paper setting | 72.93 |
| SwAV | [swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/swav/swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py) | SwAV paper setting | 70.47 |
| DenseCL | [densecl_resnet50_8xb32-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k.py) | | 63.62 |
| SimSiam | [simsiam_resnet50_8xb32-coslr-100e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-100e_in1k.py) | SimSiam paper setting | 68.28 |
| | [simsiam_resnet50_8xb32-coslr-200e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simsiam/simsiam_resnet50_8xb32-coslr-200e_in1k.py) | SimSiam paper setting | 69.84 |
| Barlow Twins | [barlowtwins_resnet50_8xb256-coslr-300e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/barlowtwins/barlowtwins_resnet50_8xb256-coslr-300e_in1k.py) | Barlow Twins paper setting | 71.66 |
| MoCo v3 | [mocov3_vit-small-p16_32xb128-fp16-coslr-300e_in1k-224](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mocov3/mocov3_vit-small-p16_32xb128-fp16-coslr-300e_in1k-224.py) | MoCo v3 paper setting | 73.19 |
### ImageNet Fine-tuning

| Algorithm | Config | Remarks | Top-1 (%) |
| --------- | ------ | ------- | --------- |
| MAE | [mae_vit-base-p16_8xb512-coslr-400e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mae/mae_vit-base-p16_8xb512-coslr-400e_in1k.py) | | 83.1 |
| SimMIM | [simmim_swin-base_16xb128-coslr-100e_in1k-192](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/simmim/simmim_swin-base_16xb128-coslr-100e_in1k-192.py) | | 82.9 |
| CAE | [cae_vit-base-p16_8xb256-fp16-coslr-300e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/cae/cae_vit-base-p16_8xb256-fp16-coslr-300e_in1k.py) | | 83.2 |
### COCO17 Object Detection and Instance Segmentation

For object detection and instance segmentation on COCO17, we adopt the evaluation settings of [MoCo](http://openaccess.thecvf.com/content_CVPR_2020/papers/He_Momentum_Contrast_for_Unsupervised_Visual_Representation_Learning_CVPR_2020_paper.pdf), based on the Mask R-CNN FPN architecture. The results below are trained with the same [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmdetection/coco/mask_rcnn_r50_fpn_mstrain_1x_coco.py).

| Algorithm | Config | mAP (Box) | mAP (Mask) |
| ------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------- | ---------- |
| Relative Location | [relative-loc_resnet50_8xb64-steplr-70e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | 37.5 | 33.7 |
| Rotation Prediction | [rotation-pred_resnet50_8xb16-steplr-70e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | 37.9 | 34.2 |
@ -86,7 +89,7 @@
For semantic segmentation on Pascal VOC12 Aug, we adopt the evaluation settings of [MMSeg](https://github.com/open-mmlab/mmsegmentation), based on the FCN architecture. The results below are trained with the same [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/mmsegmentation/voc12aug/fcn_r50-d8_512x512_20k_voc12aug.py).

| Algorithm | Config | mIOU |
| ------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----- |
| Relative Location | [relative-loc_resnet50_8xb64-steplr-70e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k.py) | 63.49 |
| Rotation Prediction | [rotation-pred_resnet50_8xb16-steplr-70e_in1k](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/rotation_pred/rotation-pred_resnet50_8xb16-steplr-70e_in1k.py) | 64.31 |

View File

@ -2,14 +2,14 @@
MMSelfSup supports multiple datasets. Please follow the corresponding data preparation guides. It is recommended to symlink your dataset root to `$MMSELFSUP/data`. If your folder structure differs, you may need to change the corresponding paths in the config files.

- [Prepare ImageNet](#%E5%87%86%E5%A4%87-imagenet-%E6%95%B0%E6%8D%AE%E9%9B%86)
- [Prepare Places205](#%E5%87%86%E5%A4%87-places205-%E6%95%B0%E6%8D%AE%E9%9B%86)
- [Prepare iNaturalist2018](#%E5%87%86%E5%A4%87-inaturalist2018-%E6%95%B0%E6%8D%AE%E9%9B%86)
- [Prepare PASCAL VOC](#%E5%87%86%E5%A4%87-pascal-voc-%E6%95%B0%E6%8D%AE%E9%9B%86)
- [Prepare CIFAR10](#%E5%87%86%E5%A4%87-cifar10-%E6%95%B0%E6%8D%AE%E9%9B%86)
- [Prepare datasets for detection and segmentation](#%E5%87%86%E5%A4%87%E6%A3%80%E6%B5%8B%E5%92%8C%E5%88%86%E5%89%B2%E6%95%B0%E6%8D%AE%E9%9B%86)
  - [Detection](#%E6%A3%80%E6%B5%8B)
  - [Segmentation](#%E5%88%86%E5%89%B2)
```
mmselfsup
@ -43,8 +43,8 @@ mmselfsup
1. Register an account and log in to the [download page](http://www.image-net.org/download-images)
2. Find the download links for ILSVRC2012 and download the following two files
   - ILSVRC2012_img_train.tar (~138GB)
   - ILSVRC2012_img_val.tar (~6.3GB)
3. Untar the downloaded files
4. Download the metadata using this [script](https://github.com/BVLC/caffe/blob/master/data/ilsvrc12/get_ilsvrc_aux.sh)

View File

@ -4,21 +4,21 @@ MMSelfSup 主要使用python文件作为配置。我们设计的配置文件系
<!-- TOC -->
- [Tutorial 0: Learn about Configs](#%E6%95%99%E7%A8%8B-0-%E5%AD%A6%E4%B9%A0%E9%85%8D%E7%BD%AE)
  - [Config file and checkpoint naming convention](#%E9%85%8D%E7%BD%AE%E6%96%87%E4%BB%B6%E4%B8%8E%E6%A3%80%E6%9F%A5%E7%82%B9%E5%91%BD%E5%90%8D%E7%BA%A6%E5%AE%9A)
    - [Algorithm information](#%E7%AE%97%E6%B3%95%E4%BF%A1%E6%81%AF)
    - [Module information](#%E6%A8%A1%E5%9D%97%E4%BF%A1%E6%81%AF)
    - [Training information](#%E8%AE%AD%E7%BB%83%E4%BF%A1%E6%81%AF)
    - [Data information](#%E6%95%B0%E6%8D%AE%E4%BF%A1%E6%81%AF)
    - [Config filename example](#%E9%85%8D%E7%BD%AE%E6%96%87%E4%BB%B6%E5%91%BD%E5%90%8D%E7%A4%BA%E4%BE%8B)
    - [Checkpoint naming convention](#%E6%A3%80%E6%9F%A5%E7%82%B9%E5%91%BD%E5%90%8D%E7%BA%A6%E5%AE%9A)
  - [Config file structure](#%E9%85%8D%E7%BD%AE%E6%96%87%E4%BB%B6%E7%BB%93%E6%9E%84)
  - [Inherit and modify config files](#%E7%BB%A7%E6%89%BF%E5%92%8C%E4%BF%AE%E6%94%B9%E9%85%8D%E7%BD%AE%E6%96%87%E4%BB%B6)
    - [Use intermediate variables in configs](#%E4%BD%BF%E7%94%A8%E9%85%8D%E7%BD%AE%E4%B8%AD%E7%9A%84%E4%B8%AD%E9%97%B4%E5%8F%98%E9%87%8F)
    - [Ignore some fields in the base configs](#%E5%BF%BD%E7%95%A5%E5%9F%BA%E7%A1%80%E9%85%8D%E7%BD%AE%E4%B8%AD%E7%9A%84%E5%AD%97%E6%AE%B5)
    - [Use some fields in the base configs](#%E4%BD%BF%E7%94%A8%E5%9F%BA%E7%A1%80%E9%85%8D%E7%BD%AE%E4%B8%AD%E7%9A%84%E5%AD%97%E6%AE%B5)
  - [Modify configs through script arguments](#%E9%80%9A%E8%BF%87%E8%84%9A%E6%9C%AC%E5%8F%82%E6%95%B0%E4%BF%AE%E6%94%B9%E9%85%8D%E7%BD%AE)
  - [Import user-defined modules](#%E5%AF%BC%E5%85%A5%E7%94%A8%E6%88%B7%E5%AE%9A%E4%B9%89%E6%A8%A1%E5%9D%97)
<!-- TOC -->
@ -36,43 +36,53 @@ MMSelfSup mainly uses python files as configs. Our config file system
- `data info`: data information: dataset name, input size and so on, e.g. imagenet, cifar, etc.
### Algorithm information
```
{algorithm}-{misc}
```
`Algorithm` denotes the algorithm abbreviation from the paper and its version. E.g.:

- `relative-loc`: different words are connected with dashes `'-'`
- `simclr`
- `mocov2`
`misc` provides some other algorithm-related information. E.g.:
- `npid-ensure-neg`
- `deepcluster-sobel`
### Module information
```
{backbone setting}-{neck setting}-{head_setting}
```
Module information mainly contains the backbone information. E.g.:
- `resnet50`
- `vit` (will be used in mocov3)

or some special settings that need to be emphasized in the config name. E.g.:

- `resnet50-nofrz`: the backbone does not freeze stages during some downstream task training
### Training information

Training-related settings, including batch size, lr schedule, data augmentation, etc.

- Batch size: the format is `{gpu x batch_per_gpu}`, e.g. `8xb32`;
- Training recipe: organized in the following order: `{pipeline aug}-{train aug}-{loss trick}-{scheduler}-{epochs}`.

E.g.:

- `8xb32-mcrop-2-6-coslr-200e`: `mcrop` is the multi-crop part of the pipeline proposed by SwAV; 2 and 6 mean that the two pipelines output 2 and 6 crops respectively, and the crop information is recorded in the data information;
- `8xb32-accum16-coslr-200e`: `accum16` means the weights are updated after accumulating gradients for 16 iterations.
### Data information

Data information contains the dataset, input size, etc. E.g.:

- `in1k`: the `ImageNet1k` dataset; the input image size is 224x224 by default
- `in1k-384px`: the input image size is 384x384
- `cifar10`
@ -80,17 +90,19 @@ MMSelfSup mainly uses python files as configs. Our config file system
- `places205`
### Config filename example
```
swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py
```
- `swav`: algorithm information
- `resnet50`: module information
- `8xb32-mcrop-2-6-coslr-200e`: training information
  - `8xb32`: 8 GPUs in total, and the batch size on each GPU is 32
  - `mcrop-2-6`: use the multi-crop data augmentation
  - `coslr`: use the cosine learning rate scheduler
  - `200e`: train the model for 200 epochs
- `in1k-224-96`: data information; trained on the ImageNet1k dataset with input sizes of 224x224 and 96x96
### Checkpoint naming convention
@ -114,6 +126,7 @@ swav_resnet50_8xb32-mcrop-2-6-coslr-200e_in1k-224-96.py
For ease of understanding, we use MoCo v2 as an example and comment on each line of its config. For more details, please refer to the API docs.

The config file `configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py` is shown below.
```python
_base_ = [
'../_base_/models/mocov2.py',  # model
@ -134,6 +147,7 @@ checkpoint_config = dict(interval=10, max_keep_ckpts=3)
```
`../_base_/models/mocov2.py` is the base model config of MoCo v2.
```python
model = dict(
type='MoCo',  # algorithm name
@ -158,6 +172,7 @@ model = dict(
```
`../_base_/datasets/imagenet_mocov2.py` is the base dataset config of MoCo v2.
```python
# dataset settings
data_source = 'ImageNet'  # data source name
@ -210,6 +225,7 @@ data = dict(
```
`../_base_/schedules/sgd_coslr-200e_in1k.py` is the base schedule config of MoCo v2.
```python
# optimizer
optimizer = dict(
@ -232,7 +248,9 @@ runner = dict(
max_epochs=200)  # max_epochs of the Runner: the total epochs of the workflow; use `max_iters` for IterBasedRunner
```
`../_base_/default_runtime.py` is the default runtime config.
```python
# checkpoint saving
checkpoint_config = dict(interval=10)  # the saving interval is 10
@ -300,7 +318,6 @@ data = dict(
))
```
### Ignore fields in the base configs

Sometimes, you need to set `_delete_=True` to ignore some fields in the base configs. You can refer to [mmcv](https://mmcv.readthedocs.io/zh_CN/latest/understand_mmcv/config.html#inherit-from-base-config-with-ignored-fields) for more instructions.
@ -319,6 +336,7 @@ model = dict(
out_channels=128,
with_avg_pool=True))
```
### Use fields in the base configs

Sometimes, you may reference fields in the `_base_` config to avoid duplicated definitions. You can refer to [mmcv](https://mmcv.readthedocs.io/zh_CN/latest/understand_mmcv/config.html#reference-variables-from-base) for more instructions; a minimal sketch follows.
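A minimal sketch of such a reference, assuming the base config defines a `data_source` variable (the names are illustrative, not from this repo); mmcv replaces `{{_base_.xxx}}` with the value from the base config at parse time:

```python
_base_ = './mocov2_resnet50_8xb32-coslr-200e_in1k.py'

# reuse the data source declared in the base config instead of repeating it
data = dict(train=dict(data_source={{_base_.data_source}}))
```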
@ -372,8 +390,7 @@ checkpoint_config = dict(interval=10, max_keep_ckpts=3)
- Update values of lists/tuples

  If the value to be updated is a list or a tuple, e.g., a config file usually sets `workflow=[('train', 1)]` and you want to change this key, you may specify `--cfg-options workflow="[(train,1),(val,1)]"`. Note that the quotation mark " is necessary for list/tuple data types, and **NO** white space is allowed inside the quotation marks when specifying the value. An example launch is sketched below.
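For instance, a launch that overrides the workflow from the command line could look like this (reusing the MoCo v2 config from earlier as an illustrative target):

```shell
python tools/train.py configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py \
    --cfg-options workflow="[(train,1),(val,1)]"
```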
## Import user-defined modules
@ -381,7 +398,7 @@ checkpoint_config = dict(interval=10, max_keep_ckpts=3)
Beginners may skip this part; it is only used when employing other MM-codebases, e.g. building your own project with mmcls as a third-party library.
```
You may use other MM-codebases to complete your project and create new classes of datasets, models, data augmentations, etc. in it. To simplify the code, you can use an MM-codebase as a third-party library: just keep your own extra code and import your customized modules in the config files. You can refer to the examples in the [OpenMMLab Algorithm Competition Project](https://github.com/zhangrui-wolf/openmmlab-competition-2021).

Add the following code to your own config file:
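The code itself is truncated in this diff; in mmcv-based configs it is typically the `custom_imports` field (the module name below is a placeholder):

```python
custom_imports = dict(
    imports=['your_project.your_module'],  # hypothetical module holding your registered classes
    allow_failed_imports=False)
```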

View File

@ -2,11 +2,11 @@
In this tutorial, we introduce the basic steps to create your customized dataset:
- [Tutorial 1: Adding New Dataset](#%E6%95%99%E7%A8%8B-1-%E6%B7%BB%E5%8A%A0%E6%96%B0%E7%9A%84%E6%95%B0%E6%8D%AE%E6%A0%BC%E5%BC%8F)
  - [An example of customized dataset](#%E8%87%AA%E5%AE%9A%E4%B9%89%E6%95%B0%E6%8D%AE%E6%A0%BC%E5%BC%8F%E7%A4%BA%E4%BE%8B)
  - [Creating the `DataSource` subclass](#%E5%88%9B%E5%BB%BA-datasource-%E5%AD%90%E7%B1%BB)
  - [Creating the `Dataset` subclass](#%E5%88%9B%E5%BB%BA-dataset-%E5%AD%90%E7%B1%BB)
  - [Modify the config file](#%E4%BF%AE%E6%94%B9%E9%85%8D%E7%BD%AE%E6%96%87%E4%BB%B6)
If your algorithm does not need any customized dataset, you can use these off-the-shelf datasets under [datasets](../../mmselfsup/datasets). But to use these existing datasets, you have to convert your dataset to the existing dataset formats first.

View File

@ -1,8 +1,8 @@
# Tutorial 2: Customize Data Pipelines

- [Tutorial 2: Customize Data Pipelines](#%E6%95%99%E7%A8%8B-2-%E8%87%AA%E5%AE%9A%E4%B9%89%E6%95%B0%E6%8D%AE%E7%AE%A1%E9%81%93)
  - [Overview of `Pipeline`](#Pipeline-%E6%A6%82%E8%A7%88)
  - [Creating new augmentations in `Pipeline`](#%E5%9C%A8-Pipeline-%E4%B8%AD%E5%88%9B%E5%BB%BA%E6%96%B0%E7%9A%84%E6%95%B0%E6%8D%AE%E5%A2%9E%E5%BC%BA)
## Overview of `Pipeline`

View File

@ -1,10 +1,10 @@
# Tutorial 3: Adding New Modules

- [Tutorial 3: Adding New Modules](#%E6%95%99%E7%A8%8B-3-%E6%B7%BB%E5%8A%A0%E6%96%B0%E7%9A%84%E6%A8%A1%E5%9D%97)
  - [Add a new backbone](#%E6%B7%BB%E5%8A%A0%E6%96%B0%E7%9A%84-backbone)
  - [Add new necks](#%E6%B7%BB%E5%8A%A0%E6%96%B0%E7%9A%84-Necks)
  - [Add a new loss](#%E6%B7%BB%E5%8A%A0%E6%96%B0%E7%9A%84%E6%8D%9F%E5%A4%B1)
  - [Combine all changes](#%E5%90%88%E5%B9%B6%E6%89%80%E6%9C%89%E6%94%B9%E5%8A%A8)
In the field of self-supervised learning, each model can be divided into the following four parts:

View File

@ -1,16 +1,16 @@
# Tutorial 4: Customize Schedule

- [Tutorial 4: Customize Schedule](#%E6%95%99%E7%A8%8B-4%EF%BC%9A%E8%87%AA%E5%AE%9A%E4%B9%89%E4%BC%98%E5%8C%96%E7%AD%96%E7%95%A5)
  - [Construct built-in optimizers of PyTorch](#%E6%9E%84%E9%80%A0-pytorch-%E5%86%85%E7%BD%AE%E4%BC%98%E5%8C%96%E5%99%A8)
  - [Customize learning rate schedules](#%E5%AE%9A%E5%88%B6%E5%AD%A6%E4%B9%A0%E7%8E%87%E8%B0%83%E6%95%B4%E7%AD%96%E7%95%A5)
    - [Customize learning rate decay curves](#%E5%AE%9A%E5%88%B6%E5%AD%A6%E4%B9%A0%E7%8E%87%E8%A1%B0%E5%87%8F%E6%9B%B2%E7%BA%BF)
    - [Customize learning rate warmup strategies](#%E5%AE%9A%E5%88%B6%E5%AD%A6%E4%B9%A0%E7%8E%87%E9%A2%84%E7%83%AD%E7%AD%96%E7%95%A5)
    - [Customize momentum schedules](#%E5%AE%9A%E5%88%B6%E5%8A%A8%E9%87%8F%E8%B0%83%E6%95%B4%E7%AD%96%E7%95%A5)
  - [Parameter-wise fine-grained configuration](#%E5%8F%82%E6%95%B0%E5%8C%96%E7%B2%BE%E7%BB%86%E9%85%8D%E7%BD%AE)
  - [Gradient clipping and gradient accumulation](#%E6%A2%AF%E5%BA%A6%E8%A3%81%E5%89%AA%E4%B8%8E%E6%A2%AF%E5%BA%A6%E7%B4%AF%E8%AE%A1)
    - [Gradient clipping](#%E6%A2%AF%E5%BA%A6%E8%A3%81%E5%89%AA)
    - [Gradient accumulation](#%E6%A2%AF%E5%BA%A6%E7%B4%AF%E8%AE%A1)
  - [Customize self-implemented optimization methods](#%E7%94%A8%E6%88%B7%E8%87%AA%E5%AE%9A%E4%B9%89%E4%BC%98%E5%8C%96%E6%96%B9%E6%B3%95)
In this tutorial, we introduce how to construct optimizers, customize learning rate and momentum schedules, set up parameter-wise fine-grained configuration, and apply gradient clipping, gradient accumulation, and self-implemented optimization methods when running customized models.

View File

@ -1,17 +1,17 @@
# Tutorial 5: Customize Runtime Settings

- [Tutorial 5: Customize Runtime Settings](#%E6%95%99%E7%A8%8B-5%EF%BC%9A%E8%87%AA%E5%AE%9A%E4%B9%89%E6%A8%A1%E5%9E%8B%E8%BF%90%E8%A1%8C%E5%8F%82%E6%95%B0)
  - [Customize workflow](#%E5%AE%9A%E5%88%B6%E5%B7%A5%E4%BD%9C%E6%B5%81)
  - [Hooks](#%E9%92%A9%E5%AD%90)
    - [Default training hooks](#%E9%BB%98%E8%AE%A4%E8%AE%AD%E7%BB%83%E9%92%A9%E5%AD%90)
      - [Checkpoint hook (CheckpointHook)](#%E6%9D%83%E9%87%8D%E6%96%87%E4%BB%B6%E9%92%A9%E5%AD%90-checkpointhook)
      - [Logger hooks (LoggerHooks)](#%E6%97%A5%E5%BF%97%E9%92%A9%E5%AD%90-loggerhooks)
      - [Evaluation hook (EvalHook)](#%E9%AA%8C%E8%AF%81%E9%92%A9%E5%AD%90-evalhook)
    - [Use other built-in hooks](#%E4%BD%BF%E7%94%A8%E5%85%B6%E4%BB%96%E5%86%85%E7%BD%AE%E9%92%A9%E5%AD%90)
  - [Customize self-implemented hooks](#%E8%87%AA%E5%AE%9A%E4%B9%89%E9%92%A9%E5%AD%90)
    - [1. Create a new hook](#1-%E5%88%9B%E5%BB%BA%E4%B8%80%E4%B8%AA%E6%96%B0%E9%92%A9%E5%AD%90)
    - [2. Import the new hook](#2-%E5%AF%BC%E5%85%A5%E6%96%B0%E9%92%A9%E5%AD%90)
    - [3. Modify the config](#3-%E4%BF%AE%E6%94%B9%E9%85%8D%E7%BD%AE)
In this tutorial, we introduce how to customize workflows and hooks when running your own models.
@ -144,7 +144,6 @@ evaluation = dict(interval=1, start=200, metric='accuracy', metric_options={'top
- [ProfilerHook](https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/profiler.py)
- ......
If the hook you want is already implemented in MMCV, you can directly modify the config to use it, as below:
```python
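# The original snippet is truncated in this diff. A typical form, using one of
# the MMCV hooks listed above (arguments omitted for brevity), would be:
custom_hooks = [
    dict(type='ProfilerHook')
]
```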

View File

@ -2,24 +2,25 @@
In MMSelfSup, we provide many benchmarks to evaluate models on different downstream tasks. Here are comprehensive tutorials and examples that explain how to run all the benchmarks with MMSelfSup.
- [Tutorial 6: Run Benchmarks](#%E6%95%99%E7%A8%8B6%EF%BC%9A%E8%BF%90%E8%A1%8C%E5%9F%BA%E5%87%86%E8%AF%84%E6%B5%8B)
  - [Classification](#%E5%88%86%E7%B1%BB)
    - [VOC SVM / Low-shot SVM](#voc-svm--low-shot-svm)
    - [Linear evaluation](#%E7%BA%BF%E6%80%A7%E8%AF%84%E4%BC%B0)
    - [ImageNet semi-supervised classification](#imagenet%E5%8D%8A%E7%9B%91%E7%9D%A3%E5%88%86%E7%B1%BB)
    - [ImageNet nearest-neighbor classification](#imagenet%E6%9C%80%E9%82%BB%E8%BF%91%E5%88%86%E7%B1%BB)
  - [Detection](#%E6%A3%80%E6%B5%8B)
  - [Segmentation](#%E5%88%86%E5%89%B2)
First, you are supposed to extract your backbone weights with `tools/model_converters/extract_backbone_weights.py`:
```shell
python ./tools/model_converters/extract_backbone_weights.py {CHECKPOINT} {MODEL_FILE}
```
Arguments:

- `CHECKPOINT`: the checkpoint file of a selfsup method named as epoch\_\*.pth.
- `MODEL_FILE`: the output backbone weights file. If not specified, the `PRETRAIN` below uses this extracted model file.
## Classification
@ -31,7 +32,6 @@ python ./tools/model_converters/extract_backbone_weights.py {CHECKPOINT} {MODEL_
To evaluate the pre-trained models, you can run the command below.
```shell
# distributed version
bash tools/benchmarks/classification/svm_voc07/dist_test_svm_pretrain.sh ${SELFSUP_CONFIG} ${GPUS} ${PRETRAIN} ${FEATURE_LIST}
@ -50,9 +50,10 @@ bash tools/benchmarks/classification/svm_voc07/dist_test_svm_epoch.sh ${SELFSUP_
bash tools/benchmarks/classification/svm_voc07/slurm_test_svm_epoch.sh ${PARTITION} ${JOB_NAME} ${SELFSUP_CONFIG} ${EPOCH} ${FEATURE_LIST}
```
**To test with a ckpt, the code uses the epoch\_\*.pth file; there is no need to extract the weights.**
Remarks:

- `${SELFSUP_CONFIG}` is the config file of the self-supervised experiment.
- `${FEATURE_LIST}` is a string specifying which of the features from layer1 to layer5 to evaluate; e.g., to evaluate layer5 only, `FEATURE_LIST` is "feat5"; to evaluate all features, `FEATURE_LIST` is "feat1 feat2 feat3 feat4 feat5" (separated by spaces). If left empty, `FEATURE_LIST` defaults to "feat5".
- `PRETRAIN`: the pre-trained model file (see the example launch below).
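A concrete launch could look like the following (the checkpoint path is illustrative):

```shell
bash tools/benchmarks/classification/svm_voc07/dist_test_svm_pretrain.sh \
    configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py \
    8 work_dirs/mocov2_backbone.pth "feat5"
```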
@ -63,7 +64,6 @@ bash tools/benchmarks/classification/svm_voc07/slurm_test_svm_epoch.sh ${PARTITI
Linear evaluation is one of the most general benchmarks. We integrate the config settings of several papers and also support multi-head linear evaluation. The classification models with the multi-head capability are implemented in our own codebase, so to run linear evaluation we still use `.sh` scripts to launch training. The supported datasets are **ImageNet**, **Places205** and **iNaturalist18**.
```shell
# distributed version
bash tools/benchmarks/classification/dist_train_linear.sh ${CONFIG} ${PRETRAIN}
@ -73,6 +73,7 @@ bash tools/benchmarks/classification/slurm_train_linear.sh ${PARTITION} ${JOB_NA
```
Remarks:

- The default GPU number is 8. When changing GPUS, please also change `samples_per_gpu` in the config file accordingly to keep the total batch size at 256.
- `CONFIG`: use config files under `configs/benchmarks/classification/`, i.e. `imagenet` (excluding the `imagenet_*percent` folders), `places205` and `inaturalist2018`. An example launch is sketched below.
- `PRETRAIN`: the pre-trained model file.
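For example (the benchmark config name is an assumption; pick any file under the folders above):

```shell
bash tools/benchmarks/classification/dist_train_linear.sh \
    configs/benchmarks/classification/imagenet/resnet50_8xb32-steplr-100e_in1k.py \
    work_dirs/mocov2_backbone.pth
```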
@ -90,6 +91,7 @@ bash tools/benchmarks/classification/slurm_train_semi.sh ${PARTITION} ${JOB_NAME
```
Remarks:

- The default GPU number is 4.
- `CONFIG`: use config files under `configs/benchmarks/classification/imagenet/`, i.e. the `imagenet_*percent` folders.
- `PRETRAIN`: the pre-trained model file.
@ -116,15 +118,17 @@ bash tools/benchmarks/classification/knn_imagenet/dist_test_knn_epoch.sh ${SELFS
bash tools/benchmarks/classification/knn_imagenet/slurm_test_knn_epoch.sh ${PARTITION} ${JOB_NAME} ${SELFSUP_CONFIG} ${EPOCH}
```
**To test with a ckpt, the code uses the epoch\_\*.pth file; there is no need to extract the weights.**
Remarks:

- `${SELFSUP_CONFIG}` is the config file of the self-supervised experiment.
- `PRETRAIN`: the pre-trained model file.
- If you want to change the GPU number, you can add `GPUS_PER_NODE=4 GPUS=4` at the beginning of the command, as sketched below.
- `EPOCH` is the epoch number of the ckpt you want to test.
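For example, evaluating the epoch-100 ckpt with 4 GPUs might look like this (the epoch number is illustrative):

```shell
GPUS_PER_NODE=4 GPUS=4 bash tools/benchmarks/classification/knn_imagenet/dist_test_knn_epoch.sh \
    configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py 100
```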
## Detection

Here, we prefer to use MMDetection for the detection task. First, make sure you have installed [MIM](https://github.com/open-mmlab/mim), which is also a project of OpenMMLab.
```shell
@ -146,13 +150,13 @@ bash tools/benchmarks/mmdetection/mim_slurm_train.sh ${PARTITION} ${CONFIG} ${PR
```
Remarks:

- `CONFIG`: use config files under `configs/benchmarks/mmdetection/` or write your own.
- `PRETRAIN`: the pre-trained model file.
Or if you want to do detection tasks with [detectron2](https://github.com/facebookresearch/detectron2), we also provide some config files.
Please refer to [INSTALL.md](https://github.com/facebookresearch/detectron2/blob/main/INSTALL.md) for installation, and follow the [directory structure](https://github.com/facebookresearch/detectron2/tree/main/datasets) to prepare the datasets required by detectron2.
```shell
conda activate detectron2  # use the detectron2 environment here, otherwise use the open-mmlab environment
cd benchmarks/detection
```
@ -184,3 +188,4 @@ bash tools/benchmarks/mmsegmentation/mim_slurm_train.sh ${PARTITION} ${CONFIG} $
Remarks:

- `CONFIG`: use config files under `configs/benchmarks/mmsegmentation/` or write your own.
- `PRETRAIN`: the pre-trained model file.

View File

@ -1,5 +1,5 @@
# Copyright (c) OpenMMLab. All rights reserved.
from .cosine_annealing_hook import StepFixCosineAnnealingLrUpdaterHook
from .deepcluster_hook import DeepClusterHook
from .densecl_hook import DenseCLHook
from .momentum_update_hook import MomentumUpdateHook

View File

@ -61,29 +61,30 @@ class MultiheadAttention(_MultiheadAttention):
            proj_bias=proj_bias,
            init_cfg=init_cfg)
        del self.out_drop

        self.qkv_bias = qkv_bias
        if not self.qkv_bias:
            self._init_qv_bias()
        self.qkv = nn.Linear(
            self.input_dims, embed_dims * 3, bias=self.qkv_bias)

    def _init_qv_bias(self):
        self.q_bias = nn.Parameter(torch.zeros(self.embed_dims))
        self.v_bias = nn.Parameter(torch.zeros(self.embed_dims))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # qkv bias is different from that in mmcls
        B, N, _ = x.shape
        if not self.qkv_bias:
            k_bias = torch.zeros_like(self.v_bias, requires_grad=False)
            qkv_bias = torch.cat((self.q_bias, k_bias, self.v_bias))
            qkv = F.linear(x, weight=self.qkv.weight, bias=qkv_bias)
        else:
            qkv = self.qkv(x)
        qkv = qkv.reshape(B, N, 3, self.num_heads, -1).permute(2, 0, 3, 1, 4)
        q, k, v = qkv[0], qkv[1], qkv[2]
        attn = (q @ k.transpose(-2, -1)) * self.scale
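Note on the branch above: when `qkv_bias=False`, the fused `qkv` projection is created without a bias, and at forward time the learnable `q_bias` and `v_bias` are concatenated with a frozen zero `k_bias`, so only the key projection is effectively bias-free. This differs from the mmcls implementation, as the comment notes, and matches the CAE-style attention this fix targets (see the `qkv_bias=False` setting in the CAE config further below).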

View File

@ -1,6 +1,6 @@
# Copyright (c) Open-MMLab. All rights reserved.
__version__ = '0.9.1'
def parse_version_info(version_str):

View File

@ -13,6 +13,5 @@ no_lines_before = STDLIB,LOCALFOLDER
default_section = THIRDPARTY
[codespell]
skip = *.ipynb
quiet-level = 3
ignore-words-list = nd,ty,ba

View File

@ -7,7 +7,8 @@ import torch
from mmselfsup.models.algorithms import CAE
# model settings
backbone = dict(
    type='CAEViT', arch='b', patch_size=16, init_values=0.1, qkv_bias=False)
neck = dict(
type='CAENeck',
patch_size=16,

View File

@ -0,0 +1,38 @@
# Copyright (c) OpenMMLab. All rights reserved.
from unittest.mock import MagicMock

from mmselfsup.core.hooks import StepFixCosineAnnealingLrUpdaterHook


def test_cosine_annealing_hook():
    lr_config = dict(
        min_lr=1e-5,
        warmup='linear',
        warmup_iters=10,
        warmup_ratio=1e-4,
        warmup_by_epoch=True,
        by_epoch=False)
    lr_annealing_hook = StepFixCosineAnnealingLrUpdaterHook(**lr_config)
    lr_annealing_hook.regular_lr = [1.0]
    lr_annealing_hook.warmup_iters = 10

    # test get_warmup_lr
    lr = lr_annealing_hook.get_warmup_lr(1)
    assert isinstance(lr, list)

    # test get_lr
    # by_epoch = False
    runner = MagicMock()
    runner.iter = 10
    runner.max_iters = 1000
    lr = lr_annealing_hook.get_lr(runner, 1.5)
    assert isinstance(lr, float)

    # by_epoch = True
    lr_annealing_hook.by_epoch = True
    runner.epoch = 10
    runner.max_epochs = 10
    runner.data_loader = MagicMock()
    runner.data_loader.__len__ = MagicMock(return_value=10)
    lr = lr_annealing_hook.get_lr(runner, 1.5)
    assert isinstance(lr, float)
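To run this unit test locally, an invocation like `pytest -k test_cosine_annealing_hook` from the repository root should pick it up (the exact test file path may differ).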