# MAE
> [Masked Autoencoders Are Scalable Vision Learners](https://arxiv.org/abs/2111.06377)
<!-- [ALGORITHM] -->
## Abstract
This paper shows that masked autoencoders (MAE) are
scalable self-supervised learners for computer vision. Our
MAE approach is simple: we mask random patches of the
input image and reconstruct the missing pixels. It is based
on two core designs. First, we develop an asymmetric
encoder-decoder architecture, with an encoder that operates only on the
visible subset of patches (without mask tokens), along with a lightweight
decoder that reconstructs the original image from the latent representation
and mask tokens. Second, we find that masking a high proportion
of the input image, e.g., 75%, yields a nontrivial and
meaningful self-supervisory task. Coupling these two designs enables us to
train large models efficiently and effectively: we accelerate
training (by 3× or more) and improve accuracy. Our scalable approach allows
for learning high-capacity models that generalize well: e.g., a vanilla
ViT-Huge model achieves the best accuracy (87.8%) among
methods that use only ImageNet-1K data. Transfer performance in downstream tasks outperforms supervised pretraining and shows promising scaling behavior.
<div align="center">
<img src="https://user-images.githubusercontent.com/30762564/150733959-2959852a-c7bd-4d3f-911f-3e8d8839fe67.png" width="40%"/>
</div>
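
The core of the pipeline described above is per-sample random masking of patch tokens: the encoder only ever sees the kept subset. Below is a minimal PyTorch sketch of that step in the spirit of the paper; the function name, tensor shapes, and the default 75% ratio are illustrative, and this is not the exact mmselfsup implementation.

```python
import torch


def random_masking(x: torch.Tensor, mask_ratio: float = 0.75):
    """Randomly mask a fraction of patch tokens, independently per sample.

    x: patch embeddings of shape (N, L, D) -- batch, num patches, dim.
    Returns the visible tokens, a binary mask (1 = masked), and the
    indices needed to restore the original patch order in the decoder.
    """
    N, L, D = x.shape
    len_keep = int(L * (1 - mask_ratio))

    noise = torch.rand(N, L, device=x.device)  # uniform noise per patch

    # Sort the noise: patches with small noise are kept, the rest masked.
    ids_shuffle = torch.argsort(noise, dim=1)
    ids_restore = torch.argsort(ids_shuffle, dim=1)

    # Keep the first `len_keep` patches; only these go through the encoder.
    ids_keep = ids_shuffle[:, :len_keep]
    x_visible = torch.gather(x, 1, ids_keep.unsqueeze(-1).repeat(1, 1, D))

    # Binary mask in the original patch order: 0 = kept, 1 = masked.
    # The reconstruction loss is computed only on masked positions.
    mask = torch.ones(N, L, device=x.device)
    mask[:, :len_keep] = 0
    mask = torch.gather(mask, 1, ids_restore)

    return x_visible, mask, ids_restore
```

With `mask_ratio=0.75` the encoder processes only a quarter of the tokens, which is what makes pre-training large ViTs 3x or more faster.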
## Models and Benchmarks
Here, we report the results of the model pre-trained on ImageNet-1K for 400 epochs. The details are listed below:
| Backbone | Pre-train epoch | Fine-tuning Top-1 | Pre-train Config | Fine-tuning Config | Download |
| :------: | :-------------: | :---------------: | :-----------------------------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| ViT-B/16 | 400 | 83.1 | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mae/mae_vit-b-p16_8xb512-coslr-400e_in1k.py) | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/vit-b-p16_ft-8xb128-coslr-100e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/mae/mae_vit-base-p16_8xb512-coslr-400e_in1k-224_20220223-85be947b.pth) \| [log](https://download.openmmlab.com/mmselfsup/mae/mae_vit-base-p16_8xb512-coslr-300e_in1k-224_20220210_140925.log.json) |
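
As a quick sanity check, the released checkpoint can be downloaded and inspected with plain PyTorch. This is a sketch under one assumption: that the checkpoint follows the usual MM-series layout, with the weights nested under a `state_dict` key.

```python
import torch

# Checkpoint URL taken from the table above.
URL = ("https://download.openmmlab.com/mmselfsup/mae/"
       "mae_vit-base-p16_8xb512-coslr-400e_in1k-224_20220223-85be947b.pth")

ckpt = torch.hub.load_state_dict_from_url(URL, map_location="cpu")
# MM-series checkpoints usually nest weights under 'state_dict';
# fall back to the raw dict if that assumption does not hold.
weights = ckpt.get("state_dict", ckpt)
print(f"{len(weights)} tensors, e.g. {sorted(weights)[:3]}")
```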
## Citation
```bibtex
@article{He2021MaskedAA,
  title={Masked Autoencoders Are Scalable Vision Learners},
  author={Kaiming He and Xinlei Chen and Saining Xie and Yanghao Li and Piotr Doll{\'a}r and Ross B. Girshick},
  journal={ArXiv},
  year={2021}
}
```