* [Squash] Refactor ViT (from #295)
* Use base variable to simplify auto_aug setting
* Use common PatchEmbed, remove HybridEmbed and refactor ViT init
structure.
* Add `output_cls_token` option and change the output format of ViT and
input format of ViT head.
* Update unit tests and add test for `output_cls_token`.
* Support out_indices.
* Standardize config files
* Support resize position embedding.
* Add README file for ViT
* Rename config file
* Improve docs about ViT.
* Update docstring
* Use local version `MultiheadAttention` instead of mmcv version.
* Fix MultiheadAttention
* Support `qk_scale` argument in `MultiheadAttention`
* Improve docs, rename `layer_cfg` to `layer_cfgs`, and support passing
a sequence.
* Use init_cfg to init Linear layer in VisionTransformerHead
* Update metafile
* Update checkpoints and configs
* Improve docstring.
* Update README
* Revert GAP modification.
* Add swin transformer archs S, B and L.
* Add SwinTransformer configs
* Add train config files of swin.
* Align init method with original code
* Use nn.Unfold to merge patch
* Change all ConfigDict to dict
* Add init_cfg for all subclasses of BaseModule.
* Use mmcv version init function
* Add Swin README
* Use safer cfg copy method
* Improve docstring and variable name.
* Fix some differences in randaug
Fix BGR bug, align scheduler config.
Fix label smoothing parameter difference.
* Fix missing droppath in attn
* Fix bug in the relative position table when the window width is not
equal to the height.
* Make `PatchMerging` more general, support kernel, stride, padding and
dilation.
* Rename `residual` to `identity` in attention and FFN.
* Add `auto_pad` option to auto pad feature map
* Improve docstring.
* Fix bug in ShiftWMSA padding.
* Remove unused `key` and `value` in ShiftWMSA
* Move `PatchMerging` into utils and use common `PatchEmbed`.
* Use latest `LinearClsHead`, train augments and label smooth settings.
And remove original `SwinLinearClsHead`.
* Mark some configs as "Evaluation Only".
* Remove useless comment in config
* 1. Move ShiftWindowMSA and WindowMSA to `utils/attention.py`
2. Add docstrings of each module.
3. Fix some variables' names.
4. Other small improvements.
* Add unit tests of swin-transformer and patchmerging.
* Fix some bugs in unit tests.
* Fix bug of rel_position_index if window is not square.
* Make WindowMSA implicit, and add unit tests.
* Add metafile.yml, update readme and model_zoo.
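
For reference, the non-square window fixes above concern how relative
offsets between tokens map into the bias table. The following is a minimal,
dependency-free sketch of that indexing, not the code in this commit: the
function name `relative_position_index` and its arguments are hypothetical,
and it assumes the usual Swin layout with a table of
`(2 * Wh - 1) * (2 * Ww - 1)` entries.

```python
def relative_position_index(window_h, window_w):
    """Illustrative sketch (hypothetical helper, not the repository code).

    Returns an N x N nested list, N = window_h * window_w, where entry
    [i][j] is the row of the relative position bias table used for the
    token pair (i, j).
    """
    # Token coordinates in row-major order.
    coords = [(h, w) for h in range(window_h) for w in range(window_w)]
    index = []
    for hi, wi in coords:
        row = []
        for hj, wj in coords:
            # Shift offsets so that they start from zero.
            dh = hi - hj + window_h - 1
            dw = wi - wj + window_w - 1
            # The row stride depends on the window WIDTH. Using a stride
            # derived from the height gives wrong indices as soon as
            # window_h != window_w.
            row.append(dh * (2 * window_w - 1) + dw)
        index.append(row)
    return index


idx = relative_position_index(2, 3)
```

With a 2 x 3 window, the indices should cover every entry of a table with
`(2 * 2 - 1) * (2 * 3 - 1)` rows, which is a quick sanity check for the
non-square case.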