* [Squash] Refactor ViT (from #295) * Use base variable to simplify auto_aug setting * Use common PatchEmbed, remove HybridEmbed and refactor ViT init structure. * Add `output_cls_token` option and change the output format of ViT and input format of ViT head. * Update unit tests and add test for `output_cls_token`. * Support out_indices. * Standardize config files * Support resize position embedding. * Add readme file of vit * Rename config file * Improve docs about ViT. * Update docstring * Use local version `MultiheadAttention` instead of mmcv version. * Fix MultiheadAttention * Support `qk_scale` argument in `MultiheadAttention` * Improve docs and change `layer_cfg` to `layer_cfgs` and support sequence. * Use init_cfg to init Linear layer in VisionTransformerHead * update metafile * Update checkpoints and configs * Improve docstring. * Update README * Revert GAP modification. | ||
---|---|---|
augment | ||
__init__.py | ||
attention.py | ||
channel_shuffle.py | ||
embed.py | ||
helpers.py | ||
inverted_residual.py | ||
make_divisible.py | ||
se_layer.py |