Ma Zerun 076ee10cac [Feature] Add swin-transformer model. (#271)	2021-07-01 09:30:42 +08:00

* Add swin transformer archs S, B and L.
* Add SwinTransformer configs
* Add train config files of swin.
* Align init method with original code
* Use nn.Unfold to merge patch
* Change all ConfigDict to dict
* Add init_cfg for all subclasses of BaseModule.
* Use mmcv version init function
* Add Swin README
* Use safer cfg copy method
* Improve docstring and variable name.
* Fix some difference in randaug. Fix BGR bug, align scheduler config. Fix label smoothing parameter difference.
* Fix missing droppath in attn
* Fix bug of relative position table if window width is not equal to height.
* Make `PatchMerging` more general, support kernel, stride, padding and dilation.
* Rename `residual` to `identity` in attention and FFN.
* Add `auto_pad` option to auto pad feature map
* Improve docstring.
* Fix bug in ShiftWMSA padding.
* Remove unused `key` and `value` in ShiftWMSA
* Move `PatchMerging` into utils and use common `PatchEmbed`.
* Use latest `LinearClsHead`, train augments and label smooth settings. And remove original `SwinLinearClsHead`.
* Mark some configs as "Evalution Only".
* Remove useless comment in config
* 1. Move ShiftWindowMSA and WindowMSA to `utils/attention.py` 2. Add docstrings of each module. 3. Fix some variables' names. 4. Other small improvement.
* Add unit tests of swin-transformer and patchmerging.
* Fix some bugs in unit tests.
* Fix bug of rel_position_index if window is not square.
* Make WindowMSA implicit, and add unit tests.
* Add metafile.yml, update readme and model_zoo.
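
Two of the items above concern the relative position table and `rel_position_index` for non-square windows. The commit message does not say what the bug was, so the following is only a minimal pure-Python sketch of how such an index is commonly built for a window of height `window_h` and width `window_w`. The function names are hypothetical and are not taken from the repository. The point it illustrates is that the vertical offset has to be scaled by the number of distinct horizontal offsets, `2 * window_w - 1`, which differs from `2 * window_h - 1` as soon as the window is not square.

```python
def bias_table_size(window_h, window_w):
    """Number of distinct (dh, dw) offsets between two tokens in one window."""
    return (2 * window_h - 1) * (2 * window_w - 1)


def relative_position_index(window_h, window_w):
    """Map every (query, key) token pair to a row of the bias table.

    Returns a nested list of shape (N, N) with N = window_h * window_w.
    """
    coords = [(h, w) for h in range(window_h) for w in range(window_w)]
    index = []
    for h1, w1 in coords:
        row = []
        for h2, w2 in coords:
            dh = h1 - h2 + window_h - 1  # shifted into [0, 2 * window_h - 2]
            dw = w1 - w2 + window_w - 1  # shifted into [0, 2 * window_w - 2]
            # Scale dh by the count of horizontal offsets, not vertical ones.
            row.append(dh * (2 * window_w - 1) + dw)
        index.append(row)
    return index


if __name__ == "__main__":
    idx = relative_position_index(2, 3)
    flat = [value for row in idx for value in row]
    print(len(idx), "tokens,", len(set(flat)), "distinct offsets,",
          "table size", bias_table_size(2, 3))
```

For a square window the two scale factors coincide, which is why a mistake of this kind would only show up once height and width differ.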
README.md	[Feature] Add swin-transformer model. (#271)	2021-07-01 09:30:42 +08:00
metafile.yml	[Feature] Add swin-transformer model. (#271)	2021-07-01 09:30:42 +08:00
swin_base_224_imagenet.py	[Feature] Add swin-transformer model. (#271)	2021-07-01 09:30:42 +08:00
swin_base_384_imagenet.py	[Feature] Add swin-transformer model. (#271)	2021-07-01 09:30:42 +08:00
swin_large_224_imagenet.py	[Feature] Add swin-transformer model. (#271)	2021-07-01 09:30:42 +08:00
swin_large_384_imagenet.py	[Feature] Add swin-transformer model. (#271)	2021-07-01 09:30:42 +08:00
swin_small_224_imagenet.py	[Feature] Add swin-transformer model. (#271)	2021-07-01 09:30:42 +08:00
swin_tiny_224_imagenet.py	[Feature] Add swin-transformer model. (#271)	2021-07-01 09:30:42 +08:00

README.md

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

Introduction

[ALGORITHM]

@article{liu2021Swin,
  title={Swin Transformer: Hierarchical Vision Transformer using Shifted Windows},
  author={Liu, Ze and Lin, Yutong and Cao, Yue and Hu, Han and Wei, Yixuan and Zhang, Zheng and Lin, Stephen and Guo, Baining},
  journal={arXiv preprint arXiv:2103.14030},
  year={2021}
}

Pretrained models

The pre-trained models are converted from the model zoo of Swin Transformer.

ImageNet 1k

Model	Pretrain	Resolution	Params (M)	FLOPs (G)	Top-1 (%)	Top-5 (%)	Download
Swin-T	ImageNet-1k	224x224	28.29	4.36	81.18	95.52	model
Swin-S	ImageNet-1k	224x224	49.61	8.52	83.21	96.25	model
Swin-B	ImageNet-1k	224x224	87.77	15.14	83.42	96.44	model
Swin-B	ImageNet-1k	384x384	87.90	44.49	84.49	96.95	model
Swin-B	ImageNet-22k	224x224	87.77	15.14	85.16	97.50	model
Swin-B	ImageNet-22k	384x384	87.90	44.49	86.44	98.05	model
Swin-L	ImageNet-22k	224x224	196.53	34.04	86.24	97.88	model
Swin-L	ImageNet-22k	384x384	196.74	100.04	87.25	98.25	model
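
In the table above, the 384x384 variants have slightly more parameters than their 224x224 counterparts (87.90 M vs 87.77 M for Swin-B, 196.74 M vs 196.53 M for Swin-L). A plausible explanation is the larger relative position bias table that comes with a larger attention window. The sketch below checks that arithmetic. It assumes window sizes of 7 (224 input) and 12 (384 input) and the depths and head counts from the Swin Transformer paper; none of these values are stated in this README, and the helper names are hypothetical.

```python
# Assumed architecture settings (from the Swin Transformer paper, not this README).
ARCHS = {
    "Swin-B": {"depths": (2, 2, 18, 2), "num_heads": (4, 8, 16, 32)},
    "Swin-L": {"depths": (2, 2, 18, 2), "num_heads": (6, 12, 24, 48)},
}


def bias_table_params(depths, num_heads, window_size):
    """Total entries over all relative position bias tables (square window)."""
    entries_per_head = (2 * window_size - 1) ** 2
    return sum(d * h * entries_per_head for d, h in zip(depths, num_heads))


def extra_params_m(arch, small_window=7, large_window=12):
    """Extra parameters, in millions, when the window grows from small to large."""
    cfg = ARCHS[arch]
    small = bias_table_params(cfg["depths"], cfg["num_heads"], small_window)
    large = bias_table_params(cfg["depths"], cfg["num_heads"], large_window)
    return (large - small) / 1e6


if __name__ == "__main__":
    for name in ARCHS:
        print(f"{name}: +{extra_params_m(name):.3f} M parameters")
```

Under these assumptions the estimate is about 0.135 M for Swin-B and 0.203 M for Swin-L, roughly in line with the 0.13 M and 0.21 M gaps in the table given that the listed values are rounded to two decimals.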

Results and models

ImageNet

Model	Pretrain	Resolution	Params (M)	FLOPs (G)	Top-1 (%)	Top-5 (%)	Config	Download
Swin-T	ImageNet-1k	224x224	28.29	4.36	81.18	95.61	config	model | log
Swin-S	ImageNet-1k	224x224	49.61	8.52	83.02	96.29	config	model | log
Swin-B	ImageNet-1k	224x224	87.77	15.14	83.36	96.44	config	model | log