mmpretrain/configs/vision_transformer/vit-large-p32_pt-64xb64_in1k-224.py

13 lines · 373 B · Python
Last commit: [Refactor] Refactor ViT (continue #295) (#395), 2021-10-18 16:07:00 +08:00

* [Squash] Refactor ViT (from #295).
* Use a base variable to simplify the auto-augment setting.
* Use the common `PatchEmbed`, remove `HybridEmbed`, and refactor the ViT init structure.
* Add an `output_cls_token` option and change the output format of ViT and the input format of the ViT head.
* Update unit tests and add a test for `output_cls_token`.
* Support `out_indices`.
* Standardize config files.
* Support resizing the position embedding.
* Add a README file for ViT.
* Rename config files.
* Improve the docs about ViT.
* Update docstrings.
* Use the local `MultiheadAttention` instead of the mmcv version.
* Fix `MultiheadAttention`.
* Support the `qk_scale` argument in `MultiheadAttention`.
* Improve docs, rename `layer_cfg` to `layer_cfgs`, and support sequences.
* Use `init_cfg` to initialize the `Linear` layer in `VisionTransformerHead`.
* Update the metafile.
* Update checkpoints and configs.
* Improve docstrings.
* Update the README.
* Revert the GAP modification.
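The commit above introduces `out_indices` and `output_cls_token` on the refactored `VisionTransformer` backbone. The snippet below is a minimal sketch of how those two options could be set in a backbone dict; it is not the repository's base config, and every argument name other than `out_indices` and `output_cls_token` (e.g. `arch`, `img_size`, `patch_size`) is an assumption about the mmcls API of that release.

# Hypothetical backbone dict illustrating the options added by this commit.
# Only `out_indices` and `output_cls_token` are taken from the commit message;
# the remaining argument names are assumptions, not copied from the repo.
backbone = dict(
    type='VisionTransformer',
    arch='l',               # ViT-Large (assumed arch alias)
    img_size=224,
    patch_size=32,
    out_indices=-1,         # return features of the last layer only
    output_cls_token=True)  # include the [CLS] token in the output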
# Inherit the model, dataset, schedule, and runtime settings from the base configs.
_base_ = [
    '../_base_/models/vit-large-p32.py',
    '../_base_/datasets/imagenet_bs64_pil_resize_autoaug.py',
    '../_base_/schedules/imagenet_bs4096_AdamW.py',
    '../_base_/default_runtime.py'
]

model = dict(
    # Width of the hidden layer in the ViT classification head.
    head=dict(hidden_dim=3072),
    # Apply mixup to every training batch with Beta(0.2, 0.2) mixing.
    train_cfg=dict(
        augments=dict(type='BatchMixup', alpha=0.2, num_classes=1000,
                      prob=1.)))
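The `train_cfg.augments` entry above enables batch-level mixup during training. As a reference for what `BatchMixup` does, here is a minimal NumPy sketch of the mixup idea: each image and its one-hot label are blended with a randomly permuted partner from the same batch using a Beta(alpha, alpha) mixing weight. This is an illustrative re-implementation, not the mmcls code.

# Minimal NumPy sketch of mixup (illustrative; not the mmcls `BatchMixup` class).
import numpy as np

def batch_mixup(images, one_hot_labels, alpha=0.2):
    # Draw a mixing coefficient lambda ~ Beta(alpha, alpha).
    lam = np.random.beta(alpha, alpha)
    # Pair every sample with another sample from the same (shuffled) batch.
    perm = np.random.permutation(len(images))
    # Convex combination of the inputs and of their one-hot targets.
    mixed_images = lam * images + (1 - lam) * images[perm]
    mixed_labels = lam * one_hot_labels + (1 - lam) * one_hot_labels[perm]
    return mixed_images, mixed_labels

With alpha=0.2 the Beta distribution concentrates near 0 and 1, so most mixed samples stay close to one of the two originals; prob=1. applies the augment to every batch.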