mmpretrain/configs/deit/deit-tiny_4xb256_in1k.py

# In the small and tiny archs, drop path and the EMA hook are removed
# compared with the original config
_base_ = [
    '../_base_/datasets/imagenet_bs64_swin_224.py',
    '../_base_/schedules/imagenet_bs1024_adamw_swin.py',
    '../_base_/default_runtime.py'
]

# model settings
model = dict(
    type='ImageClassifier',
    backbone=dict(
        type='VisionTransformer',
        arch='deit-tiny',
        img_size=224,
        patch_size=16),
    neck=None,
    head=dict(
        type='VisionTransformerClsHead',
        num_classes=1000,
        in_channels=192,
        loss=dict(
            type='LabelSmoothLoss', label_smooth_val=0.1, mode='original'),
    ),
    init_cfg=[
        dict(type='TruncNormal', layer='Linear', std=.02),
        dict(type='Constant', layer='LayerNorm', val=1., bias=0.),
    ],
    train_cfg=dict(augments=[
        dict(type='Mixup', alpha=0.8),
        dict(type='CutMix', alpha=1.0)
    ]),
)

# data settings
train_dataloader = dict(batch_size=256)

# schedule settings
optim_wrapper = dict(
    paramwise_cfg=dict(
        norm_decay_mult=0.0,
        bias_decay_mult=0.0,
        custom_keys={
            '.cls_token': dict(decay_mult=0.0),
            '.pos_embed': dict(decay_mult=0.0)
        }),
    clip_grad=dict(max_norm=5.0),
)
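
The `paramwise_cfg` above switches off weight decay for normalization parameters, biases, and any parameter whose name contains `.cls_token` or `.pos_embed`. The following is a minimal sketch of how such multipliers could be resolved per parameter. It is a simplified illustration, not MMEngine's implementation: `resolve_decay_mult` is a hypothetical helper, the parameter names in the usage lines are made up for the example, and the sketch assumes custom keys match by substring and take priority over the norm and bias rules.

```python
# Simplified illustration only; `resolve_decay_mult` is a hypothetical helper,
# not part of MMEngine or MMPretrain.

paramwise_cfg = dict(
    norm_decay_mult=0.0,
    bias_decay_mult=0.0,
    custom_keys={
        '.cls_token': dict(decay_mult=0.0),
        '.pos_embed': dict(decay_mult=0.0),
    },
)


def resolve_decay_mult(param_name, is_norm_param, cfg):
    """Return the weight-decay multiplier for one parameter.

    Assumed precedence: custom keys (substring match) first, then the
    normalization-layer rule, then the bias rule, else the default of 1.0.
    """
    for key, rule in cfg.get('custom_keys', {}).items():
        if key in param_name:
            return rule.get('decay_mult', 1.0)
    if is_norm_param:
        return cfg.get('norm_decay_mult', 1.0)
    if param_name.endswith('.bias'):
        return cfg.get('bias_decay_mult', 1.0)
    return 1.0


# Illustrative parameter names (assumed, not read from a real model).
examples = [
    ('backbone.cls_token', False),
    ('backbone.pos_embed', False),
    ('backbone.layers.0.ln1.weight', True),
    ('backbone.layers.0.attn.qkv.bias', False),
    ('backbone.layers.0.attn.qkv.weight', False),
]
multipliers = {
    name: resolve_decay_mult(name, is_norm, paramwise_cfg)
    for name, is_norm in examples
}
```

Under these assumptions, only the last entry (an ordinary linear weight) keeps the full weight decay; every other example resolves to a multiplier of 0.0.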