mmpretrain/configs/_base_/schedules/imagenet_bs2048_AdamW.py

# optimizer
# In ClassyVision, the lr is set to 0.003 for bs4096.
# In this implementation (bs2048), lr = 0.003 / 4096 * (32 bs * 64 gpus) = 0.0015
optimizer = dict(type='AdamW', lr=0.0015, weight_decay=0.3)
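
The comment above scales the learning rate in proportion to the total batch size. The arithmetic can be sketched as below. The helper name `scale_lr` and its argument names are hypothetical, introduced only for illustration; they are not part of this repository.

```python
def scale_lr(ref_lr, ref_batch_size, samples_per_gpu, num_gpus):
    """Linearly rescale a reference learning rate to a new total batch size."""
    total_batch_size = samples_per_gpu * num_gpus  # 32 * 64 = 2048 in this config
    return ref_lr / ref_batch_size * total_batch_size


# Reference setting from the comment: lr 0.003 at batch size 4096.
lr = scale_lr(ref_lr=0.003, ref_batch_size=4096, samples_per_gpu=32, num_gpus=64)
assert abs(lr - 0.0015) < 1e-12
```

If the number of GPUs or the per-GPU batch size changes, the same rule would suggest rescaling `lr` accordingly.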

# specific to ViT pretraining: no weight decay on the class token and position embedding
paramwise_cfg = dict(
    custom_keys={
        '.backbone.cls_token': dict(decay_mult=0.0),
        '.backbone.pos_embed': dict(decay_mult=0.0)
    })
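
The effect of `decay_mult` can be illustrated with a small sketch. It assumes that a key in `custom_keys` applies to a parameter when the key is a substring of the parameter's full name, and that the resulting weight decay is the base value times `decay_mult`. The function `effective_weight_decay` and the parameter names in the checks are hypothetical, not taken from the library.

```python
def effective_weight_decay(param_name, base_weight_decay, custom_keys):
    """Scale base_weight_decay by the decay_mult of the first matching key.

    Assumption: a key matches when it is a substring of param_name.
    """
    for key, overrides in custom_keys.items():
        if key in param_name:
            return base_weight_decay * overrides.get('decay_mult', 1.0)
    return base_weight_decay


custom_keys = {
    '.backbone.cls_token': dict(decay_mult=0.0),
    '.backbone.pos_embed': dict(decay_mult=0.0),
}

# Hypothetical parameter names, chosen only to exercise both branches.
assert effective_weight_decay('model.backbone.cls_token', 0.3, custom_keys) == 0.0
assert effective_weight_decay('model.backbone.layers.0.ffn.weight', 0.3, custom_keys) == 0.3
```

Under these assumptions the class token and position embedding train with zero weight decay, while all other parameters keep the base value of 0.3.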
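
# learning policy
param_scheduler = [
    # linear warmup, stepped per iteration; the end value assumes
    # 10 epochs at 626 iterations per epoch
    dict(
        type='LinearLR',
        start_factor=1e-3,
        by_epoch=False,
        begin=0,
        end=10 * 626),
    # cosine annealing over the remaining 290 epochs, stepped per epoch
    # NOTE: eta_min is an absolute learning rate, and 1e-2 is larger than the
    # base lr of 0.0015 set above (the old policy below used min_lr=0).
    dict(
        type='CosineAnnealingLR',
        T_max=290,
        eta_min=1e-2,
        by_epoch=True,
        begin=10,
        end=300)
]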
# old learning policy
# lr_config = dict(
#     policy='CosineAnnealing',
#     min_lr=0,
#     warmup='linear',
#     warmup_iters=10000,
#     warmup_ratio=1e-4)

# train, val, test setting
train_cfg = dict(by_epoch=True, max_epochs=300)
val_cfg = dict(interval=1)  # validate every epoch
test_cfg = dict()
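
The two-phase learning policy in this file can be approximated in closed form, as sketched below. This is an illustration only: it does not reproduce the scheduler library's exact step accounting, and the function names `warmup_lr` and `cosine_lr` are hypothetical. The constants are copied from the config.

```python
import math

BASE_LR = 0.0015
ITERS_PER_EPOCH = 626  # taken from `end=10 * 626` in the config
WARMUP_EPOCHS = 10


def warmup_lr(iteration, start_factor=1e-3):
    """Linear warmup from start_factor * BASE_LR up to BASE_LR."""
    warmup_iters = WARMUP_EPOCHS * ITERS_PER_EPOCH
    progress = min(iteration / warmup_iters, 1.0)
    return BASE_LR * (start_factor + (1.0 - start_factor) * progress)


def cosine_lr(epoch, eta_min, t_max=290, begin=WARMUP_EPOCHS):
    """Cosine annealing from BASE_LR at `begin` to eta_min at `begin + t_max`."""
    progress = min(max(epoch - begin, 0) / t_max, 1.0)
    return eta_min + (BASE_LR - eta_min) * (1.0 + math.cos(math.pi * progress)) / 2.0


# Warmup starts at BASE_LR * 1e-3 and reaches BASE_LR after 10 epochs.
assert math.isclose(warmup_lr(0), 1.5e-6)
assert math.isclose(warmup_lr(WARMUP_EPOCHS * ITERS_PER_EPOCH), BASE_LR)

# The cosine phase starts at BASE_LR and ends at eta_min.
assert math.isclose(cosine_lr(10, eta_min=1e-2), BASE_LR)
assert math.isclose(cosine_lr(300, eta_min=1e-2), 1e-2)
assert math.isclose(cosine_lr(300, eta_min=0.0), 0.0, abs_tol=1e-12)
```

Under this closed form, the value configured here (`eta_min=1e-2`) makes the schedule end above the base learning rate of 0.0015, whereas `eta_min=0.0`, matching `min_lr=0` in the old policy, decays to zero.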

Blame history:
[Feature]Add Vit (#214), 2021-04-16 19:22:41 +08:00: optimizer comments and settings, "# learning policy" comment
[Bug]Fix weight decay (#227), 2021-04-28 17:16:43 +08:00: paramwise_cfg block for ViT pretraining
Refactor scheduler configuration, 2022-05-23 17:31:57 +08:00: param_scheduler, old learning policy comment, train/val/test settings