# mmocr/configs/textrecog/master/_base_master_resnet31.py

file_client_args = dict(backend='disk')

dictionary = dict(
    type='Dictionary',
    dict_file='{{ fileDirname }}/../../../dicts/english_digits_symbols.txt',
    with_padding=True,
    with_unknown=True,
    same_start_end=True,
    with_start=True,
    with_end=True)

model = dict(
    type='MASTER',
    backbone=dict(
        type='ResNet',
        in_channels=3,
        stem_channels=[64, 128],
        block_cfgs=dict(
            type='BasicBlock',
            plugins=dict(
                cfg=dict(
                    type='GCAModule',
                    ratio=0.0625,
                    n_head=1,
                    pooling_type='att',
                    is_att_scale=False,
                    fusion_type='channel_add'),
                position='after_conv2')),
        arch_layers=[1, 2, 5, 3],
        arch_channels=[256, 256, 512, 512],
        strides=[1, 1, 1, 1],
        plugins=[
            dict(
                cfg=dict(type='Maxpool2d', kernel_size=2, stride=(2, 2)),
                stages=(True, True, False, False),
                position='before_stage'),
            dict(
                cfg=dict(type='Maxpool2d', kernel_size=(2, 1), stride=(2, 1)),
                stages=(False, False, True, False),
                position='before_stage'),
            dict(
                cfg=dict(
                    type='ConvModule',
                    kernel_size=3,
                    stride=1,
                    padding=1,
                    norm_cfg=dict(type='BN'),
                    act_cfg=dict(type='ReLU')),
                stages=(True, True, True, True),
                position='after_stage')
        ],
        init_cfg=[
            dict(type='Kaiming', layer='Conv2d'),
            dict(type='Constant', val=1, layer='BatchNorm2d'),
        ]),
    encoder=None,
    decoder=dict(
        type='MasterDecoder',
        d_model=512,
        n_head=8,
        attn_drop=0.,
        ffn_drop=0.,
        d_inner=2048,
        n_layers=3,
        feat_pe_drop=0.2,
        feat_size=6 * 40,
        postprocessor=dict(type='AttentionPostprocessor'),
        module_loss=dict(
            type='CEModuleLoss', reduction='mean', ignore_first_char=True),
        max_seq_len=30,
        dictionary=dictionary),
    data_preprocessor=dict(
        type='TextRecogDataPreprocessor',
        mean=[127.5, 127.5, 127.5],
        std=[127.5, 127.5, 127.5]))

train_pipeline = [
    dict(
        type='LoadImageFromFile',
        file_client_args=file_client_args,
        ignore_empty=True,
        min_size=2),
    dict(type='LoadOCRAnnotations', with_text=True),
    dict(
        type='RescaleToHeight',
        height=48,
        min_width=48,
        max_width=160,
        width_divisor=16),
    dict(type='PadToWidth', width=160),
    dict(
        type='PackTextRecogInputs',
        meta_keys=('img_path', 'ori_shape', 'img_shape', 'valid_ratio'))
]

test_pipeline = [
    dict(type='LoadImageFromFile', file_client_args=file_client_args),
    dict(
        type='RescaleToHeight',
        height=48,
        min_width=48,
        max_width=160,
        width_divisor=16),
    dict(type='PadToWidth', width=160),
    # Load annotations after the resizing transforms, because the ground
    # truth text does not need to be resized.
    dict(type='LoadOCRAnnotations', with_text=True),
    dict(
        type='PackTextRecogInputs',
        meta_keys=('img_path', 'ori_shape', 'img_shape', 'valid_ratio'))
]
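
# Sanity check for ``feat_size=6 * 40`` in the decoder config. This is a
# minimal sketch, not part of the original config. It assumes the stem and
# every stride-1 conv keep the spatial size, and that each ``Maxpool2d``
# plugin floor-divides height and width by its stride (as an unpadded
# ``torch.nn.MaxPool2d`` would). ``_backbone_output_hw`` and the other
# underscore-prefixed names are hypothetical helpers for illustration.
def _backbone_output_hw(height, width, pool_strides):
    """Return the (height, width) left after applying each pooling stride."""
    for stride_h, stride_w in pool_strides:
        height, width = height // stride_h, width // stride_w
    return height, width


# Pooling applied before stages 1, 2 and 3, read from ``backbone.plugins``.
_pool_strides = [(2, 2), (2, 2), (2, 1)]
# Input size after ``RescaleToHeight(height=48)`` and ``PadToWidth(width=160)``.
_feat_h, _feat_w = _backbone_output_hw(48, 160, _pool_strides)
# Under these assumptions the feature map is 6 x 40, matching ``feat_size``.
assert (_feat_h, _feat_w) == (6, 40)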