There are 4 basic component types under `config/_base_`, dataset, model, schedule, default_runtime.
Many methods could be easily constructed with one of each like DeepLabV3, PSPNet.
The configs that are composed by components from `_base_` are called _primitive_.
For all configs under the same folder, it is recommended to have only **one**_primitive_ config. All other configs should inherit from the _primitive_ config. In this way, the maximum of inheritance level is 3.
For example, if some modification is made base on DeepLabV3, user may first inherit the basic DeepLabV3 structure by specifying `_base_ = ../deeplabv3/deeplabv3_r50_512x1024_40ki_cityscapes.py`, then modify the necessary fields in the config files.
If you are building an entirely new method that does not share the structure with any of the existing methods, you may create a folder `xxxnet` under `configs`,
norm_cfg=dict(type='SyncBN', requires_grad=True), # The configuration of norm layer.
align_corners=False, # The align_corners argument for resize in decoding.
loss_decode=dict( # Config of loss function for the decode_head.
type='CrossEntropyLoss', # Type of loss used for segmentation.
use_sigmoid=False, # Whether use sigmoid activation for segmentation.
loss_weight=0.4))) # Loss weight of auxiliary head, which is usually 0.4 of decode head.
train_cfg = dict() # train_cfg is just a place holder for now.
test_cfg = dict(mode='whole') # The test mode, options are 'whole' and 'sliding'. 'whole': whole image fully-convolutional test. 'sliding': sliding crop window on the image.
dataset_type = 'CityscapesDataset' # Dataset type, this will be used to define the dataset.
data_root = 'data/cityscapes/' # Root path of data.
img_norm_cfg = dict( # Image normalization config to normalize the input images.
mean=[123.675, 116.28, 103.53], # Mean values used to pre-training the pre-trained backbone models.
std=[58.395, 57.12, 57.375], # Standard variance used to pre-training the pre-trained backbone models.
to_rgb=True) # The channel orders of image used to pre-training the pre-trained backbone models.
crop_size = (512, 1024) # The crop size during training.
train_pipeline = [ # Training pipeline.
dict(type='LoadImageFromFile'), # First pipeline to load images from file path.
dict(type='LoadAnnotations'), # Second pipeline to load annotations for current image.
dict(type='Resize', # Augmentation pipeline that resize the images and their annotations.
img_scale=(2048, 1024), # The largest scale of image.
ratio_range=(0.5, 2.0)), # The augmented scale range as ratio.
dict(type='RandomCrop', # Augmentation pipeline that randomly crop a patch from current image.
crop_size=(512, 1024), # The crop size of patch.
cat_max_ratio=0.75), # The max area ratio that could be occupied by single category.
dict(
type='RandomFlip', # Augmentation pipeline that flip the images and their annotations
flip_ratio=0.5), # The ratio or probability to flip
dict(type='PhotoMetricDistortion'), # Augmentation pipeline that distort current image with several photo metric methods.
dict(
type='Normalize', # Augmentation pipeline that normalize the input images
mean=[123.675, 116.28, 103.53], # These keys are the same of img_norm_cfg since the
std=[58.395, 57.12, 57.375], # keys of img_norm_cfg are used here as arguments
to_rgb=True),
dict(type='Pad', # Augmentation pipeline that pad the image to specified size.
size=(512, 1024), # The output size of padding.
pad_val=0, # The padding value for image.
seg_pad_val=255), # The padding value of 'gt_semantic_seg'.
dict(type='DefaultFormatBundle'), # Default format bundle to gather data in the pipeline
dict(type='Collect', # Pipeline that decides which keys in the data should be passed to the segmentor
keys=['img', 'gt_semantic_seg'])
]
test_pipeline = [
dict(type='LoadImageFromFile'), # First pipeline to load images from file path
dict(
type='MultiScaleFlipAug', # An encapsulation that encapsulates the test time augmentations
img_scale=(2048, 1024), # Decides the largest scale for testing, used for the Resize pipeline
flip=False, # Whether to flip images during testing
# MMSegWandbHook is mmseg implementation of WandbLoggerHook. ClearMLLoggerHook, DvcliveLoggerHook, MlflowLoggerHook, NeptuneLoggerHook, PaviLoggerHook, SegmindLoggerHook are also supported based on MMCV implementation.
workflow = [('train', 1)] # Workflow for runner. [('train', 1)] means there is only one workflow and the workflow named 'train' is executed once. The workflow trains the model by 40000 iterations according to the `runner.max_iters`.
cudnn_benchmark = True # Whether use cudnn_benchmark to speed up, which is fast for fixed input size.
optimizer = dict( # Config used to build optimizer, support all the optimizers in PyTorch whose arguments are also the same as those in PyTorch
type='SGD', # Type of optimizers, refer to https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/optimizer/default_constructor.py#L13 for more details
lr=0.01, # Learning rate of optimizers, see detail usages of the parameters in the documentation of PyTorch
momentum=0.9, # Momentum
weight_decay=0.0005) # Weight decay of SGD
optimizer_config = dict() # Config used to build the optimizer hook, refer to https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/optimizer.py#L8 for implementation details.
lr_config = dict(
policy='poly', # The policy of scheduler, also support Step, CosineAnnealing, Cyclic, etc. Refer to details of supported LrUpdater from https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/lr_updater.py#L9.
power=0.9, # The power of polynomial decay.
min_lr=0.0001, # The minimum learning rate to stable the training.
checkpoint_config = dict( # Config to set the checkpoint hook, Refer to https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/checkpoint.py for implementation.
Sometimes, you may set `_delete_=True` to ignore some of the fields in base configs.
You may refer to [mmcv](https://mmcv.readthedocs.io/en/latest/understand_mmcv/config.html#inherit-from-base-config-with-ignored-fields) for simple illustration.
Some intermediate variables are used in the configs files, like `train_pipeline`/`test_pipeline` in datasets.
It's worth noting that when modifying intermediate variables in the children configs, user need to pass the intermediate variables into corresponding fields again.
For example, we would like to change multi scale strategy to train/test a PSPNet. `train_pipeline`/`test_pipeline` are intermediate variable we would like to modify.