MMSelfSup mainly uses Python files as configs. The design of our configuration file system integrates modularity and inheritance, making it easier for users to conduct various experiments. All configuration files are placed in the `configs` folder. If you wish to inspect the complete config, you may run `python tools/misc/print_config.py`.
<!-- TOC -->

- [Tutorial 0: Learn about Configs](#tutorial-0-learn-about-configs)
  - [Config File and Checkpoint Naming Convention](#config-file-and-checkpoint-naming-convention)
    - [Algorithm information](#algorithm-information)
    - [Module information](#module-information)
    - [Training information](#training-information)
    - [Data information](#data-information)
    - [Config File Name Example](#config-file-name-example)

We follow the convention below to name config files. Contributors are advised to follow the same style. A config file name is divided into four parts: algorithm information, module information, training information and data information. Logically, different parts are joined by underscores `'_'`, and words within the same part are joined by dashes `'-'`.
- `8xb32-mcrop-2-6-coslr-200e`: `mcrop` is the multi-crop strategy proposed in SwAV and is part of the pipeline. `2` and `6` mean that two pipelines output 2 and 6 crops respectively; the crop size is recorded in the data information.
- `8xb32-accum16-coslr-200e`: `accum16` means that gradients are accumulated for 16 iterations before the weights are updated.
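
The naming convention can be illustrated with a small parsing sketch. The helper `parse_config_name` is hypothetical (it is not part of MMSelfSup) and assumes a name with exactly four underscore-separated parts, as described above.

```python
def parse_config_name(filename):
    """Split a config file name into its four underscore-separated parts.

    Hypothetical helper for illustration; assumes exactly four parts.
    """
    stem = filename[:-3] if filename.endswith('.py') else filename
    algorithm, module, training, data = stem.split('_')
    return {
        'algorithm': algorithm,
        'module': module,
        # Words inside one part are joined by dashes.
        'training': training.split('-'),
        'data': data.split('-'),
    }


info = parse_config_name('mocov2_resnet50_8xb32-coslr-200e_in1k.py')
print(info)
```

Here the training part splits into `8xb32`, `coslr` and `200e`. A name that uses `mcrop-2-6` would simply produce more words in the training list.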
The name of a weight (checkpoint) file mainly consists of the config file name, the date and a hash value.
```
{config_name}_{date}-{hash}.pth
```
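
As a rough sketch, a checkpoint name in this format can be split back into its pieces. The helper `parse_checkpoint_name` and the file name used below are made up for illustration. The date and hash shown are assumptions, not values from a released checkpoint.

```python
def parse_checkpoint_name(filename):
    """Split '{config_name}_{date}-{hash}.pth' into its three fields.

    Hypothetical helper; assumes the date and hash contain no underscore.
    """
    stem = filename[:-len('.pth')] if filename.endswith('.pth') else filename
    # The config name itself contains underscores, so split from the right.
    config_name, date_hash = stem.rsplit('_', 1)
    date, hash_value = date_hash.split('-', 1)
    return config_name, date, hash_value


# Made-up file name used only to exercise the helper.
name = 'mocov2_resnet50_8xb32-coslr-200e_in1k_20220101-0123abcd.pth'
config_name, date, hash_value = parse_checkpoint_name(name)
print(config_name, date, hash_value)
```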
## Config File Structure
There are four kinds of basic component files in the `configs/_base_` folder, namely:
- models
- datasets
- schedules
- runtime
You can easily build your own training config file by inheriting from some base config files. Configs that are composed of components from `_base_` are called _primitive_.
For ease of understanding, we use MoCo v2 as an example and comment on the meaning of each line. For more details, please refer to the API documentation.
```python
data = dict(
    # ...
    workers_per_gpu=4,  # Number of workers to pre-fetch data for each single GPU
    drop_last=True,  # Whether to drop the last batch of data
    train=dict(
        type=dataset_type,  # Dataset name
        data_source=dict(
            type=data_source,  # Data source name
            data_prefix='data/imagenet/train',  # Dataset root; when ann_file does not exist, the category information is automatically obtained from the root folder
            ann_file='data/imagenet/meta/train.txt',  # When ann_file exists, the category information is obtained from this file
        ),
        num_views=[2],  # The number of different views from the pipeline
        pipelines=[train_pipeline],  # The train pipeline
        prefetch=prefetch,  # Boolean flag for prefetch
    ))
```

`../_base_/schedules/sgd_coslr-200e_in1k.py` is the base schedule config for MoCo v2.

```python
optimizer = dict(
    # ...
    lr=0.03,  # Learning rate of the optimizer; see the PyTorch documentation for detailed usage of the parameters
    weight_decay=1e-4,  # Weight decay of SGD
    momentum=0.9)  # Momentum parameter
# Config used to build the optimizer hook, refer to https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/optimizer.py#L8 for implementation details.
optimizer_config = dict()  # This config can set grad_clip, coalesce, bucket_size_mb, etc.

# learning policy
# Learning rate scheduler config used to register LrUpdater hook
lr_config = dict(
    policy='CosineAnnealing',  # The policy of the scheduler; Step, Cyclic, etc. are also supported. Refer to details of supported LrUpdater from https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/lr_updater.py#L9.
    min_lr=0.)  # The minimum lr setting in CosineAnnealing

# runtime settings
runner = dict(
    type='EpochBasedRunner',  # Type of runner to use (e.g. IterBasedRunner or EpochBasedRunner)
    max_epochs=200)  # The runner runs the workflow for max_epochs in total. For IterBasedRunner use `max_iters`
```

The base runtime config for MoCo v2 contains the following settings.

```python
log_config = dict(
    # ...
    hooks=[
        dict(type='TextLoggerHook'),  # The Tensorboard logger is also supported
        # dict(type='TensorboardLoggerHook'),
    ])
# yapf:enable

# runtime settings
dist_params = dict(backend='nccl')  # Parameters to set up distributed training; the port can also be set.
log_level = 'INFO'  # The output level of the log.
load_from = None  # Path of a checkpoint to load model weights from
resume_from = None  # Resume from a checkpoint at the given path; training resumes from the epoch at which the checkpoint was saved.
workflow = [('train', 1)]  # Workflow for the runner. [('train', 1)] means there is only one workflow, and the workflow named 'train' is executed once.
persistent_workers = True  # Boolean flag that sets persistent_workers in the DataLoader; see the PyTorch documentation for details
```
## Inherit and Modify Config File
For ease of understanding, we recommend that contributors inherit from existing methods.
For all configs under the same folder, it is recommended to have only **one** _primitive_ config. All other configs should inherit from the _primitive_ config. In this way, the maximum inheritance level is 3.
For example, if your config file is based on MoCo v2 with some modifications, you can first inherit the basic MoCo v2 structure, dataset and other training settings by specifying `_base_ = './mocov2_resnet50_8xb32-coslr-200e_in1k.py'` (the path is relative to your config file), and then modify the necessary parameters in your config file. As a more specific example, suppose we want to use almost all settings in `configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-200e_in1k.py`, but change the number of training epochs from 200 to 800, modify when the learning rate decays, and modify the dataset path. In that case, you can create a new config file `configs/selfsup/mocov2/mocov2_resnet50_8xb32-coslr-800e_in1k.py` that inherits from it and overrides only those fields.
Some intermediate variables are used in the configuration file. The intermediate variables make the configuration file clearer and easier to modify.
For example, `data_source`, `dataset_type`, `train_pipeline`, `prefetch` are the intermediate variables of the data. We first need to define them and then pass them to `data`.
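
The define-then-pass pattern can be sketched like this. The concrete values (`'ImageNet'`, `'MultiViewDataset'`, the two transforms and `prefetch = False`) are illustrative assumptions rather than the exact contents of the MoCo v2 dataset config.

```python
# Intermediate variables, defined once near the top of the file.
data_source = 'ImageNet'  # assumed value
dataset_type = 'MultiViewDataset'  # assumed value
train_pipeline = [  # shortened, illustrative pipeline
    dict(type='RandomResizedCrop', size=224),
    dict(type='RandomHorizontalFlip'),
]
prefetch = False  # assumed value

# The variables are then passed into `data`.
data = dict(
    train=dict(
        type=dataset_type,
        data_source=dict(
            type=data_source,
            data_prefix='data/imagenet/train',
            ann_file='data/imagenet/meta/train.txt'),
        num_views=[2],
        pipelines=[train_pipeline],
        prefetch=prefetch))
```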
Sometimes, you need to set `_delete_=True` to ignore some fields in the base configuration file. You can refer to [mmcv](https://mmcv.readthedocs.io/en/latest/understand_mmcv/config.html#inherit-from-base-config-with-ignored-fields) for more instructions.
The following is an example. If you want to use `MoCoV2Neck` in the SimCLR setting, simply inheriting the config and modifying the neck directly will report an error like `got an unexpected keyword argument 'num_layers'`, because the `num_layers` field of `model.neck` in the base config is retained. You need to add `_delete_=True` to ignore the `model.neck` fields of the base configuration file.
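
The effect of `_delete_=True` can be sketched with a simplified merge function. `merge_config` below is a hypothetical stand-in for the real inheritance logic, not MMCV's implementation, and the base neck fields (including `type='NonLinearNeck'`) are assumptions chosen to mirror the situation described above.

```python
import copy


def merge_config(base, child):
    """Merge `child` into a copy of `base` (simplified, illustrative only)."""
    merged = copy.deepcopy(base)
    for key, value in child.items():
        if isinstance(value, dict):
            value = dict(value)  # avoid mutating the caller's dict
            delete = value.pop('_delete_', False)
            if not delete and isinstance(merged.get(key), dict):
                # Default behaviour: keep base fields, override given ones.
                merged[key] = merge_config(merged[key], value)
            else:
                # With _delete_=True the base dict is discarded entirely.
                merged[key] = copy.deepcopy(value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Assumed base neck that carries a `num_layers` field.
base = dict(model=dict(neck=dict(
    type='NonLinearNeck', in_channels=2048, hid_channels=2048,
    out_channels=128, num_layers=2, with_avg_pool=True)))

new_neck = dict(
    type='MoCoV2Neck', in_channels=2048, hid_channels=2048,
    out_channels=128, with_avg_pool=True)

without_delete = merge_config(base, dict(model=dict(neck=new_neck)))
with_delete = merge_config(
    base, dict(model=dict(neck=dict(_delete_=True, **new_neck))))

print('num_layers' in without_delete['model']['neck'])
print('num_layers' in with_delete['model']['neck'])
```

Without `_delete_`, the stale `num_layers` field survives next to `type='MoCoV2Neck'`, which is what triggers the unexpected keyword error when the neck is built.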
Sometimes, you may refer to some fields in the `_base_` config, so as to avoid duplication of definitions. You can refer to [mmcv](https://mmcv.readthedocs.io/en/latest/understand_mmcv/config.html#reference-variables-from-base) for some more instructions.
The following is an example of referencing a variable defined in a base config; refer to `configs/selfsup/odc/odc_resnet50_8xb64-steplr-440e_in1k.py`. To reuse `num_classes`, add the file that defines it to `_base_`, and then use `{{_base_.num_classes}}` to reference the variable.
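
Because `{{_base_.num_classes}}` is a placeholder resolved by the config loader rather than plain Python, the sketch below emulates the substitution with a regular expression. `resolve_base_refs`, the value `10000` and the `head`/`memory_bank` fields are assumptions for illustration. The real loader works differently internally and this version only handles single-level names.

```python
import re


def resolve_base_refs(config_text, base_vars):
    """Replace each `{{_base_.name}}` with the repr of `base_vars[name]`."""
    def substitute(match):
        return repr(base_vars[match.group(1)])

    return re.sub(r'\{\{\s*_base_\.(\w+)\s*\}\}', substitute, config_text)


# Variables assumed to be defined by a file listed in `_base_`.
base_vars = {'num_classes': 10000}

# Text of a child config that references the base variable twice.
child_text = (
    "model = dict(\n"
    "    head=dict(num_classes={{_base_.num_classes}}),\n"
    "    memory_bank=dict(num_classes={{_base_.num_classes}}))\n"
)

resolved = resolve_base_refs(child_text, base_vars)
namespace = {}
exec(resolved, namespace)  # evaluate the resolved config text
model = namespace['model']
print(model)
```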
When you use the script `tools/train.py` or `tools/test.py` to submit tasks, or use some other tools, you can directly modify the content of the config file in use by specifying the `--cfg-options` argument.
- Update config keys of dict chains.

  The config options can be specified following the order of the dict keys in the original config.
  For example, `--cfg-options model.backbone.norm_eval=False` changes all BN modules in the model backbone to `train` mode.

- Update keys inside a list of configs.

  Some config dicts are composed as a list in your config. For example, the training pipeline `data.train.pipeline` is normally a list,
  e.g. `[dict(type='LoadImageFromFile'), dict(type='TopDownRandomFlip', flip_prob=0.5), ...]`. If you want to change `flip_prob=0.5` to `flip_prob=0.0` in the pipeline,
  you may specify `--cfg-options data.train.pipeline.1.flip_prob=0.0`.

- Update values of list/tuples.

  The value to be updated may be a list or a tuple. For example, the config file normally sets `workflow=[('train', 1)]`; such a value can also be overridden through `--cfg-options`.
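
How dotted keys address nested fields can be sketched in plain Python. `apply_override` is a hypothetical helper, not the parser used by the training scripts, and the replacement workflow value is only an example.

```python
def apply_override(cfg, dotted_key, value):
    """Set `value` at the position named by `dotted_key` (illustrative)."""
    *parents, last = dotted_key.split('.')
    node = cfg
    for key in parents:
        # Numeric segments index into lists, other segments into dicts.
        node = node[int(key)] if isinstance(node, list) else node[key]
    if isinstance(node, list):
        node[int(last)] = value
    else:
        node[last] = value


cfg = dict(
    model=dict(backbone=dict(norm_eval=True)),
    data=dict(train=dict(pipeline=[
        dict(type='LoadImageFromFile'),
        dict(type='TopDownRandomFlip', flip_prob=0.5),
    ])),
    workflow=[('train', 1)],
)

# Dict chain: model.backbone.norm_eval
apply_override(cfg, 'model.backbone.norm_eval', False)
# Key inside a list of configs: index 1 selects the second transform.
apply_override(cfg, 'data.train.pipeline.1.flip_prob', 0.0)
# Whole list value replaced at once (example value).
apply_override(cfg, 'workflow', [('train', 1), ('val', 1)])
print(cfg)
```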
You may use other MM-codebases to complete your project and create new classes of datasets, models, data augmentations, etc. in the project. To streamline the code, you can use an MM-codebase as a third-party library: you only need to keep your own extra code and import your custom modules in the configuration files. For examples, you may refer to the [OpenMMLab Algorithm Competition Project](https://github.com/zhangrui-wolf/openmmlab-competition-2021).
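
One way to import custom modules from a config file is MMCV's `custom_imports` field, sketched below. The module names are hypothetical placeholders for your own packages, which must be importable from the Python path.

```python
# Modules listed here are imported when the config is loaded, so that the
# classes they register become available to the builders.
custom_imports = dict(
    imports=[
        'my_project.datasets',  # hypothetical module
        'my_project.models',  # hypothetical module
    ],
    allow_failed_imports=False)  # fail if a module cannot be imported
```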