Getting Started
This page provides basic tutorials about the usage of OpenSelfSup. For installation instructions, please see INSTALL.md.
Train existing methods
Note: The default learning rate in config files is for 8 GPUs (except for those under configs/linear_classification, which use 1 GPU). If you use a different number of GPUs, the total batch size changes in proportion, so you have to scale the learning rate following new_lr = old_lr * new_ngpus / old_ngpus. We recommend using tools/dist_train.sh even with 1 GPU, since some methods do not support non-distributed training.
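For example, a minimal sketch of this linear scaling rule (the numbers below are placeholders, not values taken from any particular config):
# Linear scaling rule: new_lr = old_lr * new_ngpus / old_ngpus
old_lr, old_ngpus = 0.03, 8  # hypothetical base lr defined for 8 GPUs
new_ngpus = 4                # number of GPUs you actually train with
new_lr = old_lr * new_ngpus / old_ngpus  # -> 0.015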
Train with single/multiple GPUs
# checkpoints and logs are saved in the same sub-directory as the config file under `work_dirs/` by default.
bash tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]
An example:
bash tools/dist_train.sh configs/selfsup/odc/r50_v1.py 8
Optional arguments are:
- --work_dir ${WORK_DIR}: Override the default working directory.
- --resume_from ${CHECKPOINT_FILE}: Resume from a previous checkpoint file.
- --pretrained ${PRETRAIN_WEIGHTS}: Load pretrained weights for the backbone.
Alternatively, if you run OpenSelfSup on a cluster managed with slurm:
SRUN_ARGS="${SRUN_ARGS}" bash tools/srun_train.sh ${PARTITION} ${CONFIG_FILE} ${GPU_NUM} [optional arguments]
An example:
SRUN_ARGS="-w xx.xx.xx.xx" bash tools/srun_train.sh Dummy configs/selfsup/odc/r50_v1.py 8
Launch multiple jobs on a single machine
If you launch multiple jobs on a single machine, e.g., 2 jobs of 4-GPU training on a machine with 8 GPUs, you need to specify different ports (29500 by default) for each job to avoid communication conflict.
If you use dist_train.sh to launch training jobs:
CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 bash tools/dist_train.sh ${CONFIG_FILE} 4
CUDA_VISIBLE_DEVICES=4,5,6,7 PORT=29501 bash tools/dist_train.sh ${CONFIG_FILE} 4
If you launch training jobs with slurm:
GPUS_PER_NODE=4 bash tools/srun_train.sh ${PARTITION} ${CONFIG_FILE} 4 --port 29500
GPUS_PER_NODE=4 bash tools/srun_train.sh ${PARTITION} ${CONFIG_FILE} 4 --port 29501
Benchmarks
We provide several standard benchmarks to evaluate representation learning.
VOC07 Linear SVM & Low-shot Linear SVM
bash benchmarks/dist_test_svm.sh ${CONFIG_FILE} ${EPOCH} ${FEAT_LIST} ${GPU_NUM}
Arguments:
- ${FEAT_LIST} is a string specifying the features from layer1 to layer5 to evaluate; e.g., to evaluate layer5 only, FEAT_LIST is "feat5"; to evaluate all features, FEAT_LIST is "feat1 feat2 feat3 feat4 feat5" (separated by spaces).
- ${GPU_NUM} is the number of GPUs used to extract features.
ImageNet / Places205 Linear Classification
bash benchmarks/dist_test_cls.sh ${CONFIG_FILE} ${EPOCH} ${DATASET} [optional arguments]
Arguments:
- ${DATASET} is one of ['imagenet', 'places205'].
- Optional arguments include --resume_from ${CHECKPOINT_FILE} to resume from a previous checkpoint file.
VOC07+12 / COCO17 Object Detection
- First, extract backbone weights:
  python tools/extract_backbone_weights.py ${CHECKPOINT} --save-path ${WEIGHT_FILE}
  Arguments:
  - CHECKPOINT: the checkpoint file of a selfsup method, named as epoch_*.pth.
  - WEIGHT_FILE: the output backbone weights file, e.g., odc_v1.pth.
- Next, run detection. For more details on setting up the environment for detection, please refer here.
conda activate detectron2
cd benchmarks/detection
python convert-pretrain-to-detectron2.py ${WEIGHT_FILE} ${OUTPUT_FILE} # must use .pkl as the output extension.
bash run.sh ${DET_CFG} ${OUTPUT_FILE}
Arguments:
- DET_CFG: the detectron2 config file; usually we use configs/pascal_voc_R_50_C4_24k_moco.yaml.
- OUTPUT_FILE: the converted backbone weights file, e.g., odc_v1.pkl.
Note:
- This benchmark must use 8 GPUs as the default setting from MoCo.
- Please report the mean of 5 trials in your official paper, according to MoCo.
- DeepCluster, which uses a Sobel layer, is not supported by detectron2.
Publish a model
- Extract the backbone weights as mentioned before. You don't have to extract them again if you have already done so in the benchmark step; a conceptual sketch of the extraction follows this list.
python tools/extract_backbone_weights.py ${CHECKPOINT} --save-path ${WEIGHT_FILE}
- Compute the hash of the weight file and append the hash id to the filename.
python tools/publish_model.py ${WEIGHT_FILE}
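Conceptually, the extraction step keeps only the backbone parameters from the full training checkpoint. The following is a rough Python sketch of that idea, not the actual tools/extract_backbone_weights.py script; it assumes the backbone parameters are stored under a 'backbone.' prefix in the checkpoint's state_dict, and the file names are placeholders.
import torch

# Hypothetical illustration: keep parameters prefixed with 'backbone.',
# strip the prefix, and save them as a new weights file.
checkpoint_file = 'epoch_xx.pth'       # placeholder: a selfsup checkpoint
weight_file = 'backbone_weights.pth'   # placeholder: output weights file

ckpt = torch.load(checkpoint_file, map_location='cpu')
state_dict = ckpt.get('state_dict', ckpt)
backbone = {k[len('backbone.'):]: v
            for k, v in state_dict.items() if k.startswith('backbone.')}
torch.save({'state_dict': backbone}, weight_file)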
How-to
Use a new dataset
- Write a data source file under openselfsup/datasets/data_sources/. You may refer to the existing ones (a rough sketch follows this list).
- Create new config files for your experiments.
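For reference, a hypothetical data source could look like the sketch below. The DATASOURCES registry and the get_length/get_sample interface are assumptions inferred from the pattern of the existing data sources; check the files under openselfsup/datasets/data_sources/ for the exact interface your version expects.
import os.path as osp

from PIL import Image

from ..registry import DATASOURCES  # assumption: the registry used by existing data sources


@DATASOURCES.register_module
class MyDataSource(object):
    """Hypothetical data source that reads image paths from a list file."""

    def __init__(self, list_file, root):
        with open(list_file) as f:
            self.fns = [osp.join(root, line.strip()) for line in f]

    def get_length(self):
        return len(self.fns)

    def get_sample(self, idx):
        return Image.open(self.fns[idx]).convert('RGB')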
Design your own methods
What you need to do
1. Create a dataset file under `openselfsup/datasets/` (preferably reusing existing ones);
2. Create a model file under `openselfsup/models/`. The model typically contains:
i) backbone (required): maps images to deep features from different depths of layers.
ii) neck (optional): deep features to compact feature vectors.
iii) head (optional): define loss functions.
iv) memory_bank (optional): define memory banks.
3. Create a config file under `configs/` and setup the configs;
4. Create a hook file under `openselfsup/hooks/` if your method requires additional operations before the run, every several iterations, every several epochs, or after the run.
You may refer to existing modules under respective folders.
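As a rough illustration of how these pieces fit together, the model part of a config might look like the sketch below; the type names other than 'ResNet' are placeholders for the modules you register yourself, and the exact fields depend on your method.
# Hypothetical config sketch; type names other than 'ResNet' are placeholders.
model = dict(
    type='MyNewMethod',                      # the model file you add under openselfsup/models/
    backbone=dict(type='ResNet', depth=50),  # required: images -> deep features
    neck=dict(type='MyNeck'),                # optional: deep features -> compact vectors
    head=dict(type='MyHead'),                # optional: defines the loss function
    # memory_bank=dict(...),                 # optional: for methods that need a memory bank
)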
Features that may facilitate your implementation
- Decoupled data source and dataset.
Since a dataset is tied to a specific task while a data source is general, we decouple data source and dataset in OpenSelfSup.
data = dict(
    train=dict(type='ContrastiveDataset',
               data_source=dict(type='ImageNet', list_file='xx', root='xx'),
               pipeline=train_pipeline),
    val=dict(...),
)
- Configure data augmentations in the config file.
The augmentations are the same as torchvision.transforms. torchvision.transforms.RandomApply corresponds to RandomAppliedTrans. Lighting and GaussianBlur are additionally implemented.
train_pipeline = [
    dict(type='RandomResizedCrop', size=224),
    dict(type='RandomAppliedTrans',
         transforms=[
             dict(type='GaussianBlur', sigma_min=0.1, sigma_max=2.0, kernel_size=23)],
         p=0.5),
    dict(type='ToTensor'),
    dict(type='Normalize', **img_norm_cfg)
]
- Parameter-wise optimization parameters.
You may specify optimization parameters including lr, momentum, and weight_decay for a certain group of parameters in the config file with paramwise_options. paramwise_options is a dict whose keys are regular expressions and whose values are options. Options include 6 fields: lr, lr_mult, momentum, momentum_mult, weight_decay, weight_decay_mult.
paramwise_options = {
    '(bn|gn)(\d+)?.(weight|bias)': dict(weight_decay_mult=0.1),
    '\Ahead.': dict(lr_mult=10, momentum=0)}
optimizer_cfg = dict(type='SGD', lr=0.01, momentum=0.9,
                     weight_decay=0.0001,
                     paramwise_options=paramwise_options)
- Configure custom hooks in the config file.
The hooks will be called in order. For hook design, please refer to odc_hook.py as an example.
custom_hooks = [
    dict(type='DeepClusterHook', **kwargs1),
    dict(type='ODCHook', **kwargs2),
]
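A minimal hook skeleton might look like the following. The mmcv Hook base class and the HOOKS registry import are assumptions based on how existing hooks such as odc_hook.py are typically structured; adapt the import paths and the methods you override (before_run, after_train_epoch, etc.) to your needs.
from mmcv.runner import Hook

from .registry import HOOKS  # assumption: the registry used by existing hooks


@HOOKS.register_module
class MyHook(Hook):
    """Hypothetical hook that runs an extra operation every few epochs."""

    def __init__(self, interval=1):
        self.interval = interval

    def before_run(self, runner):
        # e.g., initialize any state your method needs before training starts
        pass

    def after_train_epoch(self, runner):
        if self.every_n_epochs(runner, self.interval):
            # e.g., recompute cluster assignments or update a memory bank
            pass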