support MoCo 8GPU linear_cls

pull/16/head
xieenze 2020-07-19 17:52:47 +08:00
parent 79fdb8ce54
commit ebb4ae307a
3 changed files with 7 additions and 7 deletions


@@ -5,8 +5,8 @@ set -x
 CFG=$1 # use cfgs under "configs/benchmarks/linear_classification/"
 PRETRAIN=$2
-GPUS=$3 # in MoCo, GPUS=8
-PY_ARGS=${@:4} # --resume_from --deterministic
+PY_ARGS=${@:3} # --resume_from --deterministic
+GPUS=8 # When changing GPUS, please also change imgs_per_batch in the config file accordingly to ensure the total batch size is 256.
 PORT=${PORT:-29500}
 if [ "$CFG" == "" ] || [ "$PRETRAIN" == "" ]; then


@@ -8,8 +8,8 @@ CFG=$2
 PRETRAIN=$3
 PY_ARGS=${@:4}
 JOB_NAME="openselfsup"
-GPUS=1 # in the standard setting, GPUS=1
-GPUS_PER_NODE=${GPUS_PER_NODE:-1}
+GPUS=8 # When changing GPUS, please also change imgs_per_batch in the config file accordingly to ensure the total batch size is 256.
+GPUS_PER_NODE=${GPUS_PER_NODE:-8}
 CPUS_PER_TASK=${CPUS_PER_TASK:-5}
 SRUN_ARGS=${SRUN_ARGS:-""}
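Note that `GPUS_PER_NODE`, `CPUS_PER_TASK`, and `SRUN_ARGS` keep the `${VAR:-default}` form, so the new 8-GPU default can still be overridden from the environment, whereas `GPUS=8` is now hardcoded. A sketch, assuming this file is the Slurm launcher (the script name and positional arguments here are assumptions for illustration):

```shell
# Override the per-node defaults via the environment; GPUS itself is
# hardcoded to 8 inside the script, so 8 tasks are spread over 2 nodes
# here. Script path and arguments are assumed for illustration.
GPUS_PER_NODE=4 CPUS_PER_TASK=8 SRUN_ARGS="--exclusive" \
    bash benchmarks/srun_train_linear.sh ${PARTITION} ${CONFIG_FILE} ${WEIGHT_FILE}
```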


@@ -5,7 +5,8 @@ For installation instructions, please see [INSTALL.md](INSTALL.md).
 ## Train existing methods
-**Note**: The default learning rate in config files is for 8 GPUs (except for those under `configs/benchmarks/linear_classification` that use 1 GPU). If using differnt number GPUs, the total batch size will change in proportion, you have to scale the learning rate following `new_lr = old_lr * new_ngpus / old_ngpus`. We recommend to use `tools/dist_train.sh` even with 1 gpu, since some methods do not support non-distributed training.
+**Note**: The default learning rate in config files is for 8 GPUs.
+If you use a different number of GPUs, the total batch size changes in proportion, so you have to scale the learning rate following `new_lr = old_lr * new_ngpus / old_ngpus` (see the worked example below). We recommend using `tools/dist_train.sh` even with 1 GPU, since some methods do not support non-distributed training.
 ### Train with single/multiple GPUs
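As a worked example of this linear scaling rule (the numbers are illustrative, not from any shipped config):

```shell
# Linear scaling rule: new_lr = old_lr * new_ngpus / old_ngpus.
# Suppose a config was tuned for 8 GPUs with lr=0.1 and we run on 4 GPUs:
OLD_LR=0.1; OLD_NGPUS=8; NEW_NGPUS=4
NEW_LR=$(awk -v lr="$OLD_LR" -v o="$OLD_NGPUS" -v n="$NEW_NGPUS" \
    'BEGIN { print lr * n / o }')
echo "$NEW_LR"   # 0.05
```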
@@ -101,13 +102,12 @@ Arguments:
 **Next**, train and test linear classification:
 ```shell
 # train
-bash benchmarks/dist_train_linear.sh ${CONFIG_FILE} ${WEIGHT_FILE} ${GPUS} [optional arguments]
+bash benchmarks/dist_train_linear.sh ${CONFIG_FILE} ${WEIGHT_FILE} [optional arguments]
 # test (unnecessary if validation is run during training)
 bash tools/dist_test.sh ${CONFIG_FILE} ${GPUS} ${CHECKPOINT}
 ```
 Arguments:
 - `CONFIG_FILE`: Use config files under "configs/benchmarks/linear_classification/". Note that if you want to test DeepCluster, which has a Sobel layer before the backbone, you have to use the config file named `*_sobel.py`, e.g., `configs/benchmarks/linear_classification/imagenet/r50_multihead_sobel.py`.
-- `GPUS`: `8` for MoCo and `1` for other methods.
 - Optional arguments include:
   - `--resume_from ${CHECKPOINT_FILE}`: Resume from a previous checkpoint file.
   - `--deterministic`: Switch on "deterministic" mode, which slows down training but makes the results reproducible.
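For example, evaluating the trained linear classifier with the test launcher shown above (the config and checkpoint paths are illustrative):

```shell
# Test the linear classification head on 8 GPUs; paths are illustrative.
bash tools/dist_test.sh \
    configs/benchmarks/linear_classification/imagenet/r50_multihead.py \
    8 work_dirs/linear_classification/latest.pth
```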