mmselfsup/tools/dist_train.sh

#!/usr/bin/env bash
# Usage: dist_train.sh CONFIG GPUS [train.py args...]
CONFIG=$1
GPUS=$2
# Multi-machine settings; the defaults give a single-node run.
NNODES=${NNODES:-1}
NODE_RANK=${NODE_RANK:-0}
PORT=${PORT:-29500}
MASTER_ADDR=${MASTER_ADDR:-"127.0.0.1"}
PYTHONPATH="$(dirname "$0")/..":$PYTHONPATH \
python -m torch.distributed.launch \
    --nnodes="$NNODES" \
    --node_rank="$NODE_RANK" \
    --master_addr="$MASTER_ADDR" \
    --nproc_per_node="$GPUS" \
    --master_port="$PORT" \
    "$(dirname "$0")/train.py" \
    "$CONFIG" \
    --launcher pytorch "${@:3}"